From Hazard Functions to Language Space: Cox-Supervised Distillation of Survival Risk into a Large Language Model
Summary
A text-based survival modeling pipeline transfers time-to-event risk information from a Cox proportional hazards model into a generative large language model. The researchers converted structured clinical covariates into text prompts, then fine-tuned a Qwen-based LLM to generate patient-specific survival risk, using predictions from a Cox model as the training target. Across the GBSG2, ACTG320, and WHAS500 datasets, the model achieved competitive held-out discrimination and calibration, even though it was trained as a text-generation task rather than with a conventional survival-analysis loss. t-SNE visualizations of the model's hidden states showed smooth risk gradients in latent space, indicating that the model represents survival risk as a continuous structure. These findings suggest that large language models can internalize survival-risk structure and support calibrated predictions, opening a path toward time-to-event reasoning in LLMs.
Key takeaway
For machine learning engineers building clinical prediction models, this research suggests that survival risk information from an established Cox model can be distilled into a generative LLM. Consider text-based fine-tuning of a model such as Qwen, with Cox predictions as targets, to add time-to-event reasoning to language-based systems. The approach offers a path to bringing survival analysis directly into conversational or text-driven AI applications.
Key insights
Large language models can learn to reproduce the survival risk predicted by a Cox model through text-based distillation.
Principles
- LLMs can learn continuous risk gradients.
- Text generation can model survival outcomes.
- Cox model predictions serve as effective LLM targets.
Method
Convert structured clinical covariates to text prompts. Fine-tune a Qwen-based LLM using Cox model predictions as the training target for text generation.
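The two steps above can be sketched as a small data-preparation routine. This is a minimal illustration, not the authors' pipeline: the prompt wording, the covariate names, the coefficient values, and the choice of the Cox linear predictor (log-risk) as the text target are all assumptions made for the example, since the summary does not specify them.

```python
def covariates_to_prompt(record, descriptions):
    """Render one patient's structured covariates as a plain-text prompt.

    `descriptions` maps column names to readable labels (hypothetical labels).
    """
    parts = [f"{descriptions.get(name, name)}: {value}" for name, value in record.items()]
    return "Patient record. " + "; ".join(parts) + ". Predicted risk score:"


def cox_linear_predictor(record, coefficients):
    """Cox log-risk: the sum of coefficient * covariate value.

    `coefficients` would come from a Cox model fitted beforehand;
    the values used below are made up for illustration.
    """
    return sum(coefficients[name] * float(record[name]) for name in coefficients)


def make_training_pair(record, coefficients, descriptions, decimals=3):
    """Build one (prompt, completion) pair for supervised fine-tuning."""
    prompt = covariates_to_prompt(record, descriptions)
    # Assumption: the target is the log-risk written out as a decimal string.
    target = f"{cox_linear_predictor(record, coefficients):.{decimals}f}"
    return {"prompt": prompt, "completion": " " + target}


if __name__ == "__main__":
    record = {"age": 50, "tumor_size_mm": 20}             # hypothetical patient
    coefficients = {"age": 0.01, "tumor_size_mm": 0.02}   # hypothetical Cox fit
    descriptions = {"age": "Age in years", "tumor_size_mm": "Tumor size in mm"}
    pair = make_training_pair(record, coefficients, descriptions)
    print(pair["prompt"])
    print(pair["completion"])
```

Pairs of this shape could then be passed to any standard supervised fine-tuning setup for a causal language model. Writing the target with a fixed number of decimals keeps the completions uniform in length and format, which is one plausible way to make a continuous quantity learnable as text.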
In practice
- Distill traditional survival models into LLMs.
- Integrate time-to-event reasoning into LLM applications.
- Visualize latent space for risk representation.
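When distilling a survival model in practice, the held-out discrimination mentioned in the summary is commonly measured with Harrell's concordance index. The sketch below is a plain-Python version for small datasets, assuming the LLM's generated text has already been parsed into one numeric risk score per patient; the function name and the toy data are illustrative, and the summary does not state which metric implementation the authors used.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk scores (higher means earlier expected event)

    A pair (i, j) is comparable when subject i had an observed event
    strictly before subject j's observed time. Ties in risk count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data: the third subject is censored at time 3.
    times = [1, 2, 3]
    events = [1, 1, 0]
    llm_risks = [3.0, 2.0, 1.0]  # hypothetical scores parsed from model output
    print(concordance_index(times, events, llm_risks))
```

Computing the same index for the teacher Cox model on the same held-out patients gives a direct comparison of how much discrimination survives distillation. The double loop is quadratic in the number of patients, which is acceptable for datasets of a few hundred to a few thousand rows.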
Topics
- Survival Analysis
- Large Language Models
- Cox Models
- Text Generation
- Clinical Prediction
- Latent Space
Best for: NLP Engineer, AI Scientist, Machine Learning Engineer, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.