LLMs Trained on Clinical Notes Predict Patient Events
Researchers have adapted Foresight Learning to clinical prediction by transforming time-ordered MIMIC-III notes into training examples for large language models. The pipeline generates 6,900 prediction instances from 702 hospital admissions, covering medications, procedures, organ support, microbiology, and mortality. A compact LoRA adapter trained on these instances improves the base model's calibration, lowering expected calibration error from 0.1269 to 0.0398 and the Brier score from 0.199 to 0.145, and it slightly surpasses GPT-5 point estimates on held-out questions. The approach reuses the clinical prediction supervision already latent in longitudinal notes, without requiring manually curated structured data.
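The conversion described above can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline: the `PredictionExample` container, the `build_examples` helper, and the toy notes are all hypothetical, and the real method resolves labels from documentation after the cutoff rather than taking them as given.

```python
from dataclasses import dataclass

@dataclass
class PredictionExample:
    """One Foresight-style instance: past context, a natural-language
    question about a future event, and a label from later documentation."""
    context: str   # notes written up to the cutoff point
    question: str  # question about an event after the cutoff
    label: str     # e.g. "yes"/"no", resolved from later notes

def build_examples(notes, cutoff_idx, questions_with_labels):
    """Split time-ordered (timestamp, text) notes at cutoff_idx: everything
    earlier becomes context; labels are assumed to come from what the
    later documentation shows actually happened."""
    context = "\n".join(text for _, text in notes[:cutoff_idx])
    return [PredictionExample(context, q, lbl) for q, lbl in questions_with_labels]

# Toy admission with three time-ordered notes (illustrative data only).
notes = [
    ("2023-01-01", "Admitted with suspected sepsis."),
    ("2023-01-02", "Started empiric vancomycin."),
    ("2023-01-03", "Blood cultures positive for MRSA."),
]
examples = build_examples(
    notes,
    cutoff_idx=2,
    questions_with_labels=[
        ("Will blood cultures return positive within 48 hours?", "yes"),
    ],
)
```

Only notes before the cutoff appear in `context`, so the day-3 culture result is hidden from the model at prediction time even though it supplies the label.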
Key facts
- arXiv:2605.12817v1 describes training LLMs to predict clinical events from MIMIC-III notes.
- Foresight Learning is extended to clinical prediction by converting time-ordered notes into examples.
- Each example consists of past patient context, a natural-language question about a future event, and a label from later documentation.
- 6,900 prediction examples are generated from 702 admissions.
- Predictions cover medications, procedures, organ support, microbiology, and mortality.
- A LoRA adapter reduces expected calibration error from 0.1269 to 0.0398.
- Brier score improves from 0.199 to 0.145.
- The method slightly outperforms GPT-5 point estimates on held-out questions.
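The two metrics cited above are standard for probabilistic predictions. As a reference sketch (not the paper's evaluation code), Brier score is the mean squared error between predicted probabilities and binary outcomes, and expected calibration error (ECE) is a bin-weighted average gap between confidence and accuracy; the ten-bin choice here is a common default, assumed rather than taken from the paper.

```python
import numpy as np

def brier_score(probs, labels):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    probs, labels = np.asarray(probs, dtype=float), np.asarray(labels, dtype=float)
    return float(np.mean((probs - labels) ** 2))

def expected_calibration_error(probs, labels, n_bins=10):
    """Bin predictions by confidence; ECE is the weighted average of
    |mean confidence - empirical accuracy| across non-empty bins."""
    probs, labels = np.asarray(probs, dtype=float), np.asarray(labels, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # First bin is closed on the left so probability 0.0 is counted.
        mask = (probs >= lo) & (probs <= hi) if lo == 0.0 else (probs > lo) & (probs <= hi)
        if mask.any():
            conf = probs[mask].mean()   # average predicted probability in bin
            acc = labels[mask].mean()   # fraction of positives in bin
            ece += mask.mean() * abs(conf - acc)
    return float(ece)
```

A perfectly calibrated, perfectly confident predictor scores 0 on both metrics; the reported drop from 0.1269 to 0.0398 ECE means the adapter's stated probabilities track observed frequencies much more closely.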