ARTFEED — Contemporary Art Intelligence

ReClaim: Foundation Model Trained on 43.8B Medical Events from 200M Patients

other · 2026-05-06

A team of researchers has introduced ReClaim, a generative transformer foundation model trained from scratch on 43.8 billion medical events from over 200 million enrollees in MarketScan claims data spanning 2008 to 2022. Trained at three parameter scales (140 million, 700 million, and 1.7 billion), the model captures longitudinal patterns in diagnoses, procedures, medications, and costs. Across more than 1,000 disease-onset prediction tasks, ReClaim achieved a mean AUC of 75.6%, outperforming disease-specific LightGBM models (66.3%) and the transformer-based Delphi model. The work, available on arXiv (2605.02740), highlights the untapped potential of administrative claims for healthcare foundation models, as evidence from large-scale real-world data increasingly informs regulatory assessments and healthcare decisions.
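The article does not describe how ReClaim serializes claims into model inputs. As a minimal sketch only, with a hypothetical toy vocabulary, a generative transformer over medical events typically requires each patient's claims (diagnoses, procedures, drugs, costs) to be flattened into one chronologically ordered token sequence:

```python
from datetime import date

# Hypothetical toy vocabulary: each event kind/code pair becomes a token.
# A real model would have tens of thousands of such entries.
VOCAB = {"<bos>": 0, "DX:E11.9": 1, "RX:metformin": 2, "PX:83036": 3}

def encode_timeline(events):
    """Serialize a patient's claims events, ordered by service date,
    into a token-id sequence a generative transformer can model."""
    ordered = sorted(events, key=lambda e: e[0])  # sort by date
    return [VOCAB["<bos>"]] + [VOCAB[tok] for _, tok in ordered]

timeline = [
    (date(2012, 3, 1), "DX:E11.9"),      # type 2 diabetes diagnosis
    (date(2012, 3, 1), "RX:metformin"),  # prescription fill
    (date(2012, 9, 15), "PX:83036"),     # HbA1c lab procedure
]
print(encode_timeline(timeline))  # → [0, 1, 2, 3]
```

The token names, codes, and vocabulary here are illustrative assumptions, not details from the paper.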

Key facts

  • ReClaim is a generative transformer trained from scratch on 43.8 billion medical events.
  • Training data comes from over 200 million enrollees in MarketScan claims data (2008-2022).
  • Model scales to 140 million, 700 million, and 1.7 billion parameters.
  • Achieved mean AUC of 75.6% across over 1,000 disease-onset prediction tasks.
  • Outperforms disease-specific LightGBM (66.3%) and transformer-based Delphi model.
  • Administrative claims provide population-scale, longitudinal records of healthcare utilization.
  • Published on arXiv with identifier 2605.02740.
  • Aims to unlock real-world evidence from nationwide medical claims.
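The headline metric above is a mean AUC over more than 1,000 disease-onset prediction tasks. As an illustration on toy data (not the paper's evaluation code), AUC for one binary task can be computed from the Mann-Whitney statistic and then averaged across tasks:

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney statistic:
    the probability a positive case is scored above a negative one,
    counting ties as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical per-task data: (onset labels, model risk scores).
tasks = {
    "type_2_diabetes": ([1, 0, 1, 0], [0.9, 0.2, 0.7, 0.4]),
    "hypertension":    ([1, 1, 0, 0], [0.8, 0.6, 0.6, 0.1]),
}
mean_auc = sum(auc(y, s) for y, s in tasks.values()) / len(tasks)
print(round(mean_auc, 3))  # → 0.938
```

The task names and scores are invented for the example; the paper's reported 75.6% is the same kind of average taken over its 1,000+ real tasks.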

Entities

Institutions

  • MarketScan
  • arXiv

Sources