ARTFEED — Contemporary Art Intelligence

AI Medical Event Models Show Representation Choices Impact Prediction Accuracy in ICU Data

ai-technology · 2026-04-22

A study published on arXiv (2604.16775v1) shows that the way clinical events are tokenized significantly affects the predictive performance of generative medical event models, even when architecture and training budget are held fixed. The researchers ran three experiments using 28 matched transformer models trained on MIMIC-IV data with a fixed one-epoch pretraining budget, and evaluated 30 clinical outcomes. In the first experiment, fused code-value tokenization raised mortality-prediction AUROC from 0.891 to 0.915 (BH-adjusted p < 0.001). The second experiment crossed value encoding methods (hard bins, soft discretization, and code-normalized xVal) with temporal encoding approaches (event order, time tokens, and admission-relative RoPE). The third experiment compared native MIMIC laboratory and vital codes against CLIF-remapped codes, with compression-preserving perturbation arms.

The study argues that clinical event tokenization bounds every prediction these generative models make, yet typical evaluations rarely isolate representation decisions from system and architectural choices. Its benchmark indicates that representation choices made before training substantially affect downstream prediction accuracy in medical AI applications.
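To make the distinction concrete, the following is a minimal sketch of one plausible reading of "separate" versus "fused" code-value tokenization with hard bins. It is not taken from the paper: the lab codes, bin edges, token naming scheme, and function names are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right

# Hypothetical bin edges per lab code (illustrative values, not from the study).
BIN_EDGES = {
    "lactate": [1.0, 2.0, 4.0],
    "creatinine": [0.6, 1.2, 2.0],
}

def hard_bin(code: str, value: float) -> int:
    """Return the index of the bin that `value` falls into for `code`."""
    return bisect_right(BIN_EDGES[code], value)

def tokenize_separate(code: str, value: float) -> list[str]:
    """Two tokens per event: one for the code, one for the binned value."""
    return [f"CODE_{code}", f"BIN_{hard_bin(code, value)}"]

def tokenize_fused(code: str, value: float) -> list[str]:
    """One token per event: code and bin share a single vocabulary entry."""
    return [f"{code}_BIN_{hard_bin(code, value)}"]

# A toy sequence of (code, value) events, assumed for this example.
events = [("lactate", 3.1), ("creatinine", 0.9)]
separate = [tok for code, value in events for tok in tokenize_separate(code, value)]
fused = [tok for code, value in events for tok in tokenize_fused(code, value)]

print(separate)
print(fused)
```

Under this assumed scheme, fusing halves the sequence length per measurement at the cost of a larger vocabulary (one entry per code-bin pair), which is the kind of trade-off a representation benchmark would need to isolate.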

Key facts

  • Study published on arXiv with identifier 2604.16775v1
  • 28 matched transformer models trained on MIMIC-IV data
  • Fixed one-epoch pretraining budget used for all models
  • 30 clinical outcomes evaluated across three experiments
  • Fused code-value tokenization improved mortality AUROC from 0.891 to 0.915
  • BH-adjusted p-value < 0.001 for mortality prediction improvement
  • Experiments examined quantization granularity, reference-range anchoring, and code-value fusion
  • Third experiment compares native MIMIC laboratory and vital codes with CLIF-remapped codes, including compression-preserving perturbation arms
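The "BH-adjusted" p-value in the key facts refers to the Benjamini-Hochberg procedure, which controls the false discovery rate when many outcomes are tested together. The sketch below shows the standard adjustment in plain Python; the function name and the input p-values are made up for illustration and are not results from the study.

```python
def bh_adjust(pvalues: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling by m / rank and
    # taking a running minimum so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for four outcomes (not from the paper).
raw = [0.01, 0.04, 0.03, 0.005]
adj = bh_adjust(raw)
print(adj)
```

An outcome is then reported as significant at a chosen false discovery rate (for example 0.05) when its adjusted p-value falls below that threshold.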

Entities

Institutions

  • arXiv