ARTFEED — Contemporary Art Intelligence

SAE Decomposition of Clinical Sequence Model Reveals Feature Complexity and Task Specialisation

ai-technology · 2026-05-07

Researchers applied sparse autoencoders (SAEs) to FlatASCEND, a 14.5-million-parameter autoregressive clinical sequence model, training TopK SAEs on activations from all 10 residual-stream extraction points using the INSPECT (outpatient) and MIMIC-IV (ICU) datasets. The SAE decomposition reveals a gradient of abstraction across transformer depth: layer-0 features act as near-perfect token detectors (45.7% singleton), whereas layer-6 features span roughly 30 token types across multiple clinical categories (0.5% singleton). Under simple full-sequence linear probes, SAE features outperform dense representations for discrete-event prediction (mortality), while dense representations are stronger for continuous prediction (length of stay). This probe-level advantage does not, however, extend to clinically meaningful leakage-safe prediction windows.
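The decomposition described above is standard TopK sparse autoencoding of residual-stream activations. The paper's exact architecture and hyperparameters are not reproduced here, so the following is only a minimal sketch in PyTorch: the model width (256), dictionary size (4096), k value (32), and the stand-in random activations are illustrative assumptions, not the study's settings.

import torch
import torch.nn as nn

class TopKSAE(nn.Module):
    """Minimal TopK sparse autoencoder for residual-stream activations.
    Widths and k are illustrative, not the paper's settings."""
    def __init__(self, d_model: int = 256, d_hidden: int = 4096, k: int = 32):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)
        self.k = k

    def forward(self, x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        # Encode, then keep only the k largest pre-activations per token.
        pre = self.encoder(x)
        topk = torch.topk(pre, self.k, dim=-1)
        codes = torch.zeros_like(pre).scatter_(-1, topk.indices, torch.relu(topk.values))
        recon = self.decoder(codes)
        return recon, codes

# One training step: reconstruct activations cached at a single extraction point.
sae = TopKSAE()
opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
acts = torch.randn(1024, 256)          # stand-in for cached residual-stream activations
opt.zero_grad()
recon, codes = sae(acts)
loss = ((recon - acts) ** 2).mean()    # plain reconstruction loss; TopK enforces sparsity
loss.backward()
opt.step()

In practice one such SAE would be trained per extraction point, with the sparse codes then analysed for how many token types each feature fires on (the singleton percentages quoted above).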

Key facts

  • Sparse autoencoders applied to FlatASCEND, a 14.5M-parameter clinical sequence model
  • TopK SAEs trained on INSPECT and MIMIC-IV datasets
  • Layer-0 features are 45.7% singleton token detectors
  • Layer-6 features span ~30 token types across multiple clinical categories (0.5% singleton)
  • SAE features outperform dense representations for mortality prediction under full-sequence linear probes
  • Dense representations outperform SAE features for length of stay prediction
  • Probe-level advantage does not extend to leakage-safe prediction windows (a probe sketch follows this list)
  • Study published on arXiv (2605.04072)
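The probe comparison in the facts above is a linear probe fitted on one pooled feature vector per patient sequence. The sketch below shows the general setup only; the pooling scheme, feature dimensions, probe type, and metrics (logistic regression with AUROC for mortality, ridge regression with MAE for length of stay) are assumptions for illustration, and the random arrays stand in for real pooled SAE codes and dense activations.

import numpy as np
from sklearn.linear_model import LogisticRegression, Ridge
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, mean_absolute_error

# Stand-in arrays: one pooled feature vector per patient sequence (assumed shapes).
n = 2000
dense_feats = np.random.randn(n, 256)          # pooled residual-stream activations
sae_feats = np.abs(np.random.randn(n, 4096))   # pooled sparse SAE codes
mortality = np.random.randint(0, 2, n)         # discrete target
length_of_stay = np.random.rand(n) * 20        # continuous target, in days

def probe_discrete(X, y):
    """Logistic-regression probe for a binary outcome such as mortality."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])

def probe_continuous(X, y):
    """Ridge-regression probe for a continuous outcome such as length of stay."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    reg = Ridge().fit(X_tr, y_tr)
    return mean_absolute_error(y_te, reg.predict(X_te))

print("mortality AUROC, dense vs SAE:",
      probe_discrete(dense_feats, mortality), probe_discrete(sae_feats, mortality))
print("length-of-stay MAE, dense vs SAE:",
      probe_continuous(dense_feats, length_of_stay), probe_continuous(sae_feats, length_of_stay))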

Entities

Institutions

  • arXiv

Sources