Clinician Overrides as Implicit Preference Signals for Clinical AI

ai-technology · 2026-05-01

A new arXiv paper proposes treating clinician overrides of AI recommendations as implicit preference data. The signal resembles the one used in reinforcement learning from human feedback (RLHF) but is richer, because it carries domain expertise and observable outcomes. The authors introduce a five-category override taxonomy, a preference formulation conditioned on patient state, organizational context, and clinician capability, and a dual learning architecture that jointly trains a reward model and a capability model. The approach aims to prevent suppression bias, the systematic suppression of correct but difficult recommendations. The work targets value-based care settings.

Key facts

Clinician overrides of AI recommendations are reframed as implicit preference data.
The signal structure is similar to RLHF but richer due to domain expertise and observable outcomes.
A five-category override taxonomy maps override types to model update targets.
The preference formulation conditions on patient state s, organizational context c, and clinician capability kappa.
Kappa decomposes into execution capability kappa-exec and alignment capability kappa-align.
A dual learning architecture jointly trains a reward model and a capability model via alternating optimization.
The method aims to prevent suppression bias, the systematic suppression of correct-but-difficult recommendations.
The paper is published on arXiv with ID 2604.28010.
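
To make the alternating optimization concrete, here is a minimal sketch on synthetic data. It is not the paper's implementation. The acceptance model (a reward term multiplied by a capability term), the scalar per-clinician kappa (the paper splits it into kappa-exec and kappa-align), the difficulty variable d, the fixed slope, and all function names are assumptions made for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def simulate(n=4000, n_clinicians=5, seed=0):
    """Synthetic accept/override events (all quantities hypothetical)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, 3))            # stand-in for patient state s and context c
    d = rng.uniform(0.0, 2.0, size=n)      # assumed difficulty of acting on the recommendation
    j = rng.integers(0, n_clinicians, size=n)  # clinician index per event
    w_true = np.array([1.5, -1.0, 0.5])
    kappa_true = np.linspace(0.0, 2.0, n_clinicians)
    p = sigmoid(x @ w_true) * sigmoid(4.0 * (kappa_true[j] - d))
    y = (rng.uniform(size=n) < p).astype(float)  # 1 = accepted, 0 = overridden
    return x, d, j, y, w_true, kappa_true

def nll(w, kappa, x, d, j, y, slope=4.0):
    """Mean negative log-likelihood of the assumed acceptance model."""
    p = np.clip(sigmoid(x @ w) * sigmoid(slope * (kappa[j] - d)), 1e-6, 1 - 1e-6)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def fit_alternating(x, d, j, y, n_clinicians, rounds=50, inner=20,
                    lr_w=0.5, lr_k=0.1, slope=4.0):
    w = np.zeros(x.shape[1])               # reward model parameters
    kappa = np.ones(n_clinicians)          # capability model parameters
    counts = np.maximum(np.bincount(j, minlength=n_clinicians), 1)
    for _ in range(rounds):
        # Reward step: capability parameters are held fixed.
        for _ in range(inner):
            a = sigmoid(x @ w)
            b = sigmoid(slope * (kappa[j] - d))
            p = np.clip(a * b, 1e-6, 1 - 1e-6)
            dl_dp = -(y / p - (1 - y) / (1 - p))
            grad_w = ((dl_dp * b * a * (1 - a))[:, None] * x).mean(axis=0)
            w -= lr_w * grad_w
        # Capability step: reward parameters are held fixed.
        for _ in range(inner):
            a = sigmoid(x @ w)
            b = sigmoid(slope * (kappa[j] - d))
            p = np.clip(a * b, 1e-6, 1 - 1e-6)
            dl_dp = -(y / p - (1 - y) / (1 - p))
            g = dl_dp * a * slope * b * (1 - b)
            grad_k = np.bincount(j, weights=g, minlength=n_clinicians) / counts
            kappa -= lr_k * grad_k
    return w, kappa

x, d, j, y, w_true, kappa_true = simulate()
nll_init = nll(np.zeros(3), np.ones(5), x, d, j, y)
w_hat, kappa_hat = fit_alternating(x, d, j, y, n_clinicians=5)
nll_final = nll(w_hat, kappa_hat, x, d, j, y)
print("NLL before/after:", nll_init, nll_final)
print("estimated kappa per clinician:", kappa_hat)
```

The point of the split is that an override of a difficult recommendation can be attributed to the capability term rather than lowering the reward estimate, which is the intuition behind avoiding suppression bias. A reward-only model fit to the same data would have no such outlet.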
