Contextual Bandit Preference Learning for Ventilator Decision Support

ai-technology · 2026-05-25

A new multi-agent framework for ventilator decision support uses contextual bandit preference learning to adapt to the tuning styles of individual clinicians. The Ventilator Decision Support System (VDSS) coordinates modular decision components through contract-driven structured interfaces and produces traceable evidence for review. It adapts online, updating clinician-specific preferences from the final accepted decision in each adjustment cycle. Structured rejection feedback triggers targeted replanning, which aims to reduce unproductive iterations and improve interaction stability. The system is designed to address two gaps: rule-based methods seldom achieve personalization, and end-to-end reinforcement learning or large language model systems are difficult to control and audit. The framework is described in a paper on arXiv (2605.23320).
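
The per-clinician preference update can be illustrated with a minimal sketch. This is not the paper's implementation: the epsilon-greedy rule with optimistic initial values, the three arm names, the string context label, and the class name ClinicianPreferenceBandit are all assumptions chosen for illustration. The reward is 1 when the proposed style matches the clinician's final accepted decision and 0 otherwise.

```python
import random
from collections import defaultdict

# Hypothetical candidate adjustment styles (the bandit's arms); not from the paper.
ARMS = ("conservative", "moderate", "aggressive")


class ClinicianPreferenceBandit:
    """Epsilon-greedy contextual bandit with one value table per clinician and context."""

    def __init__(self, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # (clinician_id, context) -> arm -> [pull count, mean reward].
        # Means start at 1.0 (optimistic) so every arm is tried at least once.
        self.stats = defaultdict(lambda: {arm: [0, 1.0] for arm in ARMS})

    def propose(self, clinician_id, context):
        """Choose which style to propose for this clinician in this context."""
        table = self.stats[(clinician_id, context)]
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ARMS)  # explore
        return max(ARMS, key=lambda arm: table[arm][1])  # exploit

    def update(self, clinician_id, context, proposed, accepted):
        """Score the proposed arm against the final accepted decision."""
        reward = 1.0 if proposed == accepted else 0.0
        entry = self.stats[(clinician_id, context)][proposed]
        entry[0] += 1
        entry[1] += (reward - entry[1]) / entry[0]  # incremental mean
```

In use, each adjustment cycle would call propose, show the result to the clinician, and then call update with whatever the clinician finally accepted. Because the table is keyed by clinician, two clinicians facing the same context can converge to different proposals.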

Key facts

Proposed system: Ventilator Decision Support System (VDSS)
Uses a contextual bandit for online preference adaptation
Multi-agent framework with modular decision components
Contract-driven structured interfaces for coordination
Produces traceable evidence for review
Structured rejection feedback triggers targeted replanning
Addresses limitations of rule-based and end-to-end RL/LLM systems
Paper available on arXiv: 2605.23320
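
As a rough illustration of how structured rejection feedback could trigger targeted replanning, the sketch below re-runs only the component responsible for the rejected setting and appends a trace entry for review. Every name here (Proposal, Rejection, PLANNERS, targeted_replan), the field layout, and the numeric adjustments are hypothetical placeholders, not details from the paper and not clinical values.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Proposal:
    """Illustrative contract for a proposal passed between components."""
    tidal_volume_ml: int
    peep_cmh2o: int
    evidence: tuple  # trace entries kept for later review


@dataclass(frozen=True)
class Rejection:
    """Structured feedback naming the rejected setting and the stated reason."""
    target: str
    reason: str


# One hypothetical planner per setting; each returns (new value, trace entry).
def replan_tidal_volume(current, reason):
    return current - 20, f"tidal_volume_ml replanned after rejection: {reason}"


def replan_peep(current, reason):
    return current + 1, f"peep_cmh2o replanned after rejection: {reason}"


PLANNERS = {"tidal_volume_ml": replan_tidal_volume, "peep_cmh2o": replan_peep}


def targeted_replan(proposal, rejection):
    """Re-run only the planner for the rejected setting; keep the rest unchanged."""
    planner = PLANNERS[rejection.target]
    new_value, note = planner(getattr(proposal, rejection.target), rejection.reason)
    return replace(
        proposal,
        **{rejection.target: new_value},
        evidence=proposal.evidence + (note,),
    )
```

The design point is that a rejection carrying a named target lets the system revise one setting rather than regenerate the whole plan, while the evidence tuple records why each revision happened.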
