Bayesian Inverse Transition Learning for Model-Based RL from Near-Optimal Trajectories
Researchers propose Inverse Transition Learning, a constraint-based method for estimating transition dynamics from near-optimal expert trajectories in offline model-based reinforcement learning. Rather than treating the limited state-action coverage of expert data as a weakness, the approach exploits it: because the demonstrator is near-optimal, its behavior constrains which true dynamics T* are plausible, and these constraints are integrated into a Bayesian framework that yields a posterior over T*. Experiments on synthetic environments and a real healthcare task, managing hypotension in ICU patients, show improved decision-making, and the resulting posterior can predict whether a learned policy will transfer successfully.
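The summary does not spell out the algorithm, so the following is a minimal sketch, assuming a tabular MDP, of one way near-optimality constraints can enter a Bayesian estimate of dynamics: draw candidate T from a Dirichlet posterior over observed transition counts and keep only draws under which every demonstrated expert action is within eps of optimal, i.e. Q_T(s, a_expert) >= max_a Q_T(s, a) - eps. The helpers `value_iteration` and `constrained_posterior_samples`, and the rejection-sampling step itself, are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def value_iteration(T, R, gamma=0.95, iters=200):
    # Q-values for a tabular MDP with dynamics T[s, a, s'] and rewards R[s, a].
    Q = np.zeros(R.shape)
    for _ in range(iters):
        Q = R + gamma * T @ Q.max(axis=1)  # Bellman optimality backup
    return Q

def constrained_posterior_samples(counts, R, expert_sa, eps=0.05,
                                  n_samples=1000, alpha=1.0, seed=0):
    # Dirichlet posterior over each row T[s, a, :] from observed transition
    # counts, filtered by the near-optimality constraint: every demonstrated
    # (s, a) must satisfy Q_T(s, a) >= max_a' Q_T(s, a') - eps.
    rng = np.random.default_rng(seed)
    S, A, _ = counts.shape
    kept = []
    for _ in range(n_samples):
        T = np.array([[rng.dirichlet(alpha + counts[s, a]) for a in range(A)]
                      for s in range(S)])
        Q = value_iteration(T, R)
        if all(Q[s].max() - Q[s, a] <= eps for s, a in expert_sa):
            kept.append(T)  # this T is consistent with a near-optimal expert
    return kept
```

Rejection sampling is the bluntest way to impose such a constraint on a posterior; a real implementation would likely use a smoother likelihood-style weighting, but the principle, near-optimal behavior ruling out implausible dynamics, is the same.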
Key facts
- Method: Inverse Transition Learning
- Estimates transition dynamics T* from near-optimal expert trajectories
- Setting: offline model-based reinforcement learning
- Treats the expert data's limited coverage as informative: near-optimal behavior constrains the plausible T*
- Integrates constraints into a Bayesian approach
- Tested on synthetic environments and real healthcare scenarios
- Healthcare scenario: management of hypotension in ICU patients
- Demonstrates improved decision-making and posterior-based prediction of transfer success (see the sketch after this list)
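The last bullet, predicting transfer success from the posterior, can be illustrated under the same tabular assumptions: evaluate a candidate policy under every retained posterior sample of T and inspect the resulting distribution of values. A tight, high distribution suggests the policy is robust to residual dynamics uncertainty; a wide or low one flags risk before deployment. `transfer_value_distribution` below is a hypothetical helper, not the paper's metric.

```python
import numpy as np

def transfer_value_distribution(T_samples, R, policy, gamma=0.95):
    # Exact evaluation of a deterministic tabular policy (policy[s] = action)
    # under each posterior sample of T: V = (I - gamma * P_pi)^{-1} r_pi.
    values = []
    for T in T_samples:
        S = T.shape[0]
        P = T[np.arange(S), policy]   # (S, S) dynamics induced by the policy
        r = R[np.arange(S), policy]   # (S,) rewards induced by the policy
        V = np.linalg.solve(np.eye(S) - gamma * P, r)
        values.append(V.mean())
    return np.asarray(values)
```

For instance, `np.quantile(transfer_value_distribution(kept, R, policy), 0.05)` gives a pessimistic value estimate one could threshold before deciding whether the policy is safe to transfer.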