Bayesian Inverse Transition Learning for Model-Based RL from Near-Optimal Trajectories
Researchers propose Inverse Transition Learning, a constraint-based method for estimating transition dynamics from near-optimal expert trajectories in offline model-based reinforcement learning. Rather than treating the limited state-action coverage of expert data as a weakness, the approach exploits it: because the demonstrator is near-optimal, its behavior constrains which true dynamics T* are plausible, and these constraints are integrated into a Bayesian framework that yields a posterior over T*. Experiments on synthetic environments and a real healthcare task, managing hypotension in ICU patients, show improved decision-making, and the resulting posterior can predict whether a learned policy will transfer successfully.
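The summary does not spell out the algorithm, so the following is a minimal sketch, assuming a tabular MDP, of one way near-optimality constraints can enter a Bayesian estimate of dynamics: draw candidate T from a Dirichlet posterior over observed transition counts and keep only draws under which every demonstrated expert action is within eps of optimal, i.e. Q_T(s, a_expert) >= max_a Q_T(s, a) - eps. The helpers `value_iteration` and `constrained_posterior_samples`, and the rejection-sampling step itself, are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def value_iteration(T, R, gamma=0.95, iters=200):
    # Q-values for a tabular MDP with dynamics T[s, a, s'] and rewards R[s, a].
    Q = np.zeros(R.shape)
    for _ in range(iters):
        Q = R + gamma * T @ Q.max(axis=1)  # Bellman optimality backup
    return Q

def constrained_posterior_samples(counts, R, expert_sa, eps=0.05,
                                  n_samples=1000, alpha=1.0, seed=0):
    # Dirichlet posterior over each row T[s, a, :] from observed transition
    # counts, filtered by the near-optimality constraint: every demonstrated
    # (s, a) must satisfy Q_T(s, a) >= max_a' Q_T(s, a') - eps.
    rng = np.random.default_rng(seed)
    S, A, _ = counts.shape
    kept = []
    for _ in range(n_samples):
        T = np.array([[rng.dirichlet(alpha + counts[s, a]) for a in range(A)]
                      for s in range(S)])
        Q = value_iteration(T, R)
        if all(Q[s].max() - Q[s, a] <= eps for s, a in expert_sa):
            kept.append(T)  # this T is consistent with a near-optimal expert
    return kept
```

Rejection sampling is the bluntest way to impose such a constraint on a posterior; a real implementation would likely use a smoother likelihood-style weighting, but the principle, near-optimal behavior ruling out implausible dynamics, is the same.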
Key facts
- Method: Inverse Transition Learning
- Estimates transition dynamics T* from near-optimal expert trajectories
- Setting: offline model-based reinforcement learning
- Treats the expert data's limited coverage as informative: near-optimal behavior constrains the plausible T*
- Integrates constraints into a Bayesian approach
- Tested on synthetic environments and real healthcare scenarios
- Healthcare scenario: management of hypotension in ICU patients
- Demonstrates improved decision-making and posterior-based prediction of transfer success (see the sketch after this list)
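The last bullet, predicting transfer success from the posterior, can be illustrated under the same tabular assumptions: evaluate a candidate policy under every retained posterior sample of T and inspect the resulting distribution of values. A tight, high distribution suggests the policy is robust to residual dynamics uncertainty; a wide or low one flags risk before deployment. `transfer_value_distribution` below is a hypothetical helper, not the paper's metric.

```python
import numpy as np

def transfer_value_distribution(T_samples, R, policy, gamma=0.95):
    # Exact evaluation of a deterministic tabular policy (policy[s] = action)
    # under each posterior sample of T: V = (I - gamma * P_pi)^{-1} r_pi.
    values = []
    for T in T_samples:
        S = T.shape[0]
        P = T[np.arange(S), policy]   # (S, S) dynamics induced by the policy
        r = R[np.arange(S), policy]   # (S,) rewards induced by the policy
        V = np.linalg.solve(np.eye(S) - gamma * P, r)
        values.append(V.mean())
    return np.asarray(values)
```

For instance, `np.quantile(transfer_value_distribution(kept, R, policy), 0.05)` gives a pessimistic value estimate one could threshold before deciding whether the policy is safe to transfer.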