POMDP Policy Robustness to Observation Perturbations Studied
A new paper on arXiv (2604.21256) introduces the Policy Observation Robustness Problem for Partially Observable Markov Decision Processes (POMDPs). The work analyzes how deviations in the observation model affect the performance of a fixed policy, considering sticky (state-action-dependent) and non-sticky (history-dependent) variants. The problem is formulated as a bi-level optimization in which the inner optimization is monotonic in the size of the deviation.
Key facts
- Paper introduces Policy Observation Robustness Problem for POMDPs
- Studies deviations in observation model
- Two variants: sticky and non-sticky
- Formulated as bi-level optimization problem
- Inner optimization monotonic in deviation size
- Published on arXiv with ID 2604.21256
- Focuses on robustness to calibration drift or sensor degradation
- Determines the maximum tolerable deviation under which a given value threshold is still guaranteed
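The monotonicity noted above has a practical consequence: if the worst-case policy value only degrades as the deviation budget grows, the maximum tolerable deviation for a given value threshold can be located by bisection. The sketch below illustrates this idea with a toy monotone value oracle (`worst_case_value` is a hypothetical stand-in, not the paper's evaluation procedure).

```python
# Sketch, assuming only that the worst-case value is monotone
# non-increasing in the deviation size eps. The oracle below is a
# toy surrogate, NOT the paper's actual inner optimization.

def worst_case_value(eps: float) -> float:
    """Toy monotone surrogate: worst-case policy value decays with eps."""
    return 1.0 / (1.0 + 4.0 * eps)

def max_tolerable_deviation(threshold: float, hi: float = 1.0,
                            tol: float = 1e-6) -> float:
    """Largest eps in [0, hi] with worst_case_value(eps) >= threshold."""
    lo = 0.0
    if worst_case_value(lo) < threshold:
        return 0.0  # even zero deviation fails the threshold
    if worst_case_value(hi) >= threshold:
        return hi   # threshold holds on the entire budget range
    # Invariant: value holds at lo, fails at hi; bisect the boundary.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if worst_case_value(mid) >= threshold:
            lo = mid
        else:
            hi = mid
    return lo
```

With this surrogate, a threshold of 0.8 yields a maximum deviation near 0.0625, since 1/(1 + 4·0.0625) = 0.8. The same bisection skeleton applies whenever the inner optimization is monotone, regardless of how the worst-case value is actually computed.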
Entities
Institutions
- arXiv