POMDP Policy Robustness to Observation Perturbations Studied
A new paper on arXiv (2604.21256) introduces the Policy Observation Robustness Problem for Partially Observable Markov Decision Processes (POMDPs). The work analyzes how deviations in the observation model affect the performance of a fixed policy, considering sticky (state-action-dependent) and non-sticky (history-dependent) variants. The problem is formulated as a bi-level optimization in which the inner optimization is monotonic in the size of the deviation.
Key facts
- Paper introduces Policy Observation Robustness Problem for POMDPs
- Studies deviations in observation model
- Two variants: sticky and non-sticky
- Formulated as bi-level optimization problem
- Inner optimization monotonic in deviation size
- Published on arXiv with ID 2604.21256
- Focuses on robustness to calibration drift or sensor degradation
- Determines the maximum tolerable deviation under which a given value threshold is still guaranteed
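The monotonicity noted above has a practical consequence: if the worst-case policy value only degrades as the deviation budget grows, the maximum tolerable deviation for a given value threshold can be located by bisection. The sketch below illustrates this idea with a toy monotone value oracle (`worst_case_value` is a hypothetical stand-in, not the paper's evaluation procedure).

```python
# Sketch, assuming only that the worst-case value is monotone
# non-increasing in the deviation size eps. The oracle below is a
# toy surrogate, NOT the paper's actual inner optimization.

def worst_case_value(eps: float) -> float:
    """Toy monotone surrogate: worst-case policy value decays with eps."""
    return 1.0 / (1.0 + 4.0 * eps)

def max_tolerable_deviation(threshold: float, hi: float = 1.0,
                            tol: float = 1e-6) -> float:
    """Largest eps in [0, hi] with worst_case_value(eps) >= threshold."""
    lo = 0.0
    if worst_case_value(lo) < threshold:
        return 0.0  # even zero deviation fails the threshold
    if worst_case_value(hi) >= threshold:
        return hi   # threshold holds on the entire budget range
    # Invariant: value holds at lo, fails at hi; bisect the boundary.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if worst_case_value(mid) >= threshold:
            lo = mid
        else:
            hi = mid
    return lo
```

With this surrogate, a threshold of 0.8 yields a maximum deviation near 0.0625, since 1/(1 + 4·0.0625) = 0.8. The same bisection skeleton applies whenever the inner optimization is monotone, regardless of how the worst-case value is actually computed.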
Entities
Institutions
- arXiv