PSPACE-Completeness of Multi-Environment POMDPs
A new paper establishes that computing optimal values and policies in multi-environment partially observable Markov decision processes (MEPOMDPs) with finite-horizon objectives is PSPACE-complete. This extends the known PSPACE-completeness result for standard POMDPs to the more general MEPOMDP setting, where the initial state is unknown and adversarially chosen. The authors also present a practical algorithm that, on classical benchmarks, significantly outperforms the only previously known algorithm. The paper is posted on arXiv under the Computer Science > Artificial Intelligence category.
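To make the objective concrete, the following is a minimal brute-force sketch of a finite-horizon worst-case value: the best total reward a policy can guarantee when an adversary picks the initial state. It is not the paper's algorithm. It assumes a toy two-state model invented here for illustration, restricts attention to deterministic observation-based policy trees (which may be weaker than randomized policies against an adversary), and uses hypothetical names throughout (`policy_tree_values`, `worst_case_value`, `T`, `O`, `R`). The number of policy trees grows doubly exponentially with the horizon, so this only works for tiny instances.

```python
from itertools import product


def policy_tree_values(states, actions, observations, T, O, R, horizon):
    """Enumerate every deterministic policy tree of the given depth and
    return one value vector (dict: initial state -> expected total reward)
    per tree. Assumed conventions: T[s][a][s2] is the transition probability,
    O[s2][a][o] the probability of observing o after action a lands in s2,
    and R[s][a] the immediate reward."""
    # Depth 0: a single empty policy worth 0 from every state.
    vectors = [{s: 0.0 for s in states}]
    for _ in range(horizon):
        new_vectors = []
        for a in actions:
            # A tree is a root action plus one subtree per observation.
            for subtrees in product(vectors, repeat=len(observations)):
                value = {}
                for s in states:
                    total = R[s][a]
                    for s2 in states:
                        for o, sub in zip(observations, subtrees):
                            total += T[s][a][s2] * O[s2][a][o] * sub[s2]
                    value[s] = total
                new_vectors.append(value)
        vectors = new_vectors
    return vectors


def worst_case_value(states, actions, observations, T, O, R, horizon):
    """Max over policy trees of the min over initial states."""
    vectors = policy_tree_values(states, actions, observations, T, O, R, horizon)
    return max(min(v.values()) for v in vectors)


# Toy model (an assumption for illustration): the state never changes,
# observations reveal it exactly, and action "a0" pays 1 in "s0" while
# action "a1" pays 1 in "s1".
STATES = ["s0", "s1"]
ACTIONS = ["a0", "a1"]
OBSERVATIONS = ["s0", "s1"]
T = {s: {a: {s2: float(s == s2) for s2 in STATES} for a in ACTIONS} for s in STATES}
O = {s2: {a: {o: float(o == s2) for o in OBSERVATIONS} for a in ACTIONS} for s2 in STATES}
R = {"s0": {"a0": 1.0, "a1": 0.0}, "s1": {"a0": 0.0, "a1": 1.0}}

if __name__ == "__main__":
    for h in (1, 2, 3):
        print(h, worst_case_value(STATES, ACTIONS, OBSERVATIONS, T, O, R, h))
    # prints:
    # 1 0.0
    # 2 1.0
    # 3 2.0
```

In this toy model the first action is a blind guess, so the adversary can always make it worthless; every later step earns 1 because the observation has revealed the state. That is why the guaranteed value is the horizon minus one.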
Key facts
- MEPOMDPs extend POMDPs with an adversarially chosen initial state.
- Computing optimal values and policies in MEPOMDPs with finite-horizon objectives is shown to be PSPACE-complete.
- A practical algorithm is presented and evaluated on classical benchmarks.
- The new algorithm significantly outperforms the only previously known algorithm.
- The paper is listed under Computer Science > Artificial Intelligence on arXiv.
- The arXiv ID is 2605.07537.
Entities
Institutions
- arXiv