PSPACE-Completeness of Multi-Environment POMDPs

other · 2026-05-11

A new paper shows that computing optimal values and policies in multi-environment partially observable Markov decision processes (MEPOMDPs) with finite-horizon objectives is PSPACE-complete. The result extends the known PSPACE-completeness of standard POMDPs to the more general MEPOMDP setting, in which the initial state is unknown and chosen adversarially. The authors also present a practical algorithm that significantly outperforms the only previously known algorithm on classical benchmarks. The paper is posted on arXiv under the Computer Science > Artificial Intelligence category.

Key facts

MEPOMDPs extend POMDPs with an adversarially chosen initial state.
The problem of computing optimal value and policy in MEPOMDPs with finite-horizon objectives is shown to be PSPACE-complete.
A practical algorithm is presented and evaluated on classical benchmarks.
The new algorithm significantly outperforms the only previously known algorithm.
The paper is listed under Computer Science > Artificial Intelligence on arXiv.
The arXiv ID is 2605.07537.
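
To make the adversarial initial state concrete, the sketch below brute-forces a worst-case finite-horizon value on a toy model. It is not the paper's algorithm. It is only an illustration under stated assumptions: the objective is read as "maximize, over policies, the minimum expected total reward over initial states"; policies are restricted to deterministic maps from observation histories to actions (randomized policies can do better in max-min settings, so the result is a lower bound in general); and the model encoding, the function names, and the two-state example are all hypothetical.

```python
from itertools import product

# Hypothetical toy model (not from the paper):
# MODEL[state][action] -> list of (probability, next_state, observation, reward).
# The hidden state never changes; an action earns 1 if it matches the state,
# and the observation emitted after each step reveals the state.
MODEL = {
    0: {0: [(1.0, 0, 0, 1.0)], 1: [(1.0, 0, 0, 0.0)]},
    1: {0: [(1.0, 1, 1, 0.0)], 1: [(1.0, 1, 1, 1.0)]},
}


def value(model, policy, state, history, steps_left):
    """Expected total reward of a history-based policy from a fixed state."""
    if steps_left == 0:
        return 0.0
    action = policy[history]
    total = 0.0
    for prob, nxt, obs, reward in model[state][action]:
        future = value(model, policy, nxt, history + (obs,), steps_left - 1)
        total += prob * (reward + future)
    return total


def histories(observations, horizon):
    """All observation histories the agent can have seen before acting."""
    for length in range(horizon):
        yield from product(observations, repeat=length)


def best_worst_case(model, actions, observations, horizon):
    """Enumerate deterministic policies; score each by its worst initial state."""
    hists = list(histories(observations, horizon))
    best_value, best_policy = float("-inf"), None
    for choice in product(actions, repeat=len(hists)):
        policy = dict(zip(hists, choice))
        # Assumption: the adversary picks the initial state after seeing the policy.
        worst = min(value(model, policy, s, (), horizon) for s in model)
        if worst > best_value:
            best_value, best_policy = worst, policy
    return best_value, best_policy


if __name__ == "__main__":
    print(best_worst_case(MODEL, actions=[0, 1], observations=[0, 1], horizon=2))
```

In this toy model the first action is taken blind, so the adversary can always make it miss; only the second action, taken after the state is observed, is guaranteed to pay off. The enumeration grows exponentially with the horizon, so it is usable only on tiny instances.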
