ARTFEED — Contemporary Art Intelligence

Hierarchical Adaptive Refinement Accelerates Policy Synthesis in Large MDPs

other · 2026-05-01

A new approach to policy synthesis in Markov decision processes (MDPs) uses hierarchical adaptive refinement to tame large state spaces. The method iteratively selects the most fragile regions of the MDP and refines only where necessary, balancing accuracy against efficiency. Under standard assumptions the composed policy is near-optimal, with error bounded by the local solver tolerance and the boundary mismatch between regions. On MDPs with up to 1 million states, the approach achieves up to a 2× speedup over PRISM, making it a practical option for real-world policy synthesis in domains such as software product lines and robotics.

Key facts

  • Approach accelerates policy synthesis in large MDPs via hierarchical adaptive refinement
  • Dynamically refines MDP by iteratively selecting most fragile regions
  • Balances accuracy and efficiency by refining only when necessary
  • Composed policy is near-optimal under standard assumptions
  • Error bounded by local solver tolerance and boundary mismatch
  • Demonstrated on MDPs up to 1M states
  • Achieves up to 2× speedup over PRISM
  • Applicable to software product lines and robotics
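The underlying paper is not reproduced here, so the loop below is only an illustrative sketch of the general idea named in the key facts: solve a coarse abstraction of the MDP, measure a per-region Bellman residual on the full model, and split (refine) only the most fragile region until the residual falls below a tolerance. It is not the authors' algorithm; the function names, the aggregation scheme (averaging within regions), and the toy chain MDP are all assumptions made for illustration.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Plain value iteration; P has shape (A, S, S), R has shape (A, S)."""
    V = np.zeros(P.shape[1])
    for _ in range(max_iter):
        V_new = (R + gamma * P @ V).max(axis=0)   # Bellman optimality backup
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V

def aggregate(P, R, regions):
    """Build a coarse MDP whose states are regions (averaged dynamics)."""
    A, k = P.shape[0], len(regions)
    P_agg, R_agg = np.zeros((A, k, k)), np.zeros((A, k))
    for i, ri in enumerate(regions):
        R_agg[:, i] = R[:, ri].mean(axis=1)
        for j, rj in enumerate(regions):
            # probability of jumping from region i to region j, averaged over i
            P_agg[:, i, j] = P[:, ri][:, :, rj].sum(axis=2).mean(axis=1)
    return P_agg, R_agg

def adaptive_refine(P, R, gamma=0.9, tol=1e-3, max_rounds=50):
    """Refine only the most fragile region until the residual is below tol."""
    S = P.shape[1]
    regions = [list(range(S))]                 # start from one coarse block
    for _ in range(max_rounds):
        P_agg, R_agg = aggregate(P, R, regions)
        V_agg = value_iteration(P_agg, R_agg, gamma)
        V = np.empty(S)
        for i, r in enumerate(regions):        # lift coarse values to states
            V[r] = V_agg[i]
        # Bellman residual on the *full* MDP measures how fragile a state is
        residual = np.abs((R + gamma * P @ V).max(axis=0) - V)
        if residual.max() < tol:
            break                              # composed value is good enough
        splittable = [i for i, r in enumerate(regions) if len(r) > 1]
        if not splittable:
            break
        worst = max(splittable, key=lambda i: residual[regions[i]].max())
        r = regions.pop(worst)                 # refine only the worst region
        regions += [r[: len(r) // 2], r[len(r) // 2:]]
    policy = (R + gamma * P @ V).argmax(axis=0)
    return V, policy, regions
```

On a small goal-reaching chain MDP this converges to the optimal policy while typically solving much smaller aggregated models along the way, which is the accuracy/efficiency trade the summary describes.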
