Self-Regulated Simulative Planning for Efficient Agentic Reasoning
A new paper on arXiv (2605.22138) proposes a framework for efficient agentic reasoning that decomposes decision-making into three systems: reactive execution (System I) for fine-grained action, simulative reasoning (System II) for future-state prediction via a world model, and self-regulation (System III) for deciding when and how deeply to plan. The approach targets two weaknesses of end-to-end trained reactive policies that rely on adaptive computation such as chain-of-thought: inefficient token use and unreliable accuracy gains. Simulative reasoning provides a single planning mechanism across tasks without per-domain engineering, while self-regulation ensures that planning is invoked only when needed.
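The three-system control loop described above can be illustrated with a toy sketch. This is not the paper's implementation: the 1-D world, the reward values, the exhaustive lookahead, and every name here (`world_model`, `reactive_policy`, `plan`, `configurator`, `run_episode`) are assumptions made for illustration. In particular, the paper describes System III as a learned configurator, whereas this sketch substitutes a hand-written rule.

```python
from typing import Callable, List, Tuple

# Toy 1-D world (all values are illustrative assumptions, not from the paper).
ACTIONS: List[int] = [1, 2]   # step forward by 1, or jump forward by 2
GOAL: int = 5                 # episode ends once state >= GOAL
TRAPS = {3}                   # landing here is heavily penalized


def reward(state: int) -> float:
    """Reward for arriving in `state`."""
    if state in TRAPS:
        return -100.0
    if state >= GOAL:
        return 10.0
    return -1.0  # small per-step cost


def world_model(state: int, action: int) -> int:
    """Predicts the next state. In this toy it matches the environment exactly."""
    return state + action


def reactive_policy(state: int) -> int:
    """System I stand-in: a cheap default action with no lookahead."""
    return 1


def best_return(state: int, depth: int) -> float:
    """Best cumulative reward reachable within `depth` simulated steps."""
    if depth == 0 or state >= GOAL:
        return 0.0
    return max(
        reward(world_model(state, a)) + best_return(world_model(state, a), depth - 1)
        for a in ACTIONS
    )


def plan(state: int, depth: int) -> int:
    """System II stand-in: choose the first action of the best simulated rollout."""
    def score(a: int) -> float:
        nxt = world_model(state, a)
        return reward(nxt) + best_return(nxt, depth - 1)
    return max(ACTIONS, key=score)


def configurator(state: int) -> int:
    """System III stand-in: returns a planning depth, 0 meaning 'just react'.

    Hand-written rule (the paper's configurator is learned): plan two steps
    ahead only when a trap is within reach of the next action.
    """
    return 2 if any((state + a) in TRAPS for a in ACTIONS) else 0


def run_episode(
    decide_depth: Callable[[int], int] = configurator,
    start: int = 0,
    max_steps: int = 20,
) -> Tuple[List[int], int]:
    """Runs one episode; returns the visited states and the number of planned steps."""
    state, trajectory, planned_steps = start, [start], 0
    while state < GOAL and len(trajectory) <= max_steps:
        depth = decide_depth(state)
        if depth > 0:
            action = plan(state, depth)
            planned_steps += 1
        else:
            action = reactive_policy(state)
        state = world_model(state, action)
        trajectory.append(state)
    return trajectory, planned_steps


if __name__ == "__main__":
    print(run_episode())                 # self-regulated planning
    print(run_episode(lambda s: 0))      # reactive-only baseline
```

In this toy setting the self-regulated agent plans on only two of its four steps and avoids the trap, while the reactive-only baseline walks straight into it. The point of the sketch is the control flow: the configurator gates how much simulation is spent per step, so planning cost is paid only where the cheap policy is likely to fail.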
Key facts
- Paper is on arXiv with ID 2605.22138
- Proposes three-system decomposition: simulative reasoning, self-regulation, reactive execution
- System II grounds deliberation in future-state prediction via a world model
- System III is a learned configurator deciding when and how deeply to plan
- System I handles reactive execution
- Aims to reduce reasoning length and improve token efficiency
- Provides unified planning across tasks without per-domain engineering
- Contrasts with end-to-end trained reactive policies that use adaptive computation such as chain-of-thought
Entities
Institutions
- arXiv