HiMAC: Hierarchical Framework for Long-Horizon LLM Agent Planning
HiMAC (Hierarchical Macro-Micro Learning) is a newly proposed hierarchical agentic reinforcement learning framework that addresses limitations in long-horizon decision-making for large language model (LLM) agents. Flat autoregressive policies, which generate high-level reasoning and low-level actions in a single token sequence, suffer from inefficient exploration and severe error propagation over extended trajectories. HiMAC instead decomposes decision-making explicitly into macro-level planning and micro-level execution, modeling reasoning as structured blueprint generation followed by goal-conditioned action execution, and trains this hierarchy efficiently with a critic-free hierarchical policy optimization paradigm. The work is detailed in an arXiv paper (ID: 2603.00977), published on March 26, 2025, with announcement type 'replace', indicating a revised version. The approach aims to enable robust long-horizon planning in LLM-based agents, potentially improving performance on interactive tasks that require structured planning and reliable execution.
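The macro-micro decomposition described above can be sketched as two cooperating components: a planner that emits a structured blueprint of subgoals before any action is taken, and an executor that produces low-level actions conditioned on one subgoal at a time. The class names, method signatures, and string-based stand-ins below are illustrative assumptions, not interfaces from the HiMAC paper.

```python
# Hypothetical sketch of macro-level planning + micro-level execution.
# All names and interfaces here are assumptions for illustration only.

class MacroPlanner:
    """Macro level: turns a task description into a structured blueprint
    (an ordered list of subgoals) before any low-level action is taken."""

    def plan(self, task: str) -> list[str]:
        # Stand-in for an LLM call that generates a reasoning blueprint.
        return [f"{task}:step-{i}" for i in range(1, 4)]


class MicroExecutor:
    """Micro level: executes primitive actions conditioned on one subgoal."""

    def act(self, subgoal: str, observation: str) -> str:
        # Stand-in for goal-conditioned action generation.
        return f"action({subgoal} | {observation})"


def run_episode(task: str) -> list[str]:
    """Blueprint first, then goal-conditioned execution of each subgoal."""
    planner, executor = MacroPlanner(), MicroExecutor()
    blueprint = planner.plan(task)  # macro step happens once, up front
    trajectory, obs = [], "initial"
    for subgoal in blueprint:
        action = executor.act(subgoal, obs)  # micro step per subgoal
        trajectory.append(action)
        obs = f"obs_after({action})"  # environment transition stand-in
    return trajectory
```

Separating the blueprint from execution is what limits error propagation in this style of design: a mistake in one micro action perturbs later observations but does not rewrite the plan itself.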
Key facts
- HiMAC stands for Hierarchical Macro-Micro Learning.
- It is designed for long-horizon LLM agents.
- The framework decomposes decision-making into macro-level planning and micro-level execution.
- It uses a critic-free hierarchical policy optimization paradigm.
- Existing flat autoregressive policies cause inefficient exploration and error propagation.
- The paper is available on arXiv with ID 2603.00977.
- The announcement type is 'replace', indicating a revised version.
- The work was published on March 26, 2025.
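The summary notes that training is critic-free, i.e. the advantage signal is obtained without a learned value network. One common way to do this is to normalize rewards within a group of trajectories sampled for the same task, in the style of group-relative methods such as GRPO; the sketch below shows that general recipe as an assumption, since the paper's exact objective is not given here.

```python
import math

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Critic-free advantages: normalize each trajectory's reward
    against the mean and std of its own sampling group."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var) + 1e-8  # avoid division by zero
    return [(r - mean) / std for r in rewards]

def policy_gradient_weight(logprob: float, advantage: float) -> float:
    """Per-trajectory contribution to the objective: advantage * log pi.
    Trajectories above the group mean are reinforced, those below are
    suppressed, with no value critic involved."""
    return advantage * logprob
```

In a hierarchical setting, such group-normalized signals could in principle be computed per level (blueprints and actions), but how HiMAC assigns credit across the two levels is a detail of the paper itself.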
Entities
Institutions
- arXiv