HiMAC: Hierarchical Framework for Long-Horizon LLM Agent Planning
HiMAC (Hierarchical Macro-Micro Learning) is a newly proposed hierarchical agentic reinforcement learning framework that addresses limitations in long-horizon decision-making for large language model (LLM) agents. Flat autoregressive policies, which generate high-level reasoning and low-level actions in a single token sequence, suffer from inefficient exploration and severe error propagation over extended trajectories. HiMAC instead decomposes decision-making explicitly into macro-level planning and micro-level execution, modeling reasoning as structured blueprint generation followed by goal-conditioned action execution, and trains this hierarchy efficiently with a critic-free hierarchical policy optimization paradigm. The work is detailed in an arXiv paper (ID: 2603.00977), published on March 26, 2025, with announcement type 'replace', indicating a revised version. The approach aims to enable robust long-horizon planning in LLM-based agents, potentially improving performance on interactive tasks that require structured planning and reliable execution.
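The macro-micro decomposition described above can be sketched as two cooperating components: a planner that emits a structured blueprint of subgoals before any action is taken, and an executor that produces low-level actions conditioned on one subgoal at a time. The class names, method signatures, and string-based stand-ins below are illustrative assumptions, not interfaces from the HiMAC paper.

```python
# Hypothetical sketch of macro-level planning + micro-level execution.
# All names and interfaces here are assumptions for illustration only.

class MacroPlanner:
    """Macro level: turns a task description into a structured blueprint
    (an ordered list of subgoals) before any low-level action is taken."""

    def plan(self, task: str) -> list[str]:
        # Stand-in for an LLM call that generates a reasoning blueprint.
        return [f"{task}:step-{i}" for i in range(1, 4)]


class MicroExecutor:
    """Micro level: executes primitive actions conditioned on one subgoal."""

    def act(self, subgoal: str, observation: str) -> str:
        # Stand-in for goal-conditioned action generation.
        return f"action({subgoal} | {observation})"


def run_episode(task: str) -> list[str]:
    """Blueprint first, then goal-conditioned execution of each subgoal."""
    planner, executor = MacroPlanner(), MicroExecutor()
    blueprint = planner.plan(task)  # macro step happens once, up front
    trajectory, obs = [], "initial"
    for subgoal in blueprint:
        action = executor.act(subgoal, obs)  # micro step per subgoal
        trajectory.append(action)
        obs = f"obs_after({action})"  # environment transition stand-in
    return trajectory
```

Separating the blueprint from execution is what limits error propagation in this style of design: a mistake in one micro action perturbs later observations but does not rewrite the plan itself.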
Key facts
- HiMAC stands for Hierarchical Macro-Micro Learning.
- It is designed for long-horizon LLM agents.
- The framework decomposes decision-making into macro-level planning and micro-level execution.
- It uses a critic-free hierarchical policy optimization paradigm.
- Existing flat autoregressive policies cause inefficient exploration and error propagation.
- The paper is available on arXiv with ID 2603.00977.
- The announcement type is 'replace', indicating a revised version.
- The work was published on March 26, 2025.
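The summary notes that training is critic-free, i.e. the advantage signal is obtained without a learned value network. One common way to do this is to normalize rewards within a group of trajectories sampled for the same task, in the style of group-relative methods such as GRPO; the sketch below shows that general recipe as an assumption, since the paper's exact objective is not given here.

```python
import math

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Critic-free advantages: normalize each trajectory's reward
    against the mean and std of its own sampling group."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var) + 1e-8  # avoid division by zero
    return [(r - mean) / std for r in rewards]

def policy_gradient_weight(logprob: float, advantage: float) -> float:
    """Per-trajectory contribution to the objective: advantage * log pi.
    Trajectories above the group mean are reinforced, those below are
    suppressed, with no value critic involved."""
    return advantage * logprob
```

In a hierarchical setting, such group-normalized signals could in principle be computed per level (blueprints and actions), but how HiMAC assigns credit across the two levels is a detail of the paper itself.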
Entities
Institutions
- arXiv