Agent Context Compilation: Training LLMs on Agent Trajectories

ai-technology · 2026-05-23

Agent Context Compilation (ACC) is a technique that converts agent trajectories from domains such as search, software engineering, and database querying into long-context question-answer pairs for training large language models. Standard agent supervised fine-tuning (SFT) masks tool responses and trains only turn-level tool selection, which leaves a supervision blind spot: the content of tool outputs receives no direct training signal. ACC addresses this by combining the original question with the tool responses and environment observations gathered over multiple turns, so the model must synthesize evidence scattered across the trajectory. The approach exploits the massive trajectories that agents already produce while solving problems, each containing tool calls and observations spread over many turns. As a result, it reduces the need for costly long-document curation or heuristic context synthesis.
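
The compilation step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the Turn type, the compile_context function, the "[Observation i]" labels, and the choice to use the trajectory's final answer as the training target are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One step of an agent trajectory (hypothetical schema for this sketch)."""
    role: str      # "user", "assistant", or "tool"
    content: str


def compile_context(trajectory: list[Turn], final_answer: str) -> dict[str, str]:
    """Merge the original question with all tool observations into one QA pair.

    Assumption: the first user turn holds the original question, and the
    agent's final answer is supplied separately as the target.
    """
    question = next(t.content for t in trajectory if t.role == "user")
    observations = [t.content for t in trajectory if t.role == "tool"]
    # Concatenate every observation, in order, into a single long context.
    context = "\n\n".join(
        f"[Observation {i}]\n{obs}" for i, obs in enumerate(observations, start=1)
    )
    return {"context": context, "question": question, "answer": final_answer}


trajectory = [
    Turn("user", "Which service owns the orders table?"),
    Turn("assistant", "search_docs: orders table owner"),
    Turn("tool", "schema.md: orders is defined in the billing database."),
    Turn("assistant", "run_sql: look up the owner of the billing database"),
    Turn("tool", "owner = payments-service"),
]
pair = compile_context(trajectory, final_answer="payments-service")
print(pair["context"])
```

In this toy trajectory, neither observation alone answers the question; the compiled context places both in one input so that a model trained on the pair has to read across them.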

Key facts

ACC converts agent trajectories into long-context QA pairs
Standard agent SFT masks tool responses and only trains turn-level tool selection
ACC combines original questions with tool responses and environment observations
Trajectories come from search, software engineering, and database querying agents
The method addresses the supervision blind spot in standard training
Agents produce massive trajectories when solving problems
Evidence needed to answer questions is scattered across multiple turns
ACC reduces the need for costly long-document curation or heuristic context synthesis
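
To make the supervision blind spot concrete, here is a small sketch of how a standard agent SFT loss mask might be built. The loss_mask function and the (role, tokens) segment format are hypothetical and chosen for illustration; real training pipelines differ in their details.

```python
def loss_mask(segments: list[tuple[str, list[str]]]) -> list[int]:
    """Return 1 for tokens that contribute to the loss, 0 for masked tokens.

    Assumption: only assistant tokens (tool selection and replies) are
    supervised, while user and tool tokens are masked out.
    """
    mask: list[int] = []
    for role, tokens in segments:
        supervised = 1 if role == "assistant" else 0
        mask.extend([supervised] * len(tokens))
    return mask


segments = [
    ("user", ["who", "owns", "orders"]),
    ("assistant", ["call", "search_docs"]),
    ("tool", ["orders", "is", "in", "billing"]),   # evidence, but masked
    ("assistant", ["payments-service"]),
]
mask = loss_mask(segments)
print(mask)
```

Under this masking scheme the tool response tokens carry the evidence yet contribute nothing to the loss, which is the gap that ACC's compiled QA pairs are meant to cover.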

Entities

—

Sources

arXiv cs.AI — 2026-05-23