Shepherd: A Formalized Runtime for Meta-Agent Execution Traces
Researchers have developed Shepherd, a functional programming model that formalizes a meta-agent's actions on target agents as functions; its core operations are mechanized in Lean. Shepherd records every interaction between an agent and its environment as a typed event in a Git-like execution trace, so any earlier state can be forked and replayed. The system forks the agent process and its filesystem five times faster than Docker and reuses more than 95% of the prompt cache on replay. In the reported experiments, a live supervisor intervention raised the pair-coding pass rate on CooperBench from 28.8% to 54.7%. In counterfactual meta-optimization, branching exploration beat the baselines by up to 11 points while cutting wall-clock time by up to 58%. In Tree-RL training, forking rollouts at specific turns improved TerminalBench-2 performance from 34.2% to 39.4%. The paper is available on arXiv.
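To make the idea of a Git-like trace of typed events more concrete, here is a minimal Python sketch. It is not Shepherd's API or implementation: the names `Event`, `Commit`, `Trace`, `append`, `fork`, and `replay`, the three event kinds, and the hash-based commit ids are all assumptions chosen for illustration. It shows only the data-structure idea (an append-only history with parent pointers, where a fork is a new branch pointing at an old commit) and does not fork processes, filesystems, or prompt caches.

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import dataclass
from typing import Literal, Optional

# Hypothetical event kinds; the paper's actual event types may differ.
EventKind = Literal["observation", "action", "intervention"]


@dataclass(frozen=True)
class Event:
    kind: EventKind
    payload: str


@dataclass(frozen=True)
class Commit:
    event: Event
    parent: Optional[str]  # id of the parent commit, None for the first event


class Trace:
    """Append-only history with named branches, loosely modeled on Git."""

    def __init__(self) -> None:
        self.commits: dict[str, Commit] = {}
        self.branches: dict[str, Optional[str]] = {"main": None}

    def append(self, branch: str, event: Event) -> str:
        """Record an event on a branch and return the new commit id."""
        parent = self.branches[branch]
        raw = json.dumps([parent, event.kind, event.payload]).encode()
        commit_id = hashlib.sha256(raw).hexdigest()[:12]
        self.commits[commit_id] = Commit(event, parent)
        self.branches[branch] = commit_id
        return commit_id

    def fork(self, at: str, new_branch: str) -> None:
        """Create a branch whose head is an existing commit (no copying)."""
        if at not in self.commits:
            raise KeyError(f"unknown commit: {at}")
        self.branches[new_branch] = at

    def replay(self, branch: str) -> list[Event]:
        """Return the branch's events in the order they were recorded."""
        events: list[Event] = []
        commit_id = self.branches[branch]
        while commit_id is not None:
            commit = self.commits[commit_id]
            events.append(commit.event)
            commit_id = commit.parent
        events.reverse()
        return events


# Usage: record three events, then branch off after the second one.
trace = Trace()
trace.append("main", Event("observation", "tests failing"))
second = trace.append("main", Event("action", "edit foo.py"))
trace.append("main", Event("action", "run tests"))

trace.fork(second, "alt")
trace.append("alt", Event("intervention", "supervisor: revert edit"))

main_events = trace.replay("main")
alt_events = trace.replay("alt")
```

In this sketch both branches share the first two commits, so replaying `alt` walks the same prefix as `main` before diverging. A shared prefix of this kind is what would allow a replay to reuse cached work, although how Shepherd achieves its reported prompt-cache reuse is not described in this summary.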
Key facts
- Shepherd is a functional programming model for meta-agent operations.
- Core operations are mechanized in Lean.
- It records agent-environment interactions as typed events in a Git-like execution trace.
- Forking the agent process and its filesystem is 5× faster than with Docker, and replay reuses >95% of the prompt cache.
- A live supervisor intervention at runtime improved the CooperBench pair-coding pass rate from 28.8% to 54.7%.
- Counterfactual meta-optimization outperformed baselines by up to 11 points.
- Wall-clock time reduced by up to 58% in counterfactual meta-optimization.
- Forking rollouts at specific turns during Tree-RL training improved TerminalBench-2 performance from 34.2% to 39.4%.
Entities
Institutions
- arXiv
- CooperBench
- TerminalBench-2
- Lean