Agent Capsules: Adaptive Execution for Multi-Agent LLM Pipelines

ai-technology · 2026-05-04

A new framework called Agent Capsules optimizes multi-agent LLM pipelines by treating execution as an optimization problem with quality constraints. It merges agents into fewer calls to save tokens, but monitors quality degradation from tool loss and prompt compression. The runtime scores composition opportunities, selects among three strategies (standard, two-phase, sequential), and gates mode switches on rolling-mean output quality. A controlled negative result shows that injecting more context worsens compression, so the escalation ladder recovers quality by moving toward per-agent dispatch. The controller uses LLM-judged quality metrics.

Key facts

Multi-agent pipeline with N agents typically issues N LLM calls per run.
Merging agents into fewer calls promises token savings but degrades quality.
Agent Capsules is an adaptive execution runtime.
It treats multi-agent pipeline execution as an optimization problem with empirical quality constraints.
The runtime scores composition opportunity and selects among three compound execution strategies.
Every mode switch is gated on rolling-mean output quality.
A controlled negative result confirms that injecting more context worsens compression.
The escalation ladder moves toward per-agent dispatch to recover quality.

Entities

—

Sources

arXiv cs.AI — 2026-05-04