CANTANTE: A New Framework for Optimizing LLM-Based Multi-Agent Systems
A recent study introduces CANTANTE, a framework for optimizing LLM-based multi-agent systems that targets the credit-assignment problem. In these systems, individual agent configurations are hard to tune because performance is observed only at the level of the whole system. CANTANTE addresses this by decomposing the system-level reward into per-agent update signals, which it derives by contrasting rollouts of multiple joint configurations on the same query. The study instantiates the framework for prompt optimization, treating agent prompts as learnable system parameters. Evaluated on programming (MBPP), mathematical reasoning (GSM8K), and multi-hop question answering (HotpotQA), CANTANTE achieved the best average rank against the baseline optimizers GEPA and MIPROv2. The paper is available on arXiv under the identifier 2605.13295.
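To make the credit-assignment idea concrete, here is a minimal Python sketch of one way a system-level reward could be split into per-agent signals by contrasting several joint configurations on the same query. It is an illustration under stated assumptions, not the paper's algorithm: the function name `per_agent_signals`, the agent names `planner` and `coder`, the prompt-variant ids, and the mean-difference contrast are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def per_agent_signals(rollouts):
    """Split system-level rewards into per-agent signals for ONE query.

    rollouts: list of (joint_config, reward) pairs, where joint_config maps
    each agent name to the prompt variant it used in that rollout.
    Returns {agent: {variant: signal}}, where the signal is the mean reward
    of rollouts using that variant minus the mean reward over all rollouts.
    NOTE: this mean-difference contrast is an assumption made for
    illustration; it is not taken from the CANTANTE paper.
    """
    baseline = mean(reward for _, reward in rollouts)
    grouped = defaultdict(lambda: defaultdict(list))
    for config, reward in rollouts:
        for agent, variant in config.items():
            grouped[agent][variant].append(reward)
    return {
        agent: {v: mean(rs) - baseline for v, rs in variants.items()}
        for agent, variants in grouped.items()
    }


# Hypothetical two-agent system; rewards are made up for the example.
rollouts = [
    ({"planner": "p0", "coder": "c0"}, 0.0),
    ({"planner": "p0", "coder": "c1"}, 1.0),
    ({"planner": "p1", "coder": "c0"}, 0.0),
    ({"planner": "p1", "coder": "c1"}, 1.0),
]
signals = per_agent_signals(rollouts)
print(signals)
```

In this toy case only the `coder` prompt changes the outcome, so the `coder` variants receive nonzero signals while both `planner` variants receive zero. A single system-level score could not show this without the contrast across configurations.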
Key facts
- CANTANTE is a framework for optimizing LLM-based multi-agent systems.
- It addresses the credit-assignment problem in multi-agent systems.
- The framework decomposes system-level rewards into per-agent update signals.
- It contrasts rollouts of multiple joint configurations on the same query.
- CANTANTE is instantiated for prompt optimization.
- Agent prompts are treated as learnable system parameters.
- Evaluated on MBPP, GSM8K, and HotpotQA benchmarks.
- CANTANTE achieves the best average rank against GEPA and MIPROv2.
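
The average-rank comparison in the last item can be illustrated with a short Python sketch. It assumes higher scores are better and that tied methods share the better rank. The method names, task names, and scores are hypothetical placeholders, not results from the paper, and the paper's exact ranking procedure may differ.

```python
def average_ranks(scores):
    """Return {method: mean rank across benchmarks}, where rank 1 is best.

    scores: {benchmark: {method: score}}, with higher scores being better.
    A method's rank on a benchmark is 1 plus the number of methods that
    scored strictly higher (an assumed tie-handling convention).
    """
    methods = list(next(iter(scores.values())))
    totals = {m: 0 for m in methods}
    for per_method in scores.values():
        for m in methods:
            totals[m] += 1 + sum(
                1 for other in methods if per_method[other] > per_method[m]
            )
    return {m: totals[m] / len(scores) for m in methods}


# Placeholder scores for illustration only (NOT the paper's numbers).
scores = {
    "task_1": {"method_a": 0.80, "method_b": 0.75, "method_c": 0.70},
    "task_2": {"method_a": 0.60, "method_b": 0.65, "method_c": 0.55},
    "task_3": {"method_a": 0.90, "method_b": 0.85, "method_c": 0.88},
}
ranks = average_ranks(scores)
print(ranks)
```

In this example `method_a` has the best average rank even though it is not first on `task_2`, so a best average rank does not by itself imply a win on every benchmark.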
Entities
Institutions
- arXiv