Tool-Schema Compression Boosts Agentic RAG Under Tight Context Budgets
A new arXiv paper (2605.26165) examines how tool-schema definitions consume context-window space in agentic RAG systems. The researchers evaluated 14 language models (1.5B-32B parameters plus one frontier API model) over 6,566 controlled API calls at three context budgets (8K, 16K, and 32K tokens) with 28 tool definitions. TSCG conservative-profile compression cut schema tokens by 44-50%. At 8K tokens, the uncompressed JSON-schema definitions did not fit in the context window, and exact match averaged only 2.6%. With compression, exact match at 8K rose by +20.5 pp on average, and by +24.7 pp among the six models showing full enablement. At 32K tokens, four of five models showed a difference of 1 pp or less. The results indicate that schema compression matters most for retrieval-augmented generation when the context budget is tight.
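The budget arithmetic behind this result can be sketched in a few lines of Python. This is an illustration only, not the paper's method: the per-tool token count (300) is a hypothetical figure chosen so that 28 uncompressed schemas slightly exceed an 8K window, "8K" is assumed to mean 8,192 tokens, and the helper names are invented for this sketch. Only the 28 tools, the three budgets, and the 44-50% savings range come from the study.

```python
def compressed_tokens(schema_tokens: int, savings: float) -> int:
    """Tokens left after removing a fraction `savings` of the schema tokens."""
    return round(schema_tokens * (1.0 - savings))


def remaining_context(budget: int, schema_tokens: int) -> int:
    """Tokens left for retrieved passages and the answer (negative = overflow)."""
    return budget - schema_tokens


TOOLS = 28                # from the study
TOKENS_PER_TOOL = 300     # hypothetical average, not reported in the source
BUDGETS = {"8K": 8192, "16K": 16384, "32K": 32768}  # assumes K = 1024

uncompressed = TOOLS * TOKENS_PER_TOOL

for label, budget in BUDGETS.items():
    # 0% = uncompressed; 44% and 50% are the reported savings bounds.
    for savings in (0.0, 0.44, 0.50):
        used = compressed_tokens(uncompressed, savings)
        left = remaining_context(budget, used)
        status = "fits" if left >= 0 else "overflows"
        print(f"{label} budget, {savings:.0%} savings: "
              f"{used} schema tokens, {left} left ({status})")
```

Under these assumed numbers the uncompressed schemas overflow the 8K window while the compressed ones leave room for retrieved context. At 16K and 32K both variants fit, which is consistent with the small differences reported at 32K.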
Key facts
- Study evaluates 14 language models (1.5B-32B parameters plus one frontier API model)
- 6,566 controlled API calls across three context budgets (8K, 16K, 32K)
- 28 tool definitions used in the evaluation
- TSCG conservative-profile compression achieves 44-50% schema token savings
- At 8K tokens, uncompressed schemas yield 2.6% average exact match
- Compressed schemas provide +20.5 pp average exact-match lift at 8K
- Among six models showing full enablement, lift is +24.7 pp
- At 32K tokens, four of five models show ≤1 pp difference between compressed and uncompressed schemas
Entities
Institutions
- arXiv