RACER: Adaptive Routing for Cost-Efficient LLM-as-a-Judge
A recent preprint on arXiv (2605.10805) presents RACER, a framework for Robust Adaptive Cost-Efficient Routing in LLM-as-a-Judge systems. The work compares reasoning-capable large language models (LLMs) against non-reasoning judges across a range of evaluation tasks. It finds that explicit reasoning improves accuracy on structured verification tasks such as mathematics and coding, but delivers minimal or even negative gains on simpler assessments while incurring substantially higher computational cost. RACER therefore selects between reasoning and non-reasoning judges under a fixed budget, casting routing as a constrained distributionally robust optimization problem that models distribution shift with KL-divergence uncertainty sets. The aim is to apply reasoning selectively rather than universally.
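The budget-constrained choice between a cheap non-reasoning judge and an expensive reasoning judge can be pictured with a minimal sketch. This is not the paper's algorithm: the difficulty scores, cost values, and the greedy upgrade rule below are all illustrative assumptions, standing in for whatever routing policy RACER actually learns.

```python
# Illustrative sketch only: route items between a cheap (non-reasoning)
# judge and an expensive (reasoning) judge under a total cost budget.
# Difficulty scores and per-call costs are hypothetical placeholders.

def route(items, budget, cost_reasoning=1.0, cost_cheap=0.1):
    """Assign each item to the "reasoning" or "cheap" judge.

    items: list of (item_id, difficulty) pairs; higher difficulty is a
           proxy for "likely to benefit from explicit reasoning".
    budget: total judging cost allowed across all items.
    """
    # Every item gets at least the cheap judge; spend what remains of
    # the budget upgrading the hardest items to the reasoning judge.
    remaining = budget - cost_cheap * len(items)
    upgrade_cost = cost_reasoning - cost_cheap
    assignment = {item_id: "cheap" for item_id, _ in items}
    for item_id, _ in sorted(items, key=lambda x: -x[1]):
        if remaining >= upgrade_cost:
            assignment[item_id] = "reasoning"
            remaining -= upgrade_cost
    return assignment
```

For example, with three items and a budget that covers only one upgrade, just the highest-difficulty item is sent to the reasoning judge; the rest fall back to the cheap one.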
Key facts
- arXiv preprint 2605.10805 introduces RACER
- Reasoning LLMs improve accuracy on math and coding tasks
- Reasoning offers limited gains on simpler evaluations
- Reasoning incurs significantly higher computational cost
- RACER dynamically selects judges under fixed budget
- Routing formulated as constrained distributionally robust optimization
- Accounts for distribution shift via KL-divergence
- Proposes selective use of reasoning
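The KL-divergence robustness mentioned above can be sketched numerically. A standard identity (not specific to this paper) gives the worst-case expected loss over a KL ball around the empirical distribution as a one-dimensional dual: sup over Q with KL(Q‖P) ≤ ρ of E_Q[ℓ] equals inf over t > 0 of t·log E_P[exp(ℓ/t)] + t·ρ. The grid-search minimization below is purely illustrative; RACER's actual optimization procedure is not described here.

```python
import math

def kl_worst_case_loss(losses, rho, t_grid=None):
    """Worst-case expected loss over a KL ball of radius rho around the
    empirical distribution of `losses`, via the standard dual form:

        sup_{KL(Q||P) <= rho} E_Q[l] = inf_{t>0} t*log E_P[exp(l/t)] + t*rho

    The infimum is approximated by a coarse grid search over t
    (illustrative only; a real solver would do this properly).
    """
    n = len(losses)
    if t_grid is None:
        t_grid = [0.05 * k for k in range(1, 400)]
    best = max(losses)  # the t -> 0 limit of the dual is the max loss
    for t in t_grid:
        val = t * math.log(sum(math.exp(l / t) for l in losses) / n) + t * rho
        best = min(best, val)
    return best
```

At ρ = 0 the worst case collapses to (approximately) the empirical mean, and as ρ grows it moves toward the maximum loss, which is what makes the routing policy conservative under distribution shift.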
Entities
Institutions
- arXiv