RACER: Adaptive Routing for Cost-Efficient LLM-as-a-Judge
A recent preprint on arXiv (2605.10805) presents RACER, a framework for Robust Adaptive Cost-Efficient Routing in LLM-as-a-Judge systems. The work compares reasoning-capable large language models (LLMs) against non-reasoning judges across a range of evaluation tasks. It finds that explicit reasoning improves accuracy on structured verification tasks such as mathematics and coding, but delivers minimal or even negative gains on simpler assessments while incurring substantially higher computational cost. RACER therefore selects between reasoning and non-reasoning judges under a fixed budget, casting routing as a constrained distributionally robust optimization problem that models distribution shift with KL-divergence uncertainty sets. The aim is to apply reasoning selectively rather than universally.
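The budget-constrained choice between a cheap non-reasoning judge and an expensive reasoning judge can be pictured with a minimal sketch. This is not the paper's algorithm: the difficulty scores, cost values, and the greedy upgrade rule below are all illustrative assumptions, standing in for whatever routing policy RACER actually learns.

```python
# Illustrative sketch only: route items between a cheap (non-reasoning)
# judge and an expensive (reasoning) judge under a total cost budget.
# Difficulty scores and per-call costs are hypothetical placeholders.

def route(items, budget, cost_reasoning=1.0, cost_cheap=0.1):
    """Assign each item to the "reasoning" or "cheap" judge.

    items: list of (item_id, difficulty) pairs; higher difficulty is a
           proxy for "likely to benefit from explicit reasoning".
    budget: total judging cost allowed across all items.
    """
    # Every item gets at least the cheap judge; spend what remains of
    # the budget upgrading the hardest items to the reasoning judge.
    remaining = budget - cost_cheap * len(items)
    upgrade_cost = cost_reasoning - cost_cheap
    assignment = {item_id: "cheap" for item_id, _ in items}
    for item_id, _ in sorted(items, key=lambda x: -x[1]):
        if remaining >= upgrade_cost:
            assignment[item_id] = "reasoning"
            remaining -= upgrade_cost
    return assignment
```

For example, with three items and a budget that covers only one upgrade, just the highest-difficulty item is sent to the reasoning judge; the rest fall back to the cheap one.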
Key facts
- arXiv preprint 2605.10805 introduces RACER
- Reasoning LLMs improve accuracy on math and coding tasks
- Reasoning offers limited gains on simpler evaluations
- Reasoning incurs significantly higher computational cost
- RACER dynamically selects judges under fixed budget
- Routing formulated as constrained distributionally robust optimization
- Accounts for distribution shift via KL-divergence
- Proposes selective use of reasoning
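The KL-divergence robustness mentioned above can be sketched numerically. A standard identity (not specific to this paper) gives the worst-case expected loss over a KL ball around the empirical distribution as a one-dimensional dual: sup over Q with KL(Q‖P) ≤ ρ of E_Q[ℓ] equals inf over t > 0 of t·log E_P[exp(ℓ/t)] + t·ρ. The grid-search minimization below is purely illustrative; RACER's actual optimization procedure is not described here.

```python
import math

def kl_worst_case_loss(losses, rho, t_grid=None):
    """Worst-case expected loss over a KL ball of radius rho around the
    empirical distribution of `losses`, via the standard dual form:

        sup_{KL(Q||P) <= rho} E_Q[l] = inf_{t>0} t*log E_P[exp(l/t)] + t*rho

    The infimum is approximated by a coarse grid search over t
    (illustrative only; a real solver would do this properly).
    """
    n = len(losses)
    if t_grid is None:
        t_grid = [0.05 * k for k in range(1, 400)]
    best = max(losses)  # the t -> 0 limit of the dual is the max loss
    for t in t_grid:
        val = t * math.log(sum(math.exp(l / t) for l in losses) / n) + t * rho
        best = min(best, val)
    return best
```

At ρ = 0 the worst case collapses to (approximately) the empirical mean, and as ρ grows it moves toward the maximum loss, which is what makes the routing policy conservative under distribution shift.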
Entities
Institutions
- arXiv