BalanceRAG: Joint Risk Calibration for Cascaded Retrieval-Augmented Generation

other · 2026-05-20

arXiv:2605.20084v1 introduces BalanceRAG, a method for calibrating cascaded retrieval-augmented generation (RAG) systems. In cascaded RAG, each query is first handled by an LLM-only branch; if that branch is uncertain, the query escalates to a RAG fallback, and the system abstains if neither branch is trustworthy. BalanceRAG jointly calibrates threshold pairs for the LLM-only and RAG branches using sequential graphical testing on a two-dimensional lattice, which enables risk-adaptive threshold calibration that controls system-level error rates. By considering the uncertainty of both branches jointly, the approach addresses the conservatism of stage-by-stage calibration.
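The routing rule described above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the paper: the `route` function, the `Decision` type, the convention that each branch returns an (answer, confidence) pair, and the rule that a branch is trusted when its confidence meets its threshold.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical branch signature: query -> (answer, confidence in [0, 1]).
Branch = Callable[[str], Tuple[str, float]]


@dataclass
class Decision:
    branch: str            # "llm", "rag", or "abstain"
    answer: Optional[str]  # None when the system abstains


def route(query: str, llm_branch: Branch, rag_branch: Branch,
          tau_llm: float, tau_rag: float) -> Decision:
    """Try the LLM-only branch, fall back to RAG, else abstain."""
    answer, conf = llm_branch(query)
    if conf >= tau_llm:
        return Decision("llm", answer)
    # The RAG branch is only invoked when the LLM-only branch is uncertain.
    answer, conf = rag_branch(query)
    if conf >= tau_rag:
        return Decision("rag", answer)
    return Decision("abstain", None)
```

The pair (tau_llm, tau_rag) is the operating point that a calibration procedure would choose; raising either threshold makes the system more conservative at the cost of more escalations or abstentions.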

Key facts

BalanceRAG is a method for joint risk calibration in cascaded RAG systems.
Cascaded RAG uses an LLM-only branch first, then a RAG fallback if uncertain.
BalanceRAG frames threshold pairs as operating points on a two-dimensional lattice.
It uses sequential graphical testing to identify safe operating points.
The method controls the system-level error rate across both branches.
It addresses conservatism of stage-by-stage calibration.
The paper is on arXiv with ID 2605.20084.
The approach enables risk-adaptive threshold calibration.
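To make the idea of testing operating points on a two-dimensional lattice concrete, here is a simplified sketch. It is not the paper's sequential graphical procedure: it swaps in a simpler, well-known scheme (a Bonferroni split of the error budget across lattice rows, then fixed-sequence testing within each row, with Hoeffding p-values). The loss definition (1 when the cascade answers incorrectly, 0 otherwise, including abstentions), the function names, and the data layout are all assumptions for illustration.

```python
import math


def cascade_loss(llm_conf, llm_ok, rag_conf, rag_ok, t_llm, t_rag):
    """Assumed loss: 1 if the cascade answers incorrectly, else 0."""
    if llm_conf >= t_llm:
        return 0 if llm_ok else 1
    if rag_conf >= t_rag:
        return 0 if rag_ok else 1
    return 0  # abstention is counted as no error in this sketch


def hoeffding_p_value(emp_risk, n, alpha):
    """p-value for H0: true risk >= alpha, with losses bounded in [0, 1]."""
    gap = max(alpha - emp_risk, 0.0)
    return math.exp(-2.0 * n * gap * gap)


def calibrate(calib, llm_grid, rag_grid, alpha, delta):
    """Return threshold pairs whose null hypothesis (risk >= alpha) is rejected.

    calib: list of (llm_conf, llm_ok, rag_conf, rag_ok) tuples, assumed i.i.d.
    The family-wise error rate is kept at most delta by splitting delta
    evenly across rows and testing each row in a fixed order.
    """
    n = len(calib)
    row_budget = delta / len(llm_grid)  # Bonferroni split across rows
    safe = []
    for t_llm in llm_grid:
        # Walk from the most conservative RAG threshold to the least.
        for t_rag in sorted(rag_grid, reverse=True):
            risk = sum(cascade_loss(*ex, t_llm, t_rag) for ex in calib) / n
            if hoeffding_p_value(risk, n, alpha) > row_budget:
                break  # fixed-sequence testing stops at the first failure
            safe.append((t_llm, t_rag))
    return safe
```

A stage-by-stage approach would instead fix one threshold before looking at the other; testing pairs directly lets a looser threshold on one branch be offset by a stricter one on the other. A full graphical procedure would additionally pass unused error budget between neighboring lattice nodes, which this sketch omits.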
