CAPS: Efficient Parallel Reasoning via Adaptive Pairwise Selection
CAPS (Cascaded Adaptive Pairwise Selection) is a framework that aims to reduce the computational cost of pairwise self-verification in large language models. Conventional approaches run many full-length pairwise comparisons regardless of how informative each one is. CAPS instead allocates verifier compute non-uniformly along two axes: an evidence axis, which determines how much of each candidate the judge sees, and a distribution axis, which determines how comparisons are spread across the candidates. The framework consists of a four-stage cascade with an optional rescue subroutine, and it admits a closed-form verifier-token cost per candidate. The paper is available on arXiv under the identifier 2605.15513.
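To make the two axes concrete, the following is a minimal sketch of how verifier-token cost might be counted. It is not the closed-form cost from the paper, and it does not model the four-stage cascade or the rescue subroutine. The function names and the parameters `evidence_fraction` and `comparisons_per_candidate` are hypothetical, introduced only for illustration, and the sketch assumes every candidate has the same length.

```python
def uniform_pairwise_cost(num_candidates: int, tokens_per_candidate: int) -> int:
    """Verifier tokens read when every pair is compared at full length."""
    num_pairs = num_candidates * (num_candidates - 1) // 2
    # Each comparison shows the judge two full candidates.
    return num_pairs * 2 * tokens_per_candidate


def budgeted_pairwise_cost(
    num_candidates: int,
    tokens_per_candidate: int,
    evidence_fraction: float,        # hypothetical: share of each candidate shown to the judge
    comparisons_per_candidate: int,  # hypothetical: comparisons each candidate takes part in
) -> int:
    """Illustrative cost when both axes are restricted (an assumption, not CAPS itself)."""
    visible_tokens = int(tokens_per_candidate * evidence_fraction)
    # Each comparison involves two candidates, so halve the total participations.
    num_pairs = num_candidates * comparisons_per_candidate // 2
    return num_pairs * 2 * visible_tokens


if __name__ == "__main__":
    full = uniform_pairwise_cost(16, 2000)
    reduced = budgeted_pairwise_cost(16, 2000, evidence_fraction=0.25,
                                     comparisons_per_candidate=4)
    print(full, reduced, full / reduced)
```

With 16 candidates of 2000 tokens each, the all-pairs baseline reads 480000 verifier tokens, while showing the judge a quarter of each candidate and limiting each candidate to four comparisons reads 32000. The numbers are made up and only show how restricting both axes compounds.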
Key facts
- CAPS stands for Cascaded Adaptive Pairwise Selection.
- It is an inference-only framework for parallel reasoning in LLMs.
- It addresses the high cost of pairwise self-verification.
- Compute allocation is non-uniform along evidence and distribution axes.
- The framework uses a four-stage cascade with an optional rescue subroutine.
- It admits a closed-form verifier-token cost per candidate.
- The paper is on arXiv: 2605.15513.
- The method aims to improve the efficiency of test-time scaling.
Entities
Institutions
- arXiv