Committee of Weak AI Models Can Match Stronger Ones
A new arXiv paper (2605.14163) argues that a committee of weak reasoning language models can match the performance of much stronger models through verifier-backed committee search, a form of inference-time boosting. Repeated sampling amplifies coverage, the chance that at least one sampled candidate is correct. Critics and comparators, however, only help when they have a local soundness signal such as execution or proof checking. Rank-based bounds show when local selection errors compose into reliable trajectories. The proposer side sets a ceiling: even oracle best-of-k converges only to the mass of task solutions, so additional sampling cannot recover solutions the proposers never generate.
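The search loop described above can be illustrated with a toy sketch. This is not the paper's algorithm: the task (find an integer whose square equals a target), the names `verifier`, `make_weak_proposer`, and `committee_search`, and the use of seeded random guessers as stand-ins for weak models are all assumptions made for illustration. The only point it shows is that candidates are accepted solely on an executable check, never on a proposer's own confidence.

```python
import random
from typing import Callable, List, Optional

# Hypothetical toy task: find an integer x in [0, 20] with x * x == target.

def verifier(candidate: int, target: int) -> bool:
    """Local soundness signal: accept only if executing the check succeeds."""
    return candidate * candidate == target

def make_weak_proposer(seed: int) -> Callable[[int, int], int]:
    """Build a weak proposer: a seeded random guesser standing in for a weak model."""
    def propose(target: int, attempt: int) -> int:
        # A fresh generator per (seed, attempt) keeps every call reproducible.
        rng = random.Random(seed * 1_000_003 + attempt)
        return rng.randint(0, 20)
    return propose

def committee_search(
    proposers: List[Callable[[int, int], int]],
    target: int,
    samples_per_member: int,
) -> Optional[int]:
    """Sample repeatedly from every committee member and return the first
    candidate the verifier accepts, or None if the budget runs out."""
    for attempt in range(samples_per_member):
        for propose in proposers:
            candidate = propose(target, attempt)
            if verifier(candidate, target):
                return candidate
    return None

if __name__ == "__main__":
    committee = [make_weak_proposer(seed) for seed in (1, 2, 3)]
    print(committee_search(committee, target=49, samples_per_member=200))
    print(committee_search(committee, target=50, samples_per_member=200))
```

Because acceptance depends only on the verifier, a wrong answer is never returned; a task with no solution inside the proposers' range (such as `target=50` here) simply yields `None`, whatever the sampling budget.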
Key facts
- arXiv paper 2605.14163
- Committee of weak reasoning models can match stronger models
- Verifier-backed committee search as inference-time boosting
- Coverage amplified by repeated sampling
- Critics and comparators need a local soundness signal
- Local soundness signals include execution, proof checking, type checking, tests, constraint solving
- Rank-based bounds for reliable trajectories
- Oracle best-of-k converges only to the mass of task solutions
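The coverage and ceiling points can be made concrete with a small calculation. The sketch below assumes independent samples with a fixed per-task success probability, and it reads "mass of task solutions" as the share of tasks on which the proposer has any nonzero chance of producing a correct answer. Both are assumptions for illustration, not definitions taken from the paper, and the function names and probabilities are hypothetical.

```python
from typing import Sequence

def coverage(p: float, k: int) -> float:
    """Probability that at least one of k independent samples is correct,
    given per-sample success probability p."""
    return 1.0 - (1.0 - p) ** k

def oracle_best_of_k(per_task_p: Sequence[float], k: int) -> float:
    """Average success rate over tasks when a perfect (oracle) verifier
    picks a correct sample whenever one exists among the k drawn."""
    return sum(coverage(p, k) for p in per_task_p) / len(per_task_p)

if __name__ == "__main__":
    # Hypothetical per-task success probabilities for one weak proposer.
    # The last two tasks are outside its reach entirely (p = 0).
    per_task_p = [0.5, 0.1, 0.0, 0.0]
    for k in (1, 10, 100, 10_000):
        print(k, oracle_best_of_k(per_task_p, k))
```

Under these assumptions the average rises with `k` but levels off at 0.5, the fraction of tasks with nonzero success probability: repeated sampling amplifies coverage where the proposer has some chance, and does nothing where it has none.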
Entities
Institutions
- arXiv