BET: A Budget-Efficient Thinking Framework for Adaptive Reasoning in LRMs
Researchers propose Budget-Efficient Thinking (BET), a two-stage framework for optimizing test-time compute in large reasoning models (LRMs). BET addresses the misallocation of compute budgets by accounting for whether a query is solvable, not just how difficult it appears. It combines a behavioral cold-start stage with Group Relative Policy Optimization (GRPO) under an investment-cost-aware reward, so that the model learns three behaviors: short solve, long solve, and fold. The aim is to reduce cost while maintaining accuracy on solvable queries.
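To make the idea of an investment-cost-aware reward concrete, here is a minimal Python sketch. It is not the paper's formulation: the summary gives no reward formula, so the linear token cost, the `payoff`, `cost_per_token`, and `fold_reward` parameters, the `Rollout` type, and the reading of "fold" as giving up early on a query are all assumptions made for illustration. The second function shows the group-relative advantage normalization commonly used in GRPO, where each sampled response is scored against the mean and standard deviation of its own group.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Rollout:
    """One sampled response to a query (hypothetical structure)."""
    behavior: str   # "short_solve", "long_solve", or "fold"
    tokens: int     # tokens generated, used as a proxy for compute spent
    correct: bool   # whether the final answer was right (ignored for fold)


def investment_reward(rollout, payoff=1.0, cost_per_token=1e-4, fold_reward=0.0):
    """Assumed reward: payoff for a correct answer minus a linear token cost.

    Folding earns `fold_reward` and still pays for the tokens it used, so
    a cheap fold beats an expensive failed attempt.
    """
    cost = cost_per_token * rollout.tokens
    if rollout.behavior == "fold":
        return fold_reward - cost
    return (payoff if rollout.correct else 0.0) - cost


def group_relative_advantages(rewards):
    """Normalize rewards within one group of samples for the same query."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    if sigma == 0:
        # All samples scored the same, so none is preferred over another.
        return [0.0] * len(rewards)
    return [(r - mu) / sigma for r in rewards]


if __name__ == "__main__":
    group = [
        Rollout("short_solve", tokens=200, correct=True),
        Rollout("long_solve", tokens=4000, correct=True),
        Rollout("long_solve", tokens=4000, correct=False),
        Rollout("fold", tokens=20, correct=False),
    ]
    rewards = [investment_reward(r) for r in group]
    for r, adv in zip(group, group_relative_advantages(rewards)):
        print(f"{r.behavior:12s} tokens={r.tokens:5d} advantage={adv:+.2f}")
```

Under these assumed parameters, a cheap correct answer scores highest, an expensive correct answer scores lower, and folding early scores above a long failed attempt. That ordering is what would push a policy toward solvability-aware budgeting.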
Key facts
- Large reasoning models (LRMs) often misallocate test-time compute.
- Existing efficiency methods overlook solvability.
- BET formulates adaptive reasoning as computational investment under uncertainty.
- BET uses a two-stage framework: behavioral cold-start and GRPO.
- The reward is investment-cost-aware.
- BET learns three behaviors: short solve, long solve, and fold.
- The goal is to reduce cost without sacrificing accuracy on solvable queries.
- The paper is arXiv:2605.11625v1.
Entities
Institutions
- arXiv