ReaLM-Retrieve: Adaptive Retrieval for Large Reasoning Models
ReaLM-Retrieve is a new framework that addresses the misalignment between retrieval-augmented generation (RAG) and large reasoning models such as DeepSeek-R1 and OpenAI o1. Conventional RAG systems supply context once, before reasoning begins, whereas reasoning models benefit from evidence injected during multi-step inference. ReaLM-Retrieve combines a step-level uncertainty detector, a learned retrieval intervention policy, and an efficiency-optimized integration mechanism that reduces per-retrieval overhead by 3.2x. Experiments on MuSiQue, HotpotQA, and 2WikiMultiHopQA demonstrate its effectiveness.
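The summary does not specify how the step-level uncertainty detector works. A minimal sketch, assuming a token-entropy signal (a common choice for uncertainty detection, not necessarily the paper's): compute the mean entropy of the model's token distributions within a reasoning step and trigger retrieval when it crosses a threshold. The function names and the threshold value are illustrative assumptions.

```python
import math

def step_entropy(token_dists):
    """Mean token-level entropy (in nats) over one reasoning step.

    token_dists: list of per-token probability distributions,
    each a dict mapping token -> probability.
    """
    entropies = [
        -sum(p * math.log(p) for p in dist.values() if p > 0)
        for dist in token_dists
    ]
    return sum(entropies) / len(entropies)

def should_retrieve(token_dists, threshold=0.8):
    """Uncertainty gate: fire retrieval when mean entropy exceeds threshold.

    The 0.8-nat threshold is a hypothetical default, not from the paper.
    """
    return step_entropy(token_dists) > threshold

# A peaked distribution (model is confident) stays below the gate;
# a near-uniform one (model is guessing) crosses it.
confident = [{"Paris": 0.95, "Lyon": 0.05}]
uncertain = [{"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}]
```

Under this sketch, `should_retrieve(confident)` is false (entropy ≈ 0.20 nats) while `should_retrieve(uncertain)` is true (entropy = ln 4 ≈ 1.39 nats), so retrieval fires only on the uncertain step.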
Key facts
- ReaLM-Retrieve is a reasoning-aware retrieval framework.
- It addresses the mismatch between RAG and large reasoning models.
- Large reasoning models like DeepSeek-R1 and OpenAI o1 generate extended chains of thought.
- Current RAG systems optimize for providing context before reasoning begins.
- ReaLM-Retrieve uses a step-level uncertainty detector.
- It includes a retrieval intervention policy that learns when external evidence benefits reasoning.
- The framework reduces per-retrieval overhead by 3.2x compared to naive integration.
- Experiments were conducted on MuSiQue, HotpotQA, and 2WikiMultiHopQA.
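The facts above describe retrieval interleaved with multi-step inference rather than front-loaded context. A minimal sketch of such a loop, under stated assumptions: `reason_step`, `retrieve`, and the per-question query cache are hypothetical names, and the cache is merely one plausible way to cut repeated retrieval cost, not the paper's stated 3.2x mechanism.

```python
def adaptive_rag_loop(question, reason_step, retrieve, gate, max_steps=8):
    """Interleave reasoning with on-demand retrieval.

    reason_step(question, evidence, trace) -> (step_text, query, done)
    gate(step_text) -> bool   # the uncertainty gate deciding intervention
    A per-question query cache (an illustrative assumption) avoids
    re-hitting the external index for repeated queries.
    """
    evidence, trace, cache = [], [], {}
    for _ in range(max_steps):
        step, query, done = reason_step(question, evidence, trace)
        trace.append(step)
        if done:
            break
        if query is not None and gate(step):
            if query not in cache:          # retrieve at most once per query
                cache[query] = retrieve(query)
            evidence.append(cache[query])   # inject evidence mid-reasoning
    return trace, evidence

# Toy stubs: the model is unsure until evidence arrives, then answers.
def demo_reason_step(question, evidence, trace):
    if not evidence:
        return ("I am unsure who directed the film.", "film director", False)
    return (f"Answer using {evidence[0]}.", None, True)

def demo_retrieve(query):
    return f"doc about {query}"

def demo_gate(step):
    return "unsure" in step

trace, evidence = adaptive_rag_loop(
    "Who directed it?", demo_reason_step, demo_retrieve, demo_gate
)
```

Here retrieval fires exactly once, after the uncertain first step, yielding a two-step trace with one piece of injected evidence; a confident model would complete the loop without ever touching the retriever.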
Entities
Institutions
- arXiv (publication venue)