Co-ReAct: Rubric-Guided Action Selection for ReAct Agents
Co-ReAct is a framework that uses rubrics as step-level guidance at inference time to improve ReAct-style agents on search-intensive, multi-step reasoning tasks. Prior work used rubrics as training-time rewards or post-hoc evaluators. Co-ReAct instead injects a rubric into the agent's context at each decision step to guide the next Reason-or-Act decision; the rubric specifies targets for evidence seeking, search, reasoning, or self-evaluation. The approach targets the shallow, redundant, or poorly targeted trajectories that existing agents produce. The paper is available on arXiv with ID 2605.23590.
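The control flow described above can be illustrated with a minimal sketch: a ReAct-style loop that appends a fresh rubric to the context before every decision. This is not the paper's implementation. The rubric format, the rule that picks a target, the stopping condition, and every name below (`make_rubric`, `build_context`, `run_agent`, `stub_policy`) are assumptions made for illustration, and the policy is a stub standing in for a language-model call.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    kind: str     # "reason" or "act"
    content: str


@dataclass
class Trajectory:
    question: str
    steps: List[Step] = field(default_factory=list)


def make_rubric(traj: Trajectory) -> str:
    """Toy rubric generator (assumption): choose a target from the
    trajectory so far. The four possible targets in the summary are
    evidence seeking, search, reasoning, and self-evaluation; this
    sketch only ever emits three of them."""
    n_actions = sum(1 for s in traj.steps if s.kind == "act")
    if n_actions == 0:
        target = "search"
    elif n_actions >= 2:
        target = "self_evaluation"
    else:
        target = "evidence_seeking"
    return f"[RUBRIC] target={target}: the next step should address this target."


def build_context(traj: Trajectory, rubric: str) -> str:
    """Serialize the history and place the current rubric last."""
    lines = [f"Question: {traj.question}"]
    lines += [f"{s.kind.upper()}: {s.content}" for s in traj.steps]
    lines.append(rubric)
    return "\n".join(lines)


def run_agent(question: str,
              policy: Callable[[str], Step],
              max_steps: int = 4) -> Trajectory:
    """ReAct-style loop with a rubric injected at every decision step."""
    traj = Trajectory(question)
    for _ in range(max_steps):
        rubric = make_rubric(traj)              # regenerated each step
        context = build_context(traj, rubric)   # rubric enters the context
        step = policy(context)                  # next Reason-or-Act decision
        traj.steps.append(step)
        if step.kind == "reason" and step.content.startswith("FINAL"):
            break
    return traj


def stub_policy(context: str) -> Step:
    """Stand-in for a model call: reads only the rubric on the last line."""
    rubric_line = context.splitlines()[-1]
    if "target=search" in rubric_line or "target=evidence_seeking" in rubric_line:
        return Step("act", "search('placeholder query')")
    return Step("reason", "FINAL: placeholder answer")


if __name__ == "__main__":
    for s in run_agent("example question", stub_policy).steps:
        print(s.kind, "|", s.content)
```

In this sketch the rubric is recomputed from the trajectory before each call, so the guidance changes as evidence accumulates, which is the property that separates step-level guidance from a single static instruction in the system prompt. In a real agent, `stub_policy` would be replaced by a model call and `make_rubric` by whatever rubric source the system uses.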
Key facts
- Co-ReAct is a rubric-guided action-selection framework for ReAct agents.
- Rubrics are used as step-level guidance during inference.
- Prior work used rubrics as training-time rewards or post-hoc evaluators.
- Co-ReAct injects a rubric into the agent's context at each decision step.
- The rubric guides the next Reason-or-Act decision.
- It specifies targets for evidence seeking, search, reasoning, or self-evaluation.
- The approach addresses shallow, redundant, or poorly targeted trajectories.
- The paper is on arXiv with ID 2605.23590.
Entities
Institutions
- arXiv