Co-ReAct: Rubric-Guided Action Selection for ReAct Agents

other · 2026-05-25

Co-ReAct is a framework that uses rubrics as step-level guidance during inference to improve ReAct-style agents on search-intensive, multi-step reasoning tasks. Prior work used rubrics as training-time rewards or post-hoc evaluators; Co-ReAct instead injects a rubric into the agent's context at each decision step. The rubric guides the next Reason-or-Act decision by specifying a target for evidence seeking, search, reasoning, or self-evaluation. The aim is to counter the shallow, redundant, or poorly targeted trajectories seen in existing agents. The paper is available on arXiv with ID 2605.23590.
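The step-level injection described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: this summary does not say how Co-ReAct constructs its rubrics, so the `Rubric` type, the rule-based `make_rubric` generator, the prompt layout, and the stub policy and search functions are all hypothetical stand-ins. The only point is the control flow, in which a fresh rubric is placed in the context before every Reason-or-Act decision.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Rubric:
    """Hypothetical step-level rubric: the target for the next step."""
    target: str       # "evidence", "search", "reasoning", or "self-evaluation"
    instruction: str


@dataclass
class Step:
    kind: str         # "reason" or "act"
    content: str
    observation: str = ""


def make_rubric(question: str, history: List[Step]) -> Rubric:
    """Toy rule-based rubric generator (an assumption, not the paper's method)."""
    if not any(s.kind == "act" for s in history):
        return Rubric("search", "Issue a query aimed at the key entity in the question.")
    if history[-1].kind == "act":
        return Rubric("reasoning", "Relate the latest observation to the question.")
    return Rubric("self-evaluation", "Check whether the evidence suffices to answer.")


def build_prompt(question: str, history: List[Step], rubric: Rubric) -> str:
    """Assemble the context, with the current rubric placed before the decision."""
    lines = [f"Question: {question}"]
    for s in history:
        lines.append(f"{'Act' if s.kind == 'act' else 'Reason'}: {s.content}")
        if s.observation:
            lines.append(f"Observation: {s.observation}")
    lines.append(f"Rubric [{rubric.target}]: {rubric.instruction}")
    lines.append("Next step (Reason or Act):")
    return "\n".join(lines)


def run_agent(
    question: str,
    policy: Callable[[str], Tuple[str, str]],   # returns (kind, content)
    search: Callable[[str], str],
    max_steps: int = 4,
) -> Tuple[Optional[str], List[Step]]:
    """ReAct-style loop that regenerates and injects a rubric at every step."""
    history: List[Step] = []
    for _ in range(max_steps):
        rubric = make_rubric(question, history)
        kind, content = policy(build_prompt(question, history, rubric))
        if kind == "finish":
            return content, history
        observation = search(content) if kind == "act" else ""
        history.append(Step(kind, content, observation))
    return None, history  # step budget exhausted without an answer


def stub_policy(prompt: str) -> Tuple[str, str]:
    """Stand-in for a language model; it simply follows the injected rubric."""
    if "Rubric [search]" in prompt:
        return "act", "capital of France"
    if "Rubric [reasoning]" in prompt:
        return "reason", "The observation names Paris as the capital."
    return "finish", "Paris"


def stub_search(query: str) -> str:
    """Stand-in for a search tool."""
    return "Paris is the capital of France."


answer, trace = run_agent("What is the capital of France?", stub_policy, stub_search)
```

In a real agent the stub policy would be replaced by a model call and the rubric generator by whatever procedure the paper defines; the sketch only shows where the rubric enters the loop.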

Key facts

Co-ReAct is a rubric-guided action-selection framework for ReAct agents.
Rubrics are used as step-level guidance during inference.
Prior work used rubrics as training-time rewards or post-hoc evaluators.
Co-ReAct injects a rubric into the agent's context at each decision step.
The rubric guides the next Reason-or-Act decision.
The rubric specifies targets for evidence seeking, search, reasoning, or self-evaluation.
The approach addresses shallow, redundant, or poorly targeted trajectories.
The paper is on arXiv with ID 2605.23590.
