ARTFEED — Contemporary Art Intelligence

SCRL: Curriculum RL Enables Credit Assignment for LLM Reasoning

other · 2026-05-23

Researchers introduce SCRL (Subproblem Curriculum Reinforcement Learning), a framework that improves LLM reasoning by breaking hard problems into verifiable subproblems. Standard outcome-based RLVR (reinforcement learning with verifiable rewards) scores only the final answer, so it learns slowly when correct rollouts are rare and cannot use partial progress. SCRL instead derives subproblems from reference reasoning chains and normalizes rewards at the subproblem level, which assigns finer-grained credit without external rubrics. Partial progress becomes a learning signal, lifting hard problems out of gradient dead zones.
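
The "gradient dead zone" can be made concrete with a short worked equation. This is an illustrative sketch only: it assumes a GRPO-style group-normalized advantage, which this summary does not name, and the symbols G (rollouts per problem), r_i (binary outcome reward), and epsilon (a small stabilizing constant) are introduced here for illustration.

```latex
% Assumed group-normalized advantage over G rollouts of one problem q
\[
A_i = \frac{r_i - \mu}{\sigma + \epsilon},
\qquad
\mu = \frac{1}{G}\sum_{j=1}^{G} r_j,
\qquad
\sigma = \sqrt{\frac{1}{G}\sum_{j=1}^{G} (r_j - \mu)^2}
\]

% Hard problem: every rollout fails, so r_1 = \dots = r_G = 0
\[
\mu = 0,\quad \sigma = 0
\;\;\Longrightarrow\;\;
A_i = \frac{0 - 0}{0 + \epsilon} = 0 \quad \text{for all } i
\]

% The policy-gradient contribution of this problem then vanishes
\[
\sum_{i=1}^{G} A_i \,\nabla_\theta \log \pi_\theta(o_i \mid q) = 0
\]
```

Under that assumption, a problem the model never solves contributes nothing to the update, however close some rollouts came to a correct answer.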

Key facts

  • SCRL stands for Subproblem Curriculum Reinforcement Learning.
  • Addresses the inefficiency of outcome-based RLVR on hard problems.
  • Derives verifiable subproblems from reference reasoning chains.
  • The final subproblem in each curriculum is always the original problem itself.
  • Uses subproblem-level normalization for finer-grained credit assignment.
  • No external rubrics or reward models are needed.
  • Analysis shows subproblem curricula lift hard problems out of gradient dead zones.
  • Published on arXiv with ID 2605.22074.
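
The curriculum and normalization points above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the helper names `build_curriculum` and `subproblem_advantages`, the binary rewards, and the mean/standard-deviation normalization are all hypothetical choices made here, since the summary gives no formulas.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_curriculum(subproblems, original):
    """Hypothetical helper: intermediate subproblems first, original problem last."""
    return list(subproblems) + [original]


def subproblem_advantages(rollouts, eps=1e-8):
    """Normalize rewards within each subproblem (assumed scheme, not from the paper).

    rollouts: list of (subproblem_index, reward) pairs.
    Returns one advantage per rollout, in the same order.
    """
    groups = defaultdict(list)
    for idx, reward in rollouts:
        groups[idx].append(reward)
    # Per-subproblem mean and population standard deviation
    stats = {idx: (mean(rs), pstdev(rs)) for idx, rs in groups.items()}
    return [(r - stats[idx][0]) / (stats[idx][1] + eps) for idx, r in rollouts]


# Made-up example: two intermediate subproblems plus the original problem.
curriculum = build_curriculum(["subproblem 1", "subproblem 2"], "original problem")

# Four rollouts per stage with binary verifier rewards (illustrative values only).
rewards_by_stage = {
    0: [1, 1, 0, 1],  # easiest subproblem: mostly solved
    1: [0, 1, 0, 0],  # harder subproblem: occasionally solved
    2: [0, 0, 0, 0],  # original problem: never solved
}
rollouts = [(idx, r) for idx, rs in rewards_by_stage.items() for r in rs]
adv = subproblem_advantages(rollouts)

for idx, name in enumerate(curriculum):
    stage_adv = [round(a, 3) for (i, _), a in zip(rollouts, adv) if i == idx]
    print(name, stage_adv)
```

In this toy batch the original problem alone yields all-zero advantages, while the two easier subproblems still produce nonzero ones, so the batch carries a learning signal. How SCRL schedules the subproblems or transfers that signal to the original problem is not described in this summary.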

Entities

Institutions

  • arXiv
