ARTFEED — Contemporary Art Intelligence

Requirement-Aware Curriculum RL for LLM Code Generation

ai-technology · 2026-05-04

Requirement-Aware Curriculum Reinforcement Learning (RACRL) is a newly proposed method for improving code generation by large language models (LLMs). Existing curriculum RL (CRL) methods suffer from misaligned perception of requirement difficulty, the absence of requirement-difficulty optimization, and suboptimal curriculum sampling strategies. RACRL addresses these issues by incorporating requirement difficulty directly into training: a requirement-aware difficulty estimator scores each programming requirement, and a curriculum scheduler adjusts sampling according to that estimated complexity. In experiments on the HumanEval and MBPP benchmarks, RACRL outperforms baseline CRL methods and fine-tuned LLMs, achieving higher pass rates. The work, published on arXiv (2605.00433), targets the challenge of increasingly complex programming requirements.
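The paper does not publish its algorithm here, but the two components it names, a requirement difficulty estimator and a curriculum scheduler, can be illustrated with a minimal sketch. Everything below is hypothetical: `estimate_difficulty` is a toy proxy (token count plus constraint keywords), and the scheduler simply widens an easy-to-hard sampling window each epoch; the actual RACRL estimator and schedule are more sophisticated.

```python
import math
import random

def estimate_difficulty(requirement: str) -> float:
    """Toy stand-in for a requirement-aware difficulty estimator:
    longer, more constraint-laden requirements score higher.
    (Hypothetical heuristic, not the paper's estimator.)"""
    tokens = requirement.split()
    constraint_words = {"must", "should", "least", "unless", "except"}
    constraints = sum(1 for t in tokens if t.lower() in constraint_words)
    return len(tokens) + 5 * constraints

def curriculum_batches(requirements, epochs, batch_size, seed=0):
    """Yield one batch per epoch from an easy-to-hard window that
    widens linearly with training progress (a simple curriculum)."""
    rng = random.Random(seed)
    ordered = sorted(requirements, key=estimate_difficulty)
    for epoch in range(1, epochs + 1):
        # Eligible pool grows from the easiest items toward the full set.
        cutoff = max(batch_size, math.ceil(len(ordered) * epoch / epochs))
        pool = ordered[:cutoff]
        yield rng.sample(pool, min(batch_size, len(pool)))
```

In a real RL loop, each yielded batch would be used to sample rollouts and compute policy-gradient updates; the point of the curriculum is that early epochs draw only from low-difficulty requirements.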

Key facts

  • arXiv paper 2605.00433 proposes Requirement-Aware Curriculum Reinforcement Learning (RACRL) for LLM code generation.
  • Existing CRL methods suffer from misaligned perception of requirement difficulty, the absence of requirement-difficulty optimization, and suboptimal curriculum sampling.
  • RACRL uses a requirement-aware difficulty estimator and curriculum scheduler.
  • Experiments on HumanEval and MBPP show RACRL outperforms baseline CRL and fine-tuned LLMs.
  • The method addresses the challenge of increasingly complex programming requirements.
  • Code generation aims to automatically generate source code from programming requirements.
  • LLM-based code generation has attracted attention from academia and industry.
  • The paper is from arXiv, a preprint server.
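The pass rates reported on HumanEval and MBPP come from functional-correctness evaluation: a task counts as solved if the generated code passes its unit tests. A minimal sketch of that metric (pass@1 over one sample per task, run without the sandboxing a real harness would use) looks like this; the pairing of candidate code with a test string is an assumption for illustration.

```python
def pass_at_1(samples):
    """samples: list of (candidate_code, test_code) string pairs.
    A task passes if executing the candidate and then its tests
    raises no exception. Simplified: real harnesses run candidates
    in isolated, time-limited subprocesses for safety."""
    if not samples:
        return 0.0
    passed = 0
    for candidate, tests in samples:
        try:
            env = {}
            exec(candidate, env)  # define the candidate function(s)
            exec(tests, env)      # run the task's unit tests
            passed += 1
        except Exception:
            pass  # any failure (syntax, runtime, assertion) = unsolved
    return passed / len(samples)
```

pass@k generalizes this by sampling k candidates per task and counting the task as solved if any of them passes.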

Entities

Institutions

  • arXiv

Sources