ARTFEED — Contemporary Art Intelligence

BoostAPR: RL-Based Automated Program Repair with Dual Rewards

other · 2026-05-12

BoostAPR is a framework that improves automated program repair through execution-grounded reinforcement learning. It works in three stages: initial fine-tuning on execution-validated examples; training of two reward models, a sequence-level assessor and a line-level credit allocator; and Proximal Policy Optimization (PPO), in which the line-level credit concentrates the reward on the lines that matter most. Trained on SWE-Gym and evaluated on four benchmarks, BoostAPR achieves 40.7% on SWE-bench Verified (an improvement of 22.9 percentage points), 24.8% on Defects4J (Python-to-Java transfer), 84.5% on HumanEval-Java, and 95.0% on QuixBugs, which indicates that the approach carries over across programming languages.
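To make the dual-reward idea concrete, here is a minimal Python sketch of one way a single sequence-level score could be spread over the tokens of a patch according to line-level credit. The summary does not give BoostAPR's actual formula, so the proportional split, the function name `line_weighted_token_rewards`, and all of its parameters are assumptions made for illustration only.

```python
from typing import List


def line_weighted_token_rewards(
    sequence_reward: float,
    line_credits: List[float],
    token_line_ids: List[int],
) -> List[float]:
    """Spread one sequence-level reward over tokens, weighted by line credit.

    Hypothetical scheme (not taken from the source): each token receives a
    share of the sequence reward proportional to the credit of the line it
    belongs to, so the shares add up to the sequence reward.
    """
    if not token_line_ids:
        return []
    if any(credit < 0 for credit in line_credits):
        raise ValueError("line credits must be non-negative")

    # Look up the credit of the line each token sits on.
    per_token_credit = [line_credits[line_id] for line_id in token_line_ids]
    total_credit = sum(per_token_credit)

    if total_credit == 0:
        # No line stands out: fall back to a uniform split.
        uniform = sequence_reward / len(per_token_credit)
        return [uniform] * len(per_token_credit)

    return [sequence_reward * credit / total_credit for credit in per_token_credit]


if __name__ == "__main__":
    # Assumed toy patch: three lines, with the middle line given most credit.
    rewards = line_weighted_token_rewards(
        sequence_reward=1.0,
        line_credits=[0.1, 0.8, 0.1],
        token_line_ids=[0, 0, 1, 1, 2],
    )
    print(rewards)
```

In a PPO setup, per-token values like these would typically feed into advantage estimation, so that updates concentrate on tokens from high-credit lines. How BoostAPR itself combines the two reward signals is not stated in this summary.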

Key facts

  • BoostAPR is a three-stage framework for automated program repair
  • Uses execution-grounded reinforcement learning with dual reward models
  • Includes sequence-level assessor and line-level credit allocator
  • Trained on SWE-Gym and evaluated on four benchmarks
  • Achieves 40.7% on SWE-bench Verified
  • Achieves 24.8% on Defects4J (Python-to-Java transfer)
  • Achieves 84.5% on HumanEval-Java
  • Achieves 95.0% on QuixBugs
