ARTFEED — Contemporary Art Intelligence

FormalRewardBench: First Benchmark for Reward Models in Formal Theorem Proving

other · 2026-05-12

Researchers have introduced FormalRewardBench, the first benchmark designed to evaluate reward models in formal theorem proving with Lean 4. The benchmark addresses the sparse credit assignment problem in neural theorem provers that rely on reinforcement learning with verifiable rewards (RLVR), where the binary correctness signal from the proof assistant provides no learning signal for partial progress. FormalRewardBench consists of 250 preference pairs, each pairing a correct proof with an incorrect variant generated through five expert-curated error injection strategies, including forced mistakes, minimal single-point variations, and verbose incorrect proofs. The benchmark enables direct comparison of learned reward models without expensive RL training ablations, facilitating progress in automated theorem proving.
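The evaluation the article describes reduces to a pairwise-accuracy check: for each preference pair, does the reward model score the correct proof above its error-injected variant? A minimal sketch of that metric, with illustrative field names and a toy scoring function (neither is from the paper):

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class PreferencePair:
    """One benchmark item: a correct proof and an incorrect variant
    of the same theorem statement (field names are illustrative)."""
    statement: str
    chosen: str    # proof that the Lean 4 checker accepts
    rejected: str  # error-injected variant

def pairwise_accuracy(pairs: List[PreferencePair],
                      score: Callable[[str, str], float]) -> float:
    """Fraction of pairs where the reward model scores the correct
    proof strictly higher than the incorrect variant."""
    hits = sum(1 for p in pairs
               if score(p.statement, p.chosen) > score(p.statement, p.rejected))
    return hits / len(pairs)

# Toy reward model for illustration only: prefers shorter proofs.
def toy_score(statement: str, proof: str) -> float:
    return -float(len(proof))

pairs = [
    PreferencePair("a + b = b + a", "by omega", "by omega  -- spurious step"),
]
print(pairwise_accuracy(pairs, toy_score))  # 1.0 on this toy pair
```

A learned reward model would replace `toy_score`; the point is that ranking quality can be measured on fixed pairs without running any RL loop.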

Key facts

  • FormalRewardBench is the first benchmark for reward models in formal theorem proving with Lean 4.
  • It addresses sparse credit assignment in RLVR-based neural theorem provers.
  • The benchmark contains 250 preference pairs of correct and incorrect proofs.
  • Incorrect variants are generated via five error injection strategies.
  • Strategies include forced mistakes, minimal single-point variations, and verbose incorrect proofs.
  • It allows evaluation of reward models without expensive RL training ablations.
  • The work is published on arXiv with ID 2605.10141.
  • The approach aims to improve learning from partial progress in theorem proving.
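To make the preference-pair idea concrete, here is a hypothetical example of what one pair might look like in Lean 4 (this specific theorem is not stated to be in the benchmark). The "minimal single-point variation" swaps one tactic so the proof no longer checks:

```lean
-- Chosen: a correct proof of commutativity of Nat addition.
theorem add_comm_ex (a b : Nat) : a + b = b + a := by
  omega

-- Rejected (would fail to check): a minimal single-point variation
-- replacing `omega` with `rfl`, which cannot close this goal because
-- `a + b` and `b + a` are not definitionally equal for variables:
-- theorem add_comm_bad (a b : Nat) : a + b = b + a := by rfl
```

A binary verifier gives both variants of a near-miss proof the same zero reward; a reward model can still rank the chosen proof above the rejected one, which is the signal the benchmark measures.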

Entities

Institutions

  • arXiv
