Segment-Level Learning Improves LLM Theorem Proving in Lean 4
A recent arXiv paper (2605.11905) introduces segment-level supervision for training LLMs to prove theorems in Lean 4. The technique extracts locally coherent proof segments from proof trajectories, a middle ground between predicting individual tactics and generating whole proofs. Policy models trained this way on STP, LeanWorkbook, and NuminaMath-LEAN achieve higher proof success rates. The same method is reused at inference time to launch short rollouts for existing step-level models.
Key facts
- arXiv paper 2605.11905 proposes segment-level supervision for LLM-based theorem proving in Lean 4
- Approach extracts locally coherent proof segments from trajectories
- Middle ground between step-level tactic prediction and whole-proof generation
- Trained on STP, LeanWorkbook, and NuminaMath-LEAN datasets
- Policy models achieve higher proof success rates
- Method reused at inference time for short rollouts
- Revisits supervision granularity as a training set construction problem
- Announced on arXiv as a new submission (announce type: new)
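To make the idea of supervision granularity concrete, here is a minimal Python sketch of how a proof trajectory might be cut into segments and turned into training pairs. It is not the paper's algorithm: the summary does not say how "locally coherent" segments are identified, so the rule used here (close a segment when the number of open goals drops, or when a length cap is reached) is an assumption, and the names `Step`, `split_segments`, `to_examples`, and `max_len` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Step:
    state: str        # goal state before the tactic is applied
    tactic: str       # Lean 4 tactic applied in that state
    goals_after: int  # number of open goals after the tactic


def split_segments(traj: List[Step], max_len: int = 3) -> List[List[Step]]:
    """Cut a trajectory into segments.

    ASSUMED heuristic (not taken from the paper): a segment ends when
    the open-goal count drops or when it reaches max_len steps.
    """
    segments: List[List[Step]] = []
    current: List[Step] = []
    prev_goals = 1  # a proof starts with a single goal
    for step in traj:
        current.append(step)
        if step.goals_after < prev_goals or len(current) == max_len:
            segments.append(current)
            current = []
        prev_goals = step.goals_after
    if current:  # keep any trailing steps as a final segment
        segments.append(current)
    return segments


def to_examples(segments: List[List[Step]]) -> List[Tuple[str, str]]:
    """Build (prompt, target) pairs: the state at the start of a segment
    is the prompt, and the segment's tactics joined by newlines are the target."""
    return [(seg[0].state, "\n".join(s.tactic for s in seg)) for seg in segments]


# Toy trajectory for: a + 0 = a ∧ 0 + a = a
trajectory = [
    Step("⊢ a + 0 = a ∧ 0 + a = a", "constructor", goals_after=2),
    Step("⊢ a + 0 = a", "simp", goals_after=1),
    Step("⊢ 0 + a = a", "simp", goals_after=0),
]
examples = to_examples(split_segments(trajectory))
```

With this toy trajectory, step-level supervision would yield three single-tactic targets and whole-proof supervision one three-tactic target. The sketch produces two segment targets, which is the intermediate granularity the key facts describe.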
Entities
Platforms, tools, and datasets
- arXiv
- Lean 4
- STP
- LeanWorkbook
- NuminaMath-LEAN