ImProver 2: AI Framework for Automated Proof Optimization in Lean 4
ImProver 2 is a new neurosymbolic framework that automates proof optimization in Lean 4. It combines a data-efficient expert-iteration pipeline with a scaffold that exposes formal structure alongside lightweight informal abstractions. The researchers also introduce a suite of metrics that capture key structural properties of proofs. A 7B-parameter model trained with ImProver 2 outperforms much larger models in the same family and is competitive with mid-tier frontier models across several metrics. The work targets the main obstacles to scalable proof optimization in formal mathematics libraries: heterogeneous objectives, scarce data, and high training and inference costs. The paper is available on arXiv under ID 2605.22885.
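To make "proof optimization" concrete, the sketch below shows the same Lean 4 statement proved twice: once step by step and once as a compact term. It is a generic illustration and is not taken from the paper. The theorem names and the choice of proof length as the objective are assumptions made for this example only.

```lean
-- Hypothetical example, not from the ImProver 2 paper.
-- A verbose tactic proof: each component is named before it is used.
theorem and_swap_verbose (p q : Prop) (h : p ∧ q) : q ∧ p := by
  have hp : p := h.left
  have hq : q := h.right
  constructor
  · exact hq
  · exact hp

-- The same statement proved with a single anonymous-constructor term.
-- An optimizer that targets proof length could aim for a rewrite like this.
theorem and_swap_short (p q : Prop) (h : p ∧ q) : q ∧ p :=
  ⟨h.2, h.1⟩
```

Both proofs are accepted by Lean and establish the same statement. They differ only in structural properties such as length and the number of intermediate steps, which is the kind of property a proof-optimization metric would measure.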
Key facts
- ImProver 2 is a neurosymbolic framework for automated proof optimization in Lean 4.
- It combines a data-efficient expert-iteration pipeline with a scaffold that exposes formal structure and lightweight informal abstractions.
- A suite of metrics for structural proof properties was introduced.
- A 7B-parameter model trained with ImProver 2 outperforms much larger models in the same family.
- The model is competitive with mid-tier frontier models.
- The framework addresses heterogeneous objectives, scarce data, and high training and inference costs.
- The paper is available on arXiv under ID 2605.22885.
- The work aims to improve maintainability and training data quality for neural provers.
Entities
Institutions
- arXiv