AutoOR: AI Pipeline Trains LLMs to Autoformalize Operations Research Problems
AutoOR is a scalable synthetic data generation and reinforcement learning pipeline that trains large language models to convert natural-language descriptions into formal optimization problems. It produces verified training data from standard optimization forms and uses feedback from solver execution as the reward signal during reinforcement learning post-training. Optimization problems underpin decision-making in sectors such as manufacturing, logistics, and scheduling, but turning a complex problem description into a solver-ready formulation has traditionally required operations research expertise, which limits scalability. Applied to an 8B model, AutoOR achieves competitive or state-of-the-art results on six established OR benchmarks, rivaling much larger frontier models. It is notably strong on a non-linear problem class involving physical dynamics, where frontier models perform poorly. The method covers linear, mixed-integer, and non-linear optimization, and aims to automate a formalization step that typically demands expert skill.
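To make the idea of solver execution feedback as a reward more concrete, here is a minimal Python sketch. It is not AutoOR's actual reward. The function name `execution_reward`, the convention that the model's output defines a `solve()` function, the binary 0/1 scoring, and the brute-force enumeration standing in for a real solver are all assumptions made for illustration.

```python
import math

def execution_reward(candidate_code: str, reference_objective: float,
                     rel_tol: float = 1e-6) -> float:
    """Hypothetical reward: 1.0 if the candidate formulation executes and
    reproduces the reference optimum, otherwise 0.0."""
    namespace = {}
    try:
        # Run the model-written formulation. A real pipeline would sandbox this.
        exec(candidate_code, namespace)
        value = float(namespace["solve"]())
    except Exception:
        # Syntax errors, missing solve(), or runtime failures earn no reward.
        return 0.0
    if math.isclose(value, reference_objective, rel_tol=rel_tol, abs_tol=1e-9):
        return 1.0
    return 0.0

# Toy problem: maximize 3x + 2y subject to x + y <= 4, x <= 3, x and y
# non-negative integers. Enumeration stands in for a real solver call.
candidate_ok = """
def solve():
    best = None
    for x in range(5):
        for y in range(5):
            if x + y <= 4 and x <= 3:
                val = 3 * x + 2 * y
                if best is None or val > best:
                    best = val
    return best
"""

# Same formulation, but the constraint x <= 3 was dropped by mistake.
candidate_wrong = candidate_ok.replace(" and x <= 3", "")

# Code that does not even parse.
candidate_broken = "def solve(:\n    return 11"

reward_ok = execution_reward(candidate_ok, 11.0)
reward_wrong = execution_reward(candidate_wrong, 11.0)
reward_broken = execution_reward(candidate_broken, 11.0)
```

The correct formulation earns a reward of 1.0, while the formulation with a missing constraint and the unparsable one both earn 0.0. A binary signal is the simplest possible choice; a real system might also give partial credit, for example for code that runs but returns the wrong optimum.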
Key facts
- AutoOR trains LLMs to autoformalize optimization problems from natural language
- Uses a scalable synthetic data generation and reinforcement learning pipeline
- Generates verified training data from standard optimization forms
- Uses solver execution feedback as the reward signal for RL post-training
- Applied to an 8B model, achieves competitive or state-of-the-art results across six OR benchmarks
- Matches performance of significantly larger frontier models
- Particularly effective for non-linear problems involving physical dynamics
- Covers linear, mixed-integer, and non-linear optimization categories
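The point about generating verified training data from standard optimization forms can be illustrated with a small Python sketch. The source does not describe the procedure, so everything here is an assumption: the helper names (`sample_instance`, `brute_force_optimum`, `describe`, `make_example`), the tiny integer program sampled in the form max c·x subject to Ax ≤ b, the text template, and brute-force enumeration standing in for a real solver as the verification step.

```python
import itertools
import random

def sample_instance(rng, n_vars=2, n_cons=2, upper=5):
    """Sample a small integer program: max c.x s.t. A x <= b, 0 <= x <= upper."""
    c = [rng.randint(1, 9) for _ in range(n_vars)]
    A = [[rng.randint(1, 4) for _ in range(n_vars)] for _ in range(n_cons)]
    b = [rng.randint(5, 20) for _ in range(n_cons)]
    return {"c": c, "A": A, "b": b, "upper": upper}

def brute_force_optimum(inst):
    """Enumerate all integer points to obtain a verified optimum."""
    best_val, best_x = None, None
    for x in itertools.product(range(inst["upper"] + 1), repeat=len(inst["c"])):
        feasible = all(
            sum(a * xi for a, xi in zip(row, x)) <= rhs
            for row, rhs in zip(inst["A"], inst["b"])
        )
        if feasible:
            val = sum(ci * xi for ci, xi in zip(inst["c"], x))
            if best_val is None or val > best_val:
                best_val, best_x = val, list(x)
    return best_val, best_x

def describe(inst):
    """Render the instance as a plain-language word problem (toy template)."""
    lines = []
    for j, cj in enumerate(inst["c"]):
        lines.append(f"Product {j + 1} earns a profit of {cj} per unit.")
    for i, (row, rhs) in enumerate(zip(inst["A"], inst["b"])):
        usage = ", ".join(f"{a} per unit of product {j + 1}" for j, a in enumerate(row))
        lines.append(f"Resource {i + 1} is consumed at {usage}; only {rhs} is available.")
    lines.append(f"At most {inst['upper']} whole units of each product can be made.")
    lines.append("How should production be planned to maximize profit?")
    return " ".join(lines)

def make_example(seed):
    """Build one (prompt, formal instance, verified objective) training triple."""
    rng = random.Random(seed)
    inst = sample_instance(rng)
    objective, solution = brute_force_optimum(inst)
    return {"prompt": describe(inst), "instance": inst,
            "objective": objective, "solution": solution}

example = make_example(0)
```

Because the formal instance is sampled first and the description is derived from it, the ground-truth optimum is known by construction. That is the property a reward based on solver execution can then check against.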