MM-OptBench: New Benchmark for Multimodal Optimization Modeling

ai-technology · 2026-05-13

Researchers have introduced MM-OptBench, a solver-grounded benchmark for multimodal optimization modeling: the task of translating a real-world decision problem into a mathematical model and executable solver code. Language models are increasingly used to generate optimization formulations and solver code, but existing benchmarks are almost entirely text-only and overlook visual artifacts such as tables, graphs, maps, schedules, and dashboards. MM-OptBench requires a model to produce both a mathematical formulation and executable solver code from a combined text-and-visual problem description. Its framework generates structured optimization instances, verifies each with an exact solver, and builds model-facing inputs alongside hidden reference files. The benchmark targets tasks that arise in operational practice, where visual data must be used alongside text.
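
The pipeline described above (generate a structured instance, verify it with an exact solver, then split it into model-facing inputs and a hidden reference) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not MM-OptBench's actual code or schema: the knapsack problem type, the function names, the dictionary fields, and the use of brute-force enumeration as a stand-in for a real exact solver. A plain-text table stands in for the visual artifact.

```python
import itertools
import json
import random


def generate_instance(seed, n_items=5):
    """Create a small structured 0/1 knapsack instance (hypothetical schema)."""
    rng = random.Random(seed)
    items = [
        {"name": f"item_{i}", "weight": rng.randint(1, 9), "value": rng.randint(1, 20)}
        for i in range(n_items)
    ]
    capacity = sum(it["weight"] for it in items) // 2
    return {"items": items, "capacity": capacity}


def solve_exactly(instance):
    """Enumerate every subset: exact for tiny instances, a stand-in for a real solver."""
    items, capacity = instance["items"], instance["capacity"]
    best_value, best_choice = 0, (0,) * len(items)
    for choice in itertools.product((0, 1), repeat=len(items)):
        weight = sum(c * it["weight"] for c, it in zip(choice, items))
        value = sum(c * it["value"] for c, it in zip(choice, items))
        if weight <= capacity and value > best_value:
            best_value, best_choice = value, choice
    return best_value, list(best_choice)


def build_files(instance):
    """Split an instance into what the model sees and what stays hidden."""
    optimum, selection = solve_exactly(instance)
    rows = "\n".join(
        f"{it['name']:<8}{it['weight']:>7}{it['value']:>7}" for it in instance["items"]
    )
    model_input = {
        "text": f"Choose items to maximize total value within capacity {instance['capacity']}.",
        # Assumption: a real benchmark would render this as an image, not text.
        "visual_table": f"{'name':<8}{'weight':>7}{'value':>7}\n{rows}",
    }
    hidden_reference = {"optimal_value": optimum, "selection": selection}
    return model_input, hidden_reference


def matches_reference(candidate_value, hidden_reference, tol=1e-6):
    """Check a candidate objective value against the hidden optimum."""
    return abs(candidate_value - hidden_reference["optimal_value"]) <= tol


instance = generate_instance(seed=0)
model_input, hidden_reference = build_files(instance)
print(json.dumps(model_input, indent=2))
```

Keeping the optimum in a separate hidden file means a model's generated solver code can be run and its objective value compared against the reference without the answer ever appearing in the prompt.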

Key facts

MM-OptBench is a solver-grounded benchmark for multimodal optimization modeling.
Existing optimization modeling benchmarks are almost entirely text-only.
Multimodal optimization modeling includes visual artifacts such as tables, graphs, maps, schedules, and dashboards.
Models must construct both a mathematical formulation and executable solver code.
The framework generates structured optimization instances and verifies them with an exact solver.
The benchmark builds both model-facing inputs and hidden reference files.
Language models are increasingly used to generate optimization formulations and solver code.
The benchmark addresses tasks that arise in operational practice.

Entities

—

Sources

arXiv cs.AI — 2026-05-13