One-Step-Train: Efficient Data Selection for Multimodal Models
Researchers have introduced One-Step-Train (OST), an approach that reframes data selection for large multimodal models (LMMs) as a ranking problem over incremental optimization utility. Unlike prior methods such as LLM-as-a-Judge, which are computationally expensive and hard to interpret, OST scores the added value of each candidate sample via a simulated single-step update on a lightweight proxy. Experiments on the Qwen series across multimodal mathematical reasoning benchmarks show that OST achieves Pareto-optimal efficiency: training on the top-50 subset cuts training costs by 43% and total time by 17%, while outperforming the LLM-as-a-Judge baseline by 1.8 points, effectively addressing the quality-quantity trade-off in synthetic data.
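The core idea, as described, is to score each candidate by how much a single simulated gradient step on that sample alone would help a small proxy model, then keep the highest-utility samples. Below is a minimal sketch of that idea under stated assumptions; names such as `proxy_model`, `val_batch`, and `lr`, as well as the use of validation-loss reduction as the utility signal, are illustrative choices and not the paper's actual implementation.

```python
# Hypothetical sketch of one-step utility scoring (not the paper's code):
# utility of a sample = validation loss before minus after one simulated
# SGD step on that sample alone, taken on a lightweight proxy model.
import copy
import torch

def one_step_utility(proxy_model, sample, val_batch, loss_fn, lr=1e-4):
    """Return the validation-loss drop from one simulated update on `sample`."""
    model = copy.deepcopy(proxy_model)  # keep the shared proxy untouched
    model.train()

    # Simulate a single-step update on this one sample.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    opt.zero_grad()
    loss_fn(model(sample["inputs"]), sample["targets"]).backward()
    opt.step()

    # Measure how much that step helped on a held-out validation batch.
    model.eval()
    with torch.no_grad():
        before = loss_fn(proxy_model(val_batch["inputs"]), val_batch["targets"])
        after = loss_fn(model(val_batch["inputs"]), val_batch["targets"])
    return (before - after).item()  # higher = more useful sample

def select_top_k(proxy_model, candidates, val_batch, loss_fn, k):
    """Rank candidates by one-step utility and return the top-k subset."""
    scored = [(one_step_utility(proxy_model, s, val_batch, loss_fn), s)
              for s in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:k]]
```

Because each score comes from one gradient step on a small proxy rather than a full training run or an LLM judge call, the ranking stays cheap and directly interpretable as expected loss improvement.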
Key facts
- OST reformulates data selection as incremental optimization utility ranking
- OST uses simulated single-step update on lightweight proxy
- Experiments on Qwen series across multimodal mathematical reasoning benchmarks
- Top-50 subset reduces training costs by 43%
- Total time consumption reduced by 17%
- Outperforms LLM-as-a-Judge baseline by 1.8 points
- Addresses quality-quantity trade-off in synthetic data
- Proposed framework is interpretable and computationally efficient