Single LLM System Optimizes Text Across Six Diverse Domains
A single LLM-based system, 'optimize_anything', matches specialized tools across six varied optimization tasks by recasting each problem as the improvement of a text artifact that is evaluated by a scoring function. It supports single-task search, multi-task search with cross-problem transfer, and generalization to unseen inputs. The system raises Gemini Flash's ARC-AGI accuracy from 32.5% to 89.5%, roughly a 2.75x improvement. It also finds scheduling algorithms that cut cloud costs by 40%, generates CUDA kernels of which 87% match or beat PyTorch, and outperforms AlphaEvolve's reported circle packing solution for n=26. Studies in three domains show that actionable side information yields faster convergence and higher final scores than score-only feedback, and that multi-task search outperforms independent optimization of each task.
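The framing described above (a text artifact, a scoring function, and optional side information) can be illustrated with a minimal sketch. Everything here is hypothetical and for illustration only: `optimize_text`, `evaluate`, and `propose` are invented names, not the optimize_anything API, and the rule-based `propose` function stands in for the LLM call that a real system would make. The search strategy shown is plain greedy hill climbing, which may differ from what the system actually uses.

```python
from typing import Callable, Tuple

# Hypothetical type aliases; none of these names come from optimize_anything.
Evaluator = Callable[[str], Tuple[float, str]]  # artifact -> (score, side information)
Proposer = Callable[[str, float, str], str]     # (artifact, score, feedback) -> new artifact


def optimize_text(seed: str, evaluate: Evaluator, propose: Proposer,
                  budget: int = 50) -> Tuple[str, float]:
    """Greedy search: keep a candidate only if it scores strictly higher."""
    best = seed
    best_score, feedback = evaluate(best)
    for _ in range(budget):
        candidate = propose(best, best_score, feedback)
        score, candidate_feedback = evaluate(candidate)
        if score > best_score:
            best, best_score, feedback = candidate, score, candidate_feedback
    return best, best_score


def evaluate(artifact: str) -> Tuple[float, str]:
    """Toy scorer: the artifact is the text of an integer, and 7 is optimal."""
    x = int(artifact)
    score = float(-((x - 7) ** 2))
    if x < 7:
        hint = "increase"
    elif x > 7:
        hint = "decrease"
    else:
        hint = "keep"
    # The hint plays the role of actionable side information.
    return score, hint


def propose(artifact: str, score: float, feedback: str) -> str:
    """Stand-in for an LLM: edits the artifact in the direction the hint suggests."""
    step = {"increase": 1, "decrease": -1, "keep": 0}[feedback]
    return str(int(artifact) + step)


if __name__ == "__main__":
    print(optimize_text("0", evaluate, propose, budget=10))
```

In this toy setup the hint tells the proposer which direction to move, so the search reaches the optimum in seven accepted edits. A score-only variant would have to guess the direction, which gives a rough intuition for why side information could speed up convergence.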
Key facts
- Single LLM-based optimization system matches specialized tools across six domains
- Supports single-task search, multi-task search with cross-problem transfer, and generalization to unseen inputs
- Nearly triples Gemini Flash's ARC-AGI accuracy, from 32.5% to 89.5%
- Finds scheduling algorithms that cut cloud costs by 40%
- Generates CUDA kernels where 87% match or beat PyTorch
- Outperforms AlphaEvolve's reported circle packing solution for n=26
- Actionable side information yields faster convergence and higher final scores than score-only feedback
- Multi-task search outperforms independent optimization
Entities
Institutions
- arXiv
- Gemini
- AlphaEvolve
- PyTorch