Delta-Code Generation: A New Approach for LLM-Based Neural Architecture Search
There’s a fresh approach called Delta-Code Generation that’s set to improve LLM-based neural architecture search. Instead of having a large language model (LLM) write complete network implementations from scratch, it fine-tunes the model to emit short unified diffs that refine a baseline architecture, which cuts down on both cost and complexity. The pipeline fine-tunes LLMs with LoRA on architectures from the LEMUR dataset and applies MinHash-Jaccard filtering to keep the candidate pool diverse. Researchers tested three 7B-class LLMs (DeepSeek-Coder-7B, Qwen2.5-Coder-7B, and Mistral-7B) on six datasets, following a 22-cycle protocol with 1,100 candidates per model. All three outperformed the full-generation baseline (50.6% valid rate, 42.3% mean first-epoch accuracy), with DeepSeek-Coder leading at a 75.3% valid rate and 65.8% mean accuracy. You can check out the study on arXiv under ID 2605.04903.
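To make the format concrete, here is a minimal sketch of what such a delta looks like: a unified diff between a baseline layer block and a refined candidate, produced with Python's standard difflib. The layer definitions are hypothetical illustrations, not architectures from the paper.

```python
import difflib

# Hypothetical baseline block from a baseline architecture (illustration only).
baseline = """\
self.conv1 = nn.Conv2d(3, 32, kernel_size=3, padding=1)
self.act = nn.ReLU()
self.pool = nn.MaxPool2d(2)
"""

# A candidate refinement: widen the first conv and swap the activation.
candidate = """\
self.conv1 = nn.Conv2d(3, 64, kernel_size=3, padding=1)
self.act = nn.GELU()
self.pool = nn.MaxPool2d(2)
"""

# The unified diff is the compact "delta" a fine-tuned LLM would emit,
# instead of regenerating the whole model implementation.
delta = difflib.unified_diff(
    baseline.splitlines(keepends=True),
    candidate.splitlines(keepends=True),
    fromfile="baseline.py",
    tofile="candidate.py",
)
print("".join(delta))
```

Emitting a few-line patch like this is far cheaper to generate and validate than a full model file, which is the intuition behind the higher valid rates reported below.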
Key facts
- Delta-Code Generation uses fine-tuned LLMs to generate unified diffs for refining baseline architectures.
- The method avoids generating complete model implementations from scratch.
- Pipeline uses LoRA fine-tuning on the LEMUR dataset with MinHash-Jaccard novelty filtering (both sketched after this list).
- Evaluated three 7B-class LLMs: DeepSeek-Coder-7B, Qwen2.5-Coder-7B, Mistral-7B.
- Tested on six datasets: CIFAR-10, CIFAR-100, MNIST, SVHN, Imagenette, CelebA.
- 22-cycle protocol with 1,100 candidates per LLM (see the search-loop sketch after this list).
- Full-generation baseline: 50.6% valid rate, 42.3% mean first-epoch accuracy.
- DeepSeek-Coder achieved a 75.3% valid rate and 65.8% mean first-epoch accuracy.
- Published on arXiv with ID 2605.04903.
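Two of the ingredients above are easy to illustrate. First, the LoRA fine-tuning step: below is a minimal sketch of an adapter configuration using Hugging Face's peft library. The rank, alpha, dropout, and target modules are assumptions, since this summary does not give the paper's hyperparameters.

```python
from peft import LoraConfig

# Hypothetical adapter hyperparameters; the paper's actual settings are not
# given in this summary.
lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
# lora_cfg would then be passed to peft.get_peft_model(base_model, lora_cfg)
# before fine-tuning on architectures from the LEMUR dataset.
```

Second, the MinHash-Jaccard novelty filter: a self-contained sketch using token 5-gram shingles and a 0.8 similarity threshold (both assumptions, not the paper's values). A candidate is kept only if its estimated Jaccard similarity to every previously accepted candidate stays below the threshold.

```python
import hashlib
import re

def shingles(code: str, k: int = 5) -> set[str]:
    """Token k-grams ('shingles') of a candidate's source code."""
    tokens = re.findall(r"\w+", code)
    return {" ".join(tokens[i:i + k]) for i in range(max(len(tokens) - k + 1, 1))}

def minhash_signature(items: set[str], num_perm: int = 64) -> list[int]:
    """One seeded hash per 'permutation'; keep the minimum hash over all shingles."""
    return [
        min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{s}".encode(), digest_size=8).digest(), "big"
            )
            for s in items
        )
        for seed in range(num_perm)
    ]

def estimated_jaccard(a: list[int], b: list[int]) -> float:
    """Fraction of agreeing signature slots approximates Jaccard similarity."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def is_novel(candidate_code: str, accepted_sigs: list[list[int]], threshold: float = 0.8) -> bool:
    """Keep a candidate only if it is sufficiently dissimilar to all accepted ones.
    The 0.8 threshold is an assumption, not the paper's value."""
    sig = minhash_signature(shingles(candidate_code))
    return all(estimated_jaccard(sig, prev) < threshold for prev in accepted_sigs)
```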
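Putting the pieces together, here is a sketch of how the 22-cycle protocol could compute the two reported metrics, assuming 50 candidates per cycle (1,100 / 22 = 50) and reusing the MinHash helpers above. Note that propose_diff, apply_unified_diff, and train_one_epoch are hypothetical stand-ins, not the paper's API.

```python
from typing import Optional

# Hypothetical stand-ins for components not specified in this summary.
def propose_diff(baseline_code: str) -> str: ...                    # fine-tuned LLM emits a unified diff
def apply_unified_diff(code: str, diff: str) -> Optional[str]: ...  # returns None if the patch fails
def train_one_epoch(candidate_code: str) -> float: ...              # returns first-epoch accuracy

CYCLES = 22
CANDIDATES_PER_CYCLE = 50  # assumption: 1,100 candidates / 22 cycles

def run_search(baseline_code: str) -> tuple[float, float]:
    """Return (valid rate, mean first-epoch accuracy) over the whole search."""
    valid, accuracies, accepted_sigs = 0, [], []
    for _ in range(CYCLES):
        for _ in range(CANDIDATES_PER_CYCLE):
            candidate = apply_unified_diff(baseline_code, propose_diff(baseline_code))
            if candidate is None:
                continue  # diff failed to apply cleanly: not a valid candidate
            valid += 1
            if not is_novel(candidate, accepted_sigs):  # MinHash filter sketched above
                continue  # near-duplicate of an earlier candidate: skip training
            accepted_sigs.append(minhash_signature(shingles(candidate)))
            accuracies.append(train_one_epoch(candidate))
    total = CYCLES * CANDIDATES_PER_CYCLE
    return valid / total, sum(accuracies) / max(len(accuracies), 1)
```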