ARTFEED — Contemporary Art Intelligence

Model Merging: Combining Neural Networks in Weight Space

publication · 2026-05-06

A new thesis on arXiv (2605.01580) proposes model merging as an alternative to maintaining separate neural networks for related tasks. The approach combines independently trained networks directly in weight space, without access to the original training data and without extensive additional optimization. In the single-task setting, the thesis introduces C$^2$M$^3$, a cycle-consistent merging algorithm based on Frank-Wolfe optimization that aligns multiple networks into a shared parameter space before combining them. For the multi-task setting, where models are fine-tuned from a common pretrained initialization, the thesis develops a theoretical framework for merging. The work challenges the conventional paradigm of treating trained models as isolated artifacts.
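The core idea of weight-space merging can be sketched in a few lines. The snippet below averages the parameters of several networks elementwise; the function name and dict-of-arrays representation are illustrative assumptions, and it presumes the models already sit in a shared parameter space (the kind of alignment C$^2$M$^3$ is designed to produce), since naively averaging unaligned, independently trained networks generally degrades accuracy.

```python
import numpy as np

def merge_weights(models, coeffs=None):
    """Merge networks by (weighted) averaging of parameters in weight space.

    `models` is a list of dicts mapping parameter names to arrays.
    Assumes the models have already been aligned into a shared
    parameter space; plain averaging of unaligned networks is
    shown here only to illustrate the weight-space operation.
    """
    if coeffs is None:
        # Uniform averaging by default.
        coeffs = [1.0 / len(models)] * len(models)
    merged = {}
    for name in models[0]:
        merged[name] = sum(c * m[name] for c, m in zip(coeffs, models))
    return merged

# Toy example: two "networks" with a single weight matrix each.
a = {"w": np.array([[1.0, 2.0], [3.0, 4.0]])}
b = {"w": np.array([[3.0, 2.0], [1.0, 0.0]])}
merged = merge_weights([a, b])
print(merged["w"])  # elementwise mean of the two matrices
```

Note that the merge itself requires no gradients, no training data, and no optimization loop; all of the difficulty lives in the alignment step that precedes it.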

Key facts

  • Thesis on arXiv with ID 2605.01580
  • Proposes model merging as an alternative paradigm
  • Combines neural networks directly in weight space
  • No access to original training data required
  • Introduces C$^2$M$^3$ algorithm for single-task merging
  • C$^2$M$^3$ uses Frank-Wolfe optimization
  • Covers both single-task and multi-task regimes
  • Multi-task setting assumes common pretrained initialization
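The multi-task regime listed above is often formalized in the literature via task vectors: the difference between a fine-tuned model and its shared pretrained initialization. The sketch below, with hypothetical names, shows this common construction; it illustrates the setting the thesis's theoretical framework addresses, not the thesis's exact method.

```python
import numpy as np

def merge_finetuned(pretrained, finetuned_models, alpha=1.0):
    """Merge fine-tuned models that share a pretrained initialization.

    Each task vector is (finetuned - pretrained); the merged model adds
    the mean task vector, scaled by `alpha`, back onto the shared
    initialization. Illustrative sketch only.
    """
    merged = {}
    for name, base in pretrained.items():
        mean_task = sum(m[name] - base for m in finetuned_models) / len(finetuned_models)
        merged[name] = base + alpha * mean_task
    return merged

# Toy example: one shared initialization, two fine-tuned "models".
base = {"w": np.array([0.0, 0.0])}
ft1 = {"w": np.array([2.0, 0.0])}   # fine-tuned on task 1
ft2 = {"w": np.array([0.0, 4.0])}   # fine-tuned on task 2
merged = merge_finetuned(base, [ft1, ft2])
print(merged["w"])  # initialization plus the mean task vector
```

The common initialization is what makes this well-posed: because all fine-tuned models start from the same point in weight space, their differences are directly comparable without the alignment step needed in the single-task case.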

Entities

Institutions

  • arXiv
