ARTFEED — Contemporary Art Intelligence

Symmetry in Overparameterized Networks Improves Optimization

other · 2026-04-30

A recent theoretical study published on arXiv (2604.25150) argues that overparameterization in neural networks creates weight-space symmetries that ease optimization. These symmetries act as diagonal preconditioning on the Hessian, so that each equivalence class of functionally identical solutions contains better-conditioned minima. Overparameterization also places more probability mass of global minima near typical initializations, making them easier to reach. Teacher-student experiments support the theory: as width increases, the Hessian trace decreases, condition numbers improve, and convergence accelerates. Together, the results offer a unified framework for understanding why overparameterization benefits deep-learning optimization.

Key facts

  • Overparameterization introduces additional weight-space symmetries in neural networks.
  • Symmetries act as diagonal preconditioning on the Hessian.
  • Better-conditioned minima exist within each equivalence class of functionally identical solutions.
  • Overparameterization increases the probability mass of global minima near typical initializations.
  • Teacher-student network experiments validate theoretical predictions.
  • As width increases, Hessian trace decreases and condition numbers improve.
  • Convergence accelerates with increased width.
  • The analysis provides a unified framework for understanding overparameterization benefits.
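The weight-space symmetries underlying these results can be illustrated concretely. The sketch below (a minimal NumPy illustration; the network sizes, seed, and scale factor are arbitrary choices, not taken from the paper) shows two standard symmetries of a two-layer ReLU network: permuting hidden units consistently across both layers, and rescaling one unit's input weights by a positive constant while dividing its output weights by the same constant. Both transformations change the weights but leave the network function unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny two-layer ReLU network: f(x) = W2 @ relu(W1 @ x)
d_in, width, d_out = 3, 8, 2
W1 = rng.standard_normal((width, d_in))
W2 = rng.standard_normal((d_out, width))

def forward(x, W1, W2):
    return W2 @ np.maximum(W1 @ x, 0.0)

x = rng.standard_normal(d_in)
out = forward(x, W1, W2)

# Permutation symmetry: reorder hidden units the same way in both layers.
perm = rng.permutation(width)
out_perm = forward(x, W1[perm], W2[:, perm])

# Positive rescaling symmetry of ReLU: scale one unit's incoming weights
# by c > 0 and its outgoing weights by 1/c; ReLU's positive homogeneity
# (relu(c*z) = c*relu(z)) makes the composition identical.
c = 2.5
W1_s, W2_s = W1.copy(), W2.copy()
W1_s[0] *= c
W2_s[:, 0] /= c
out_scaled = forward(x, W1_s, W2_s)

assert np.allclose(out, out_perm)
assert np.allclose(out, out_scaled)
print("both symmetries preserve the network function")
```

Each such symmetry generates a set of weight configurations implementing the same function, and the paper's argument is that wider networks have richer symmetry orbits on which better-conditioned minima can be found.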

Entities

Institutions

  • arXiv

Sources