Spectral Dynamics in Wide Neural Networks: A Two-Level DMFT Analysis
A theoretical framework posted to arXiv (2605.07870) examines how the spectra of hidden weights evolve in wide neural networks trained by gradient descent. The authors develop a two-level dynamical mean-field theory (DMFT) that jointly tracks the bulk spectrum and the outlier eigenvalues of spiked ensembles in which the spike directions are correlated with the random bulk. The framework is applied in two settings: infinite-width nonlinear networks under mean-field/μP scaling, and deep linear networks in the proportional high-dimensional limit. It predicts how spectral outliers evolve with training time, network width, output scale, and initialization variance, and in particular clarifies how outlier dynamics depend on width, shedding light on feature learning in deep networks.
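As background for the "spiked ensemble" terminology, here is a minimal numerical sketch (my own illustration, not code from the paper): adding a rank-one spike of strength θ to a Wigner bulk produces a detached outlier eigenvalue near θ + 1/θ once θ > 1 (the classical BBP transition). Note one simplification relative to the paper: the spike direction below is drawn independently of the bulk, whereas the paper's two-level DMFT treats spikes correlated with the bulk.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000        # matrix size (stands in for network width)
theta = 2.0     # spike strength; an outlier detaches from the bulk for theta > 1

# Bulk: symmetric Gaussian (Wigner) matrix with semicircle spectrum on [-2, 2]
A = rng.standard_normal((n, n))
W = (A + A.T) / np.sqrt(2 * n)

# Rank-one spike along a unit vector v (independent of W here, unlike the paper)
v = rng.standard_normal(n)
v /= np.linalg.norm(v)
M = W + theta * np.outer(v, v)

eigs = np.linalg.eigvalsh(M)   # ascending order
print(f"bulk edge ≈ {eigs[-2]:.3f} (theory: 2), "
      f"outlier ≈ {eigs[-1]:.3f} (BBP: theta + 1/theta = {theta + 1/theta:.3f})")
```

For θ = 2 the outlier sits near 2.5, cleanly separated from the bulk edge at 2; for θ ≤ 1 no such separation occurs.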
Key facts
- The paper is on arXiv with ID 2605.07870.
- It studies spectral dynamics in wide neural networks.
- A two-level dynamical mean-field theory (DMFT) is developed.
- The framework applies to spiked ensembles whose spike directions are correlated with the random bulk.
- Two settings are analyzed: infinite-width nonlinear networks (μP scaling) and deep linear networks (proportional limit).
- Outlier evolution depends on training time, width, output scale, and initialization variance.
- In deep linear networks, μP leads to width-consistent outlier dynamics and hyperparameter transfer.
- NTK parameterization shows strongly width-dependent outlier behavior.
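The μP-vs-NTK contrast in the last two bullets can be illustrated with a toy two-layer network (a sketch under my own assumptions, not the paper's DMFT). Under μP, the readout is scaled by 1/n and the hidden-layer learning rate is scaled up by n, so the relative size of a hidden-weight update stays roughly width-independent; under NTK scaling (1/√n readout, O(1) learning rate) it shrinks as width grows, consistent with width-dependent outlier behavior.

```python
import numpy as np

def relative_update(n, scale, lr, rng, d=64):
    """Relative operator-norm size of one hidden-weight gradient step,
    lr * ||dF/dW|| / ||W||, for the toy model F(x) = scale * a @ tanh(W x)."""
    W = rng.standard_normal((n, d)) / np.sqrt(d)
    a = rng.standard_normal(n)
    x = rng.standard_normal(d) / np.sqrt(d)
    h = np.tanh(W @ x)
    # dF/dW is rank one: scale * (a * (1 - h^2)) x^T
    grad = scale * np.outer(a * (1 - h**2), x)
    return lr * np.linalg.norm(grad, 2) / np.linalg.norm(W, 2)

rng = np.random.default_rng(0)
for n in (256, 1024, 4096):
    mup = relative_update(n, scale=1.0 / n, lr=float(n), rng=rng)      # μP / mean-field
    ntk = relative_update(n, scale=1.0 / np.sqrt(n), lr=1.0, rng=rng)  # NTK
    print(f"n={n:5d}   muP: {mup:.4f}   NTK: {ntk:.4f}")
```

As n grows, the μP column stays roughly constant while the NTK column decays like 1/√n, mirroring the paper's claim of width-consistent outlier dynamics (and hyperparameter transfer) under μP versus strongly width-dependent behavior under NTK parameterization.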