FFT-Diagonalized Neural Network Layers Reduce Parameters
A recent arXiv study (2605.08171) applies the Communication Dynamics (CD) framework, originally developed for atomic-energy prediction and field-induced superconductivity, to the design of neural network layers. The proposed CDLinear layer is a block-circulant linear layer with block size B = 2l+1, using only 1/B of the parameters of a dense layer of the same shape. The discrete Fourier transform diagonalizes the Hessian of the mean-squared loss, and the eigenvalues follow directly from input statistics: they are the input power spectrum |F[X_j](k)|^2. With input pre-whitening, the population Hessian has condition number 1, and the empirical condition number is bounded by 1 + O(sqrt(B/N)), where N is the number of samples. The result is a layer with fewer parameters and better-conditioned, more efficient training.
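The paper itself is not quoted here, but the block-circulant construction it describes can be sketched in NumPy. In this hypothetical sketch (the class name, initialization, and scaling are assumptions, not the authors' code), a dense out_dim × in_dim weight matrix is replaced by a grid of B × B circulant blocks, each stored as a single length-B vector, so the layer holds only 1/B of the dense parameter count; the matvec is done per block in the Fourier domain:

```python
import numpy as np

class BlockCirculantLinear:
    """Sketch of a block-circulant linear layer (hypothetical, CDLinear-style).

    Each B x B block of the dense weight matrix is constrained to be
    circulant, so it is parameterized by one length-B vector instead of
    B*B entries: the layer stores 1/B of the dense parameter count.
    """
    def __init__(self, in_dim, out_dim, B, rng=None):
        assert in_dim % B == 0 and out_dim % B == 0
        rng = rng or np.random.default_rng(0)
        self.B = B
        self.n_in, self.n_out = in_dim // B, out_dim // B
        # One length-B vector per circulant block (first column of the block).
        self.w = rng.normal(size=(self.n_out, self.n_in, B)) / np.sqrt(in_dim)

    def __call__(self, x):
        # x: (..., in_dim). Split into length-B blocks and multiply in the
        # Fourier domain: a circulant matvec is an elementwise product of FFTs.
        xb = x.reshape(*x.shape[:-1], self.n_in, self.B)
        Xf = np.fft.fft(xb, axis=-1)
        Wf = np.fft.fft(self.w, axis=-1)
        # Sum the per-block circulant matvecs over input blocks j.
        Yf = np.einsum('...jk,ijk->...ik', Xf, Wf)
        return np.fft.ifft(Yf, axis=-1).real.reshape(*x.shape[:-1], -1)
```

The FFT route also cuts the per-block matvec cost from O(B^2) to O(B log B), which is why circulant structure is attractive beyond the parameter savings.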
Key facts
- arXiv paper 2605.08171 introduces CDLinear layer
- CD framework originally for atomic-energy and superconductivity
- CDLinear is block-circulant with block size B = 2l+1
- Parameter count is 1/B of dense layer
- Hessian diagonalized by discrete Fourier transform
- Eigenvalues are |F[X_j](k)|^2, i.e. the input power spectrum
- Population Hessian condition number = 1 under pre-whitening
- Empirical condition number bounded by 1+O(sqrt(B/N))
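The Hessian facts above can be checked numerically for a single circulant block (a small sketch under assumed conventions, not the paper's derivation): the output is a circular convolution c ⊛ x, which is linear in the parameters c, so the MSE Hessian in c is S(x)^T S(x) with S(x) circulant in x. The unitary DFT diagonalizes it, its eigenvalues are |F[x](k)|^2, and for white (pre-whitened) inputs the empirical Hessian's condition number concentrates near 1:

```python
import numpy as np

B, N = 8, 4096
rng = np.random.default_rng(0)

def circulant(v):
    # Circulant matrix with first column v.
    return np.array([[v[(i - j) % B] for j in range(B)] for i in range(B)])

x = rng.normal(size=B)
S = circulant(x)           # y = c (*) x = S(x) @ c, linear in c

# Hessian of the MSE loss 0.5 * ||S(x) c - t||^2 with respect to c.
H = S.T @ S

# The unitary DFT diagonalizes H; the eigenvalues are |F[x](k)|^2.
F = np.fft.fft(np.eye(B)) / np.sqrt(B)
D = F @ H @ F.conj().T
power = np.abs(np.fft.fft(x)) ** 2
assert np.allclose(np.diag(D).real, power)
assert np.allclose(D, np.diag(np.diag(D)), atol=1e-9)

# White inputs have a flat expected power spectrum, so the population
# Hessian is a multiple of the identity (condition number 1); the
# empirical Hessian averaged over N samples concentrates around it.
X = rng.normal(size=(N, B))
H_emp = sum(circulant(xi).T @ circulant(xi) for xi in X) / N
eigs = np.linalg.eigvalsh(H_emp)
cond = eigs.max() / eigs.min()
print(f"empirical condition number: {cond:.3f}")  # approaches 1 as N grows
```

Because all circulant blocks share the DFT eigenbasis, the empirical eigenvalues are just per-frequency averages of the sample power spectra, which is where a 1 + O(sqrt(B/N)) style deviation from the population value comes from.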
Entities
Institutions
- arXiv