ARTFEED — Contemporary Art Intelligence

Empirical Study Tests Repulsion Theorem in Neural Network Grokking

other · 2026-05-12

A new empirical study investigates the repulsion theorem proposed by Tian (2025) for two-layer neural networks undergoing grokking, the phenomenon in which a network suddenly generalizes after a long period of overfitting. The theorem predicts that during interactive feature learning, similar features repel each other, leaving a spectral signature in parameter updates. The study tests this prediction on Tian's modular addition setup (M = 71, K = 2048, MSE loss) and finds a structure-mechanism dissociation: the predicted sign rule holds robustly for the top-200 most-similar feature pairs, with the empirical sign-match rate rising from 0.865 to 0.985 for σ = x² (averaged over 5 seeds) and saturating at 1.000 for σ = ReLU. The spectral signature in parameter updates, however, is strongly activation-dependent, indicating that the repulsion mechanism is not always empirically observable. The paper is available on arXiv as 2605.08119.
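The sign rule can be illustrated with a small synthetic check. This is a toy sketch, not the paper's code: the feature matrix, the tiny width (far below the paper's K = 2048), and the value of η below are illustrative stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 8  # toy width; the paper's setup uses K = 2048
F = rng.normal(size=(32, K))  # stand-in feature matrix F̃
F[:, 1] = F[:, 0] + 0.05 * rng.normal(size=32)  # plant a highly similar pair

# B = (F̃ᵀF̃ + ηI)⁻¹; the theorem predicts B[j, l] < 0 for similar features j, l
eta = 1e-2
B = np.linalg.inv(F.T @ F + eta * np.eye(K))
print(B[0, 1])  # off-diagonal entry for the planted similar pair
```

Because columns 0 and 1 are nearly collinear, the corresponding off-diagonal entry of the inverse Gram matrix comes out negative, which is the repulsive coupling the theorem describes.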

Key facts

  • Tian (2025) proved a repulsion theorem for the matrix B = (F̃ᵀF̃ + ηI)⁻¹ during grokking.
  • The theorem predicts negative off-diagonal entries Bⱼₗ for similar features, causing repulsion.
  • The study tests the theorem on Tian's modular addition setup with M=71, K=2048, MSE loss.
  • The empirical sign-match rate for the top-200 most-similar feature pairs rose from 0.865 to 0.985 for σ = x² (across 5 seeds).
  • The sign-match rate saturated at 1.000 for σ = ReLU.
  • The spectral signature in parameter updates is strongly activation-dependent.
  • The study reveals a structure-mechanism dissociation in the repulsion effect.
  • The paper was published on arXiv with ID 2605.08119.
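The sign-match statistic reported above can be sketched as the fraction of the most-similar feature pairs whose entry in B = (F̃ᵀF̃ + ηI)⁻¹ is negative. Everything below (cosine similarity as the similarity measure, the helper name `sign_match_rate`, η, and the toy sizes) is an illustrative assumption, not the authors' implementation.

```python
import numpy as np
from itertools import combinations

def sign_match_rate(F, eta=1e-2, top_n=200):
    """Fraction of the top_n most-similar column pairs of F (by cosine
    similarity) with a negative off-diagonal entry in B = (FᵀF + ηI)⁻¹.
    Illustrative reconstruction; the paper's exact metric may differ."""
    K = F.shape[1]
    G = F.T @ F
    B = np.linalg.inv(G + eta * np.eye(K))
    norms = np.linalg.norm(F, axis=0)
    cos = G / np.outer(norms, norms)
    pairs = sorted(combinations(range(K), 2),
                   key=lambda jl: cos[jl], reverse=True)[:top_n]
    return float(np.mean([B[j, l] < 0 for j, l in pairs]))

# Toy check: a planted near-duplicate pair is the most similar pair,
# and its B entry is negative, so the rate for the top pair is 1.0.
rng = np.random.default_rng(1)
F = rng.normal(size=(32, 8))
F[:, 1] = F[:, 0] + 0.05 * rng.normal(size=32)
print(sign_match_rate(F, top_n=1))
```

The study's headline numbers track this kind of rate over training for the top-200 pairs, under σ = x² and σ = ReLU.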

Entities

Institutions

  • arXiv
