ARTFEED — Contemporary Art Intelligence

Research Paper Introduces TDU-OFC Method to Analyze Grokking Transitions in Neural Networks

ai-technology · 2026-04-22

The arXiv paper "Dimensional Criticality at Grokking Across MLPs and Transformers" (identifier 2604.16431v1) presents TDU-OFC (Thresholded Diffusion Update–Olami-Feder-Christensen), a method for studying grokking: the transition from memorization to generalization that occurs after training accuracy has already peaked. The method is an offline avalanche probe that converts gradient snapshots into cascade statistics, from which a time-resolved effective cascade dimension D(t) is extracted. In experiments on Transformers trained on modular addition and MLPs (Multilayer Perceptrons) trained on XOR problems, D(t) crosses the Gaussian diffusion baseline D=1 in a localized window at the generalization transition, and the crossing reportedly varies with the task. The study treats grokking as an instance of the abrupt transitions seen in complex systems and aims to improve understanding of critical behavior in neural network training.
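
To make the idea of turning gradient snapshots into cascade statistics more concrete, here is a toy sketch of an Olami-Feder-Christensen-style relaxation. It is not the paper's algorithm: the 1D ring layout, the use of absolute gradient values as the drive, the threshold, the transfer fraction alpha, and the function names relax and avalanche_sizes are all assumptions made for illustration. The estimation of D(t) from the resulting statistics is not attempted.

```python
def relax(stress, threshold=1.0, alpha=0.2):
    """Relax a 1D ring of stress values in place, OFC style.

    Any site at or above `threshold` topples: it resets to zero and passes
    a fraction `alpha` of its stress to each of its two ring neighbors.
    Returns the avalanche size, counted as the number of topplings.
    (Ring geometry and parameter values are illustrative assumptions.)
    """
    n = len(stress)
    size = 0
    unstable = [i for i, s in enumerate(stress) if s >= threshold]
    while unstable:
        i = unstable.pop()
        if stress[i] < threshold:
            continue  # stale entry: this site has already toppled
        released = stress[i]
        stress[i] = 0.0
        size += 1
        for j in ((i - 1) % n, (i + 1) % n):
            stress[j] += alpha * released
            if stress[j] >= threshold:
                unstable.append(j)
    return size


def avalanche_sizes(snapshots, threshold=1.0, alpha=0.2):
    """Drive the ring with successive gradient snapshots.

    Each snapshot is a list of floats, one per site. Using abs(gradient)
    as the added stress is an assumption, not the paper's mapping.
    Returns one avalanche size per snapshot.
    """
    stress = [0.0] * len(snapshots[0])
    sizes = []
    for grad in snapshots:
        for i, g in enumerate(grad):
            stress[i] += abs(g)
        sizes.append(relax(stress, threshold, alpha))
    return sizes
```

With alpha below 0.5 each toppling dissipates stress, so every avalanche ends after finitely many steps. The per-snapshot avalanche sizes are the kind of cascade statistic that a time-resolved measure could then be computed from.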

Key facts

  • Research paper published on arXiv with identifier 2604.16431v1
  • Introduces TDU-OFC method to analyze grokking transitions in neural networks
  • Grokking describes abrupt transition from memorization to generalization
  • Method converts gradient snapshots into cascade statistics
  • Extracts time-resolved effective cascade dimension D(t)
  • Experiments conducted on Transformers trained on modular addition
  • Experiments conducted on MLPs trained on XOR problems
  • Discovered localized dynamical crossing of Gaussian diffusion baseline D=1 at generalization transition

Entities

Institutions

  • arXiv

Sources