Latent Algorithmic Structure Precedes Grokking: A Mechanistic Study of ReLU MLPs on Modular Arithmetic
arXiv:2603.23784v1 Announce Type: new Abstract: Grokking, the phenomenon where validation accuracy of neural networks on modular addition of two integers rises long after the training data has been memorized, has been characterized in previous works as producing sinusoidal input weight distributions in transformers and multi-layer perceptrons (MLPs). We find empirically that ReLU MLPs in our experimental setting instead learn near-binary square wave input weights, where intermediate-valued weights appear exclusively near sign-change boundaries, alongside output weight distributions whose dominant Fourier phases satisfy a phase-sum relation $\phi_{\mathrm{out}} = \phi_a + \phi_b$; this relation holds even when the model is trained on noisy data and fails to grok. We extract the frequency and phase of each neuron's weights via DFT and construct an idealized MLP: input weights are replaced by perfect binary square waves and output weights by cosines, both parametrized by the frequencies, phases, and amplitudes extracted from the dominant Fourier components of the real model weights. This idealized model achieves 95.5% accuracy when the frequencies and phases are extracted from the weights of a model trained on noisy data that itself achieves only 0.23% accuracy. This suggests that grokking does not discover the correct algorithm, but rather sharpens an algorithm substantially encoded during memorization, progressively binarizing the input weights into cleaner square waves and aligning the output weights, until generalization becomes possible.
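As a rough illustration of the DFT extraction step described in the abstract, the sketch below pulls the dominant frequency, phase, and amplitude out of a synthetic near-square-wave weight row. The modulus `p`, the helper name `extract_dominant_component`, and the phase convention are our own assumptions for illustration, not the paper's code.

```python
import numpy as np

p = 97  # illustrative modulus; the paper's choice may differ

def extract_dominant_component(w):
    """Return (frequency k, phase phi, amplitude) of the largest
    nonzero-frequency DFT component of weight vector w, using the
    convention w[x] ~ amp * cos(2*pi*k*x/p - phi)."""
    W = np.fft.rfft(w)
    W[0] = 0.0                       # ignore the DC (mean) component
    k = int(np.argmax(np.abs(W)))    # dominant frequency index
    phi = float(-np.angle(W[k]))     # phase in the cos(... - phi) convention
    amp = 2.0 * np.abs(W[k]) / len(w)
    return k, phi, amp

# Sanity check on a synthetic binary square-wave weight row:
x = np.arange(p)
w = np.sign(np.cos(2 * np.pi * 5 * x / p - 0.7))
k, phi, amp = extract_dominant_component(w)
```

For a binary square wave, the recovered amplitude sits near 4/pi (about 1.27), the fundamental coefficient of a unit square wave, which is one way such near-binary structure shows up in the DFT.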
Executive Summary
This study sheds new light on the phenomenon of grokking in neural networks by examining the latent algorithmic structure of ReLU multi-layer perceptrons (MLPs) on modular arithmetic. The authors find that these networks learn near-binary square wave input weights and output weight distributions that satisfy a phase-sum relation, even when trained on noisy data. The study's idealized model demonstrates that grokking does not discover the correct algorithm, but rather sharpens an algorithm that is already substantially encoded during memorization. This research has significant implications for understanding neural network generalization and highlights the importance of latent structure in the learning process.
Key Points
- ▸ ReLU MLPs learn near-binary square wave input weights
- ▸ Output weight distributions satisfy a phase-sum relation
- ▸ Grokking does not discover the correct algorithm but sharpens a pre-existing one
Merits
Strength in Identifying Latent Structure
The study successfully extracted the frequency and phase of each neuron's weights via DFT and constructed an idealized MLP, providing valuable insights into the latent structure of neural networks.
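The idealized construction the study uses can be sketched roughly as follows: each hidden neuron's input weights become perfect binary square waves at the extracted frequency, and its output weights become a cosine whose phase obeys the phase-sum relation $\phi_{\mathrm{out}} = \phi_a + \phi_b$. The function name, modulus, and exact parametrization below are our assumptions, not the authors' implementation.

```python
import numpy as np

p = 97  # illustrative modulus

def idealized_neuron(k, phi_a, phi_b, amp=1.0):
    """Build idealized weights for one hidden neuron: binary square-wave
    input weights for each operand, and cosine output weights whose
    phase satisfies phi_out = phi_a + phi_b (our reading of the paper)."""
    x = np.arange(p)
    w_a = np.sign(np.cos(2 * np.pi * k * x / p - phi_a))    # square wave, input a
    w_b = np.sign(np.cos(2 * np.pi * k * x / p - phi_b))    # square wave, input b
    phi_out = phi_a + phi_b                                  # phase-sum relation
    w_out = amp * np.cos(2 * np.pi * k * x / p - phi_out)    # cosine output weights
    return w_a, w_b, w_out

w_a, w_b, w_out = idealized_neuron(k=3, phi_a=0.5, phi_b=1.0)
```

In the full idealized MLP, many such neurons (one per extracted frequency/phase pair) are assembled, which is how the paper can test whether the algorithm is already latent in a memorizing model's weights.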
Contributions to Understanding Generalization
The research highlights the importance of latent structure in the learning process and demonstrates that grokking is not a discovery of the correct algorithm, but rather a refinement of a pre-existing one.
Demerits
Limitation in Generalizability
The study's findings may not be generalizable to other neural network architectures or tasks, and further research is needed to confirm the results.
Simplicity of Idealized Model
The idealized model, although it achieves high accuracy, is deliberately simplistic and may not capture the full complexity of real-world neural networks.
Expert Commentary
This study makes a significant contribution to the field of neural networks by providing a detailed analysis of the latent algorithmic structure of ReLU MLPs on modular arithmetic. The authors' use of DFT to extract the frequency and phase of each neuron's weights provides valuable insights into the learning process and highlights the importance of latent structure in generalization. The study's idealized model, although deliberately simplified, shows that the algorithmic structure needed for generalization is substantially present before grokking occurs, supporting the view that grokking refines pre-existing structure rather than discovering it. Further research is needed to confirm the generalizability of the study's findings and to explore the implications of latent structure in other neural network architectures and tasks.
Recommendations
- ✓ Future research should focus on exploring the generalizability of the study's findings to other neural network architectures and tasks.
- ✓ The development of more efficient and effective methods for discovering and refining latent structure in neural networks is a critical area of research that holds significant promise for improving generalization and transfer learning capabilities.
Sources
Original: arXiv - cs.LG