
Neural collapse in the orthoplex regime

arXiv:2603.20587v1 Abstract: When training a neural network for classification, the feature vectors of the training set are known to collapse to the vertices of a regular simplex, provided the dimension $d$ of the feature space and the number $n$ of classes satisfy $n\leq d+1$. This phenomenon is known as neural collapse. For other applications, such as language models, one instead takes $n\gg d$. Here, the neural collapse phenomenon still occurs, but with different emergent geometric figures. We characterize these geometric figures in the orthoplex regime where $d+2\leq n\leq 2d$. The techniques in our analysis primarily involve Radon's theorem and convexity.

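To make the two geometries concrete, the following minimal Python sketch (not from the paper; the dimension d = 4 and the use of NumPy are illustrative choices) constructs the regular-simplex configuration for n = d + 1 and the orthoplex vertices for n = 2d, then inspects their pairwise inner products:

    import numpy as np

    d = 4  # feature dimension (arbitrary illustrative choice)

    # Orthoplex (cross-polytope) vertices: the 2d signed standard basis
    # vectors +/- e_i. Distinct vertices have inner product 0, except
    # antipodal pairs, which have inner product -1.
    orthoplex = np.vstack([np.eye(d), -np.eye(d)])
    print(np.round(orthoplex @ orthoplex.T, 3))

    # Regular simplex with n = d + 1 classes: project the standard basis
    # of R^(d+1) onto the hyperplane orthogonal to the all-ones vector and
    # normalize. The vectors span a d-dimensional subspace, and every pair
    # of distinct vectors has inner product -1/(n-1) = -1/d.
    n = d + 1
    simplex = np.eye(n) - np.ones((n, n)) / n
    simplex /= np.linalg.norm(simplex, axis=1, keepdims=True)
    print(np.round(simplex @ simplex.T, 3))

The contrast is the point: the simplex spreads n ≤ d + 1 vectors at equal pairwise angles, while the orthoplex accommodates up to 2d classes by pairing each direction with its antipode.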

Executive Summary

This article examines the phenomenon of neural collapse in the orthoplex regime, where the number of classes n exceeds the dimension d of the feature space, specifically d+2 ≤ n ≤ 2d. The authors use Radon's theorem and convexity to characterize the geometric figures that emerge in this regime. Building on the classical result that feature vectors collapse to the vertices of a regular simplex when n ≤ d+1, the study extends the understanding of neural collapse to settings with more classes than the simplex can accommodate. The findings are relevant to the design and training of networks whose class count exceeds the feature dimension, language models being the motivating example.
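
In the paper's notation, the two regimes can be restated as follows; the simplex Gram matrix shown here is the standard equiangular structure from the neural collapse literature, included for orientation rather than quoted from this paper:

    % Simplex regime (n <= d+1): normalized class means collapse to the
    % vertices of a regular simplex, with pairwise inner products
    \[
      \langle \mu_i, \mu_j \rangle =
      \begin{cases}
        1, & i = j, \\
        -\tfrac{1}{n-1}, & i \neq j.
      \end{cases}
    \]
    % Orthoplex regime (d+2 <= n <= 2d): the reference figure is the
    % d-dimensional orthoplex (cross-polytope), with vertex set
    \[
      \{ \pm e_1, \dots, \pm e_d \} \subset \mathbb{R}^d.
    \]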

Key Points

  • Neural collapse still occurs in the orthoplex regime, where d+2 ≤ n ≤ 2d, but with geometric figures different from the regular simplex.
  • The authors employ Radon's theorem (stated after this list) and convexity to characterize the emergent geometric figures.
  • The study builds on previous work and expands the understanding of neural collapse.
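
Because Radon's theorem carries much of the analysis, its classical statement is worth recalling (this is the standard theorem, not a result of the paper):

    \[
      \text{Any } x_1, \dots, x_{d+2} \in \mathbb{R}^d \text{ admit a partition }
      I \sqcup I^c = \{1, \dots, d+2\} \text{ with }
      \operatorname{conv}\{ x_i : i \in I \} \cap
      \operatorname{conv}\{ x_i : i \in I^c \} \neq \varnothing.
    \]

Note that d+2 points are exactly where Radon's theorem first applies, which plausibly explains why the orthoplex regime begins at n = d+2.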

Merits

Strength

The article provides a rigorous characterization of the geometric figures that emerge in the orthoplex regime, extending the theory of neural collapse beyond the classical simplex setting n ≤ d+1.

Demerits

Limitation

The analysis is confined to the orthoplex regime d+2 ≤ n ≤ 2d, so it does not directly cover regimes with even more classes (n > 2d), which may limit its applicability to other settings.

Expert Commentary

This article makes a substantive contribution to the geometry of neural networks, providing new insight into how neural collapse manifests in the orthoplex regime. The authors' use of Radon's theorem and convexity to characterize the emergent geometric figures is an elegant application of classical tools from discrete geometry. The findings bear on the development and training of classifiers with many classes, language models among them, although the restriction to d+2 ≤ n ≤ 2d leaves the n > 2d regime for future work. Even so, the study is a clear step toward a deeper understanding of neural collapse and its implications for AI systems.

Recommendations

  • Future studies should characterize the collapse geometry in regimes beyond the orthoplex range, in particular n > 2d, and in other domains.
  • The use of Radon's theorem and convexity should be further explored in the context of neural network geometry.

Sources

Original: arXiv - cs.LG