Deep Neural Regression Collapse
arXiv:2603.23805v1. Abstract: Neural Collapse is a phenomenon that helps identify sparse and low rank structures in deep classifiers. Recent work has extended the definition of neural collapse to regression problems, albeit only measuring the phenomenon at the last layer. In this paper, we establish that Neural Regression Collapse (NRC) also occurs below the last layer across different types of models. We show that in the collapsed layers of neural regression models, features lie in a subspace that corresponds to the target dimension, the feature covariance aligns with the target covariance, the input subspace of the layer weights aligns with the feature subspace, and the linear prediction error of the features is close to the overall prediction error of the model. In addition to establishing Deep NRC, we also show that models that exhibit Deep NRC learn the intrinsic dimension of low rank targets and explore the necessity of weight decay in inducing Deep NRC. This paper provides a more complete picture of the simple structure learned by deep networks in the context of regression.
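The four collapse criteria in the abstract are all measurable quantities. The sketch below (a rough NumPy illustration, not the authors' code; `features`, `targets`, and `model_mse` are hypothetical names) shows one way to probe a single hidden layer: it estimates how much feature variance a k-dimensional subspace captures, how closely the top feature directions align with the directions most correlated with the targets, and how close a linear probe's error is to the full model's error.

```python
import numpy as np

def nrc_diagnostics(features, targets, model_mse):
    """Rough Neural Regression Collapse diagnostics for one layer.

    features  : (n, d) array of layer activations (hypothetical name)
    targets   : (n, k) array of regression targets
    model_mse : mean squared error of the full network on the same samples
    """
    n, d = features.shape
    k = targets.shape[1]

    # Center features and targets.
    F = features - features.mean(axis=0, keepdims=True)
    Y = targets - targets.mean(axis=0, keepdims=True)

    # (i) Fraction of feature variance captured by a k-dimensional subspace.
    # Collapse predicts this approaches 1 when the targets are k-dimensional.
    _, s, Vt = np.linalg.svd(F, full_matrices=False)
    var_top_k = (s[:k] ** 2).sum() / (s ** 2).sum()

    # (ii) Cosines of the principal angles between the top-k feature directions
    # and the directions most correlated with the targets; values near 1 indicate
    # the feature covariance lines up with the target covariance.
    feat_basis = Vt[:k].T                                        # (d, k)
    cross_basis = np.linalg.svd(F.T @ Y, full_matrices=False)[0]  # (d, k)
    alignment = np.linalg.svd(feat_basis.T @ cross_basis, compute_uv=False)

    # (iii) Error of a linear probe on the features; under collapse this is
    # close to the full model's own prediction error.
    W, *_ = np.linalg.lstsq(F, Y, rcond=None)
    probe_mse = np.mean((F @ W - Y) ** 2)

    return {
        "variance_in_top_k": float(var_top_k),
        "principal_angle_cosines": alignment,
        "probe_mse": float(probe_mse),
        "model_mse": float(model_mse),
    }
```

Under Deep NRC one would expect the variance fraction and principal-angle cosines to approach 1 in the collapsed layers, and the probe error to approach the model's error.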
Executive Summary
This article presents Deep Neural Regression Collapse (Deep NRC), the finding that the collapse structure previously measured only at the last layer of regression networks also appears in earlier layers: features in collapsed layers occupy a subspace matching the target dimension, and their covariance aligns with the target covariance. The authors demonstrate that Deep NRC occurs across different types of models, that models exhibiting it learn the intrinsic dimension of low-rank targets, and that weight decay plays a central role in inducing it. The study provides a more complete picture of the simple structure learned by deep networks in regression tasks and has implications for the design of more efficient and effective neural networks.
Key Points
- ▸ Deep Neural Regression Collapse (Deep NRC) extends Neural Regression Collapse below the last layer: in collapsed layers, features lie in a subspace matching the target dimension and their covariance aligns with the target covariance.
- ▸ Deep NRC occurs across different types of models, including fully connected and convolutional neural networks.
- ▸ The paper explores the necessity of weight decay in inducing Deep NRC (see the training sketch after this list).
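As a concrete, hypothetical setup for the last two points, the sketch below trains a small fully connected regressor with weight decay on synthetic targets of intrinsic dimension k = 3. It is a minimal illustration of the kind of experiment described, not the paper's actual architecture or hyperparameters.

```python
import torch
import torch.nn as nn

# Synthetic regression data whose targets have low intrinsic dimension:
# a k-dimensional latent signal embedded linearly into a d_out-dimensional output.
n, d_in, k, d_out = 4096, 64, 3, 16
torch.manual_seed(0)
X = torch.randn(n, d_in)
latent = X @ torch.randn(d_in, k)       # k-dimensional signal
Y = latent @ torch.randn(k, d_out)      # rank-k targets in R^{d_out}

# Plain fully connected regressor (hypothetical architecture, not the paper's).
model = nn.Sequential(
    nn.Linear(d_in, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, d_out),
)

# Weight decay is the regularizer the paper links to the emergence of Deep NRC;
# here it is simply the optimizer's weight_decay coefficient.
opt = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9, weight_decay=5e-4)
loss_fn = nn.MSELoss()

for step in range(2000):
    opt.zero_grad()
    loss = loss_fn(model(X), Y)
    loss.backward()
    opt.step()
```

After training, a diagnostic like nrc_diagnostics above can be applied to each hidden layer's activations to check whether their effective dimension matches k, and the run can be repeated with weight_decay=0 to probe how much the regularizer matters for collapse.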
Merits
Strength
The study provides a comprehensive understanding of the simple structure learned by deep networks in regression tasks, which can inform the design of more efficient and effective neural networks.
Strength
The authors show that models exhibiting Deep NRC learn the intrinsic dimension of low-rank targets, a property that may benefit the network's generalization.
Demerits
Limitation
The study focuses on regression tasks and may not generalize to other types of problems, such as classification.
Expert Commentary
The study makes a significant contribution to the field of deep learning by shedding light on the simple structure learned by deep networks in regression tasks. The authors show that Deep NRC occurs across different types of models and that models exhibiting it learn the intrinsic dimension of low-rank targets. The study also examines the role of weight decay in inducing Deep NRC. However, its focus on regression tasks may limit how far the findings transfer to other problem settings. Nevertheless, the results have meaningful implications for the design of more efficient and effective neural networks, particularly for regression.
Recommendations
- ✓ Future research should explore whether analogous deep collapse structure arises in other settings, such as classification.
- ✓ Researchers should investigate the use of weight decay as a regularization technique for improving regression performance in deep neural networks.
Sources
Original: arXiv - cs.LG