Deep Neural Regression Collapse
Summary
The paper "Deep Neural Regression Collapse" by Rangamani and Unal establishes that Neural Regression Collapse (NRC) occurs not only at the last layer of deep regression models but also in layers below it, across a range of architectures. The authors define four conditions for Deep NRC: Noise Suppression (NRC1), where features lie in a subspace corresponding to the target dimension; Signal-Target Alignment (NRC2), where the feature covariance aligns with the target covariance; Feature-Weight Alignment (NRC3), where the input subspace of a layer's weights aligns with the feature subspace; and Linear Predictability (NRC4), where the error of a linear predictor fitted on a layer's features is close to the prediction error of the full model. Experiments on synthetic data, MuJoCo imitation learning datasets (Swimmer, Reacher, Hopper), and image datasets (Carla2D, UTKFace), using MLPs and ResNets, show that models exhibiting Deep NRC learn the intrinsic dimension of low-rank targets. The study also examines weight decay and finds it necessary for inducing Deep NRC.
Key takeaway
For research scientists developing or analyzing deep regression models, Deep Neural Regression Collapse (Deep NRC) offers a concrete way to inspect what a network has learned. In collapsed layers, including intermediate ones, the internal representations reduce to a low-rank structure aligned with the target. Review the role of weight decay in your training regimen: the paper finds it necessary for inducing this collapse, and models that exhibit Deep NRC capture the intrinsic dimension of low-rank targets rather than merely memorizing them.
Key insights
Deep Neural Regression Collapse extends beyond the last layer, revealing structured feature learning in deep networks.
Principles
- Deep networks learn intrinsic target dimensions.
- Weight decay is crucial for inducing Deep NRC.
- NRC conditions characterize information flow in layers.
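The paper's weight-decay findings concern deep networks. As a much simpler intuition aid, the sketch below uses a single linear layer trained by full-batch gradient descent; this toy setup, the function name `train_linear`, and all hyperparameters are assumptions for illustration, not the authors' experiments. It shows one well-known mechanism: the data gradient never touches weight components orthogonal to the input subspace, so only the decay term can shrink them.

```python
import numpy as np

def train_linear(X, Y, weight_decay, lr=0.05, steps=2000, seed=0):
    """Full-batch gradient descent on MSE with an L2 weight-decay term.

    Hypothetical toy helper: a single linear map W, randomly initialized.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], Y.shape[1]))
    n = X.shape[0]
    for _ in range(steps):
        # Gradient of (1 / 2n) * ||X W - Y||_F^2 with respect to W.
        grad = X.T @ (X @ W - Y) / n
        # Weight decay adds a pull toward zero on every weight.
        W -= lr * (grad + weight_decay * W)
    return W

# Toy data: 8-dimensional inputs that only vary inside a 2-dimensional subspace.
rng = np.random.default_rng(1)
B, _ = np.linalg.qr(rng.normal(size=(8, 2)))   # orthonormal basis, shape (8, 2)
X = rng.normal(size=(200, 2)) @ B.T
Y = X @ rng.normal(size=(8, 1))

P_null = np.eye(8) - B @ B.T                   # projector onto unused input directions
for wd in (0.0, 0.1):
    W = train_linear(X, Y, weight_decay=wd)
    print(f"weight_decay={wd}: unused-direction weight norm = "
          f"{np.linalg.norm(P_null @ W):.2e}")
```

Without decay, the weight mass in the unused directions stays at its initial value; with decay it shrinks geometrically. Whether the same mechanism accounts for Deep NRC in nonlinear networks is a question for the paper itself, not something this toy settles.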
Method
The authors define four conditions (NRC1-NRC4) to measure Deep Neural Regression Collapse across layers, using metrics like noise component, Centered Kernel Alignment (CKA), principal angles between subspaces (PABS), and linear prediction MSE.
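Metrics of this kind can be computed from a layer's feature matrix with a few lines of NumPy. The sketch below is one plausible formulation, not the paper's exact definitions: the function names, the top-k variance split used for the noise component, and the affine least-squares probe are all assumptions made for illustration. Linear CKA follows the standard formula for centered feature matrices, and principal angles are obtained from the singular values of the product of two orthonormal bases.

```python
import numpy as np

def noise_fraction(H, k):
    """Share of centered feature variance outside the top-k principal directions."""
    Hc = H - H.mean(axis=0)
    s = np.linalg.svd(Hc, compute_uv=False)
    return float((s[k:] ** 2).sum() / (s ** 2).sum())

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two (n_samples, dim) matrices."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    cross = np.linalg.norm(X.T @ Y, "fro") ** 2
    return float(cross / (np.linalg.norm(X.T @ X, "fro")
                          * np.linalg.norm(Y.T @ Y, "fro")))

def principal_angles(A, B):
    """Principal angles (radians) between the column spaces of A and B.

    Assumes both matrices have full column rank.
    """
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(s, -1.0, 1.0))

def linear_probe_mse(H, Y):
    """Training MSE of the least-squares affine map from features H to targets Y."""
    H1 = np.hstack([H, np.ones((H.shape[0], 1))])   # append a bias column
    W, *_ = np.linalg.lstsq(H1, Y, rcond=None)
    return float(np.mean((H1 @ W - Y) ** 2))
```

In a layer-wise analysis, `H` would be the activations of one layer over a dataset and `Y` the regression targets. For a feature-weight comparison, one option is to pass the top right-singular vectors of the next layer's weight matrix and the top principal directions of `H` to `principal_angles`; how many directions to keep is a choice this sketch leaves open.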
In practice
- Use weight decay to promote Deep NRC.
- Analyze NRC metrics to identify collapsed layers.
- Infer intrinsic target dimensionality from collapsed layers.
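One simple way to read a dimension estimate off a collapsed layer is to count how many principal components of its features are needed to explain most of the variance. The helper below is a hedged sketch: the name `estimated_dimension` and the 99% threshold are assumptions for illustration, and the paper may use a different estimator.

```python
import numpy as np

def estimated_dimension(H, variance_threshold=0.99):
    """Smallest number of principal components whose cumulative share of
    centered feature variance reaches `variance_threshold`."""
    Hc = H - H.mean(axis=0)
    s = np.linalg.svd(Hc, compute_uv=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    # First index where the cumulative share reaches the threshold, as a count.
    return int(np.searchsorted(explained, variance_threshold) + 1)
```

Applied to features from each layer in turn, a value that settles at the same small number across several layers would be consistent with those layers having collapsed onto a subspace of that dimension. The estimate is sensitive to the threshold, so it is worth checking a few values before drawing conclusions.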
Topics
- Neural Regression Collapse
- Deep Learning Regression
- Feature Learning
- Weight Decay
- Intrinsic Dimension Learning
Best for: Research Scientist, AI Researcher, AI Scientist, Deep Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.NE updates on arXiv.org.