Deep Neural Regression Collapse

Source: cs.NE updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Data Science & Analytics, Robotics & Autonomous Systems) · Depth: Expert, extended

Summary

The paper "Deep Neural Regression Collapse" by Rangamani and Unal establishes that Neural Regression Collapse (NRC) occurs not only at the last layer of deep regression models but also in the layers below it, across various architectures. The authors define four conditions for Deep NRC: Noise Suppression (NRC1), where features lie in a subspace corresponding to the target dimension; Signal-Target Alignment (NRC2), where the feature covariance aligns with the target covariance; Feature-Weight Alignment (NRC3), where the input subspace of the layer weights aligns with the feature subspace; and Linear Predictability (NRC4), where the error of a linear predictor fit on a layer's features is close to the overall model's prediction error. Experiments on synthetic data, MuJoCo imitation learning datasets (Swimmer, Reacher, Hopper), and image datasets (Carla2D, UTKFace) using MLPs and ResNets show that models exhibiting Deep NRC learn the intrinsic dimension of low-rank targets. The study also highlights the necessity of weight decay in inducing Deep NRC.
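
The NRC1 and NRC4 conditions can be made concrete with a small NumPy sketch. It is an illustration under stated assumptions, not the paper's exact metric definitions: `noise_fraction` and `linear_probe_mse` are hypothetical helper names, the noise measure is assumed here to be the share of centered feature variance outside the top principal directions (as many as the target dimension), and the synthetic "collapsed" features are simply targets pushed through a random linear map plus small noise.

```python
import numpy as np


def noise_fraction(features, target_dim):
    """Share of centered feature variance outside the top `target_dim` principal directions.

    Assumed NRC1-style measure; the paper's normalization may differ.
    """
    centered = features - features.mean(axis=0, keepdims=True)
    # Squared singular values are proportional to the variance along each principal direction.
    variances = np.linalg.svd(centered, compute_uv=False) ** 2
    total = variances.sum()
    return float(variances[target_dim:].sum() / total) if total > 0 else 0.0


def linear_probe_mse(features, targets):
    """Mean squared error of the least-squares affine map from features to targets."""
    design = np.hstack([features, np.ones((features.shape[0], 1))])  # append a bias column
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return float(np.mean((design @ coef - targets) ** 2))


rng = np.random.default_rng(0)
targets = rng.normal(size=(500, 2))            # 2-dimensional regression targets
mixing = rng.normal(size=(2, 16))              # hypothetical map into a 16-d feature space
collapsed = targets @ mixing + 1e-3 * rng.normal(size=(500, 16))
uncollapsed = rng.normal(size=(500, 16))       # features unrelated to the targets

print("noise fraction (collapsed):    ", noise_fraction(collapsed, target_dim=2))
print("noise fraction (uncollapsed):  ", noise_fraction(uncollapsed, target_dim=2))
print("linear probe MSE (collapsed):  ", linear_probe_mse(collapsed, targets))
print("linear probe MSE (uncollapsed):", linear_probe_mse(uncollapsed, targets))
```

In a real analysis, `features` would hold one layer's activations over a dataset, and the probe error would be compared against the full model's MSE, as NRC4 describes.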

Key takeaway

For research scientists developing or analyzing deep regression models, understanding Deep Neural Regression Collapse (Deep NRC) is critical. Your model's internal representations are likely collapsing to a low-rank structure aligned with the target, even in intermediate layers. Investigate the role of weight decay in your training regimens: the study finds it necessary for inducing this collapse, and models that exhibit Deep NRC capture the intrinsic dimension of low-rank targets, which suggests they are learning the targets' structure rather than merely memorizing them.
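
One mechanism by which weight decay can matter is easy to see in a toy setting. The sketch below is an illustration, not an experiment from the paper: it trains a single linear layer (the hypothetical `train` helper) with plain gradient descent on inputs that occupy only 2 of 8 coordinates. The loss gradient vanishes along the unused input directions, so without weight decay those weight components keep their initial values, while with weight decay they shrink toward zero and the input subspace of the weights lines up with the feature subspace. The dimensions, learning rate, and decay strength are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 200, 8, 2

# Inputs vary only along the first k of d coordinates; the rest are exactly zero.
x = np.zeros((n, d))
x[:, :k] = rng.normal(size=(n, k))
true_w = rng.normal(size=(k, 1))
y = x[:, :k] @ true_w


def train(weight_decay, steps=2000, lr=0.05):
    """Gradient descent on mean squared error with an L2 weight-decay term."""
    w = np.ones((d, 1))  # fixed nonzero initialization
    for _ in range(steps):
        grad = x.T @ (x @ w - y) / n        # zero along the unused input directions
        w -= lr * (grad + weight_decay * w)
    return w


w_plain = train(weight_decay=0.0)
w_decay = train(weight_decay=0.1)

# Norm of the weight components that lie outside the input (feature) subspace.
off_plain = float(np.linalg.norm(w_plain[k:]))
off_decay = float(np.linalg.norm(w_decay[k:]))
print("off-subspace weight norm without decay:", off_plain)
print("off-subspace weight norm with decay:   ", off_decay)
```

Without decay the off-subspace components never move from their initial value of 1; with decay they contract by a factor of `1 - lr * weight_decay` at every step. The decay also shrinks the on-subspace weights slightly, which is the usual bias-for-structure trade-off.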

Key insights

Deep Neural Regression Collapse extends below the last layer, revealing structured feature learning throughout deep networks.

Method

The authors define four conditions (NRC1 to NRC4) to measure Deep Neural Regression Collapse across layers, using metrics such as the noise component of the features, Centered Kernel Alignment (CKA), principal angles between subspaces (PABS), and the MSE of a linear predictor.
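
Two of these metrics have standard textbook forms that fit in a few lines of NumPy. The sketch below uses the common linear variant of CKA and the QR-plus-SVD recipe for principal angles; whether the paper uses exactly these variants is an assumption, and `linear_cka` and `principal_angles` are hypothetical helper names.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between two representations whose rows are the same samples."""
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    return float(cross / (np.linalg.norm(x.T @ x, ord="fro") * np.linalg.norm(y.T @ y, ord="fro")))


def principal_angles(basis_a, basis_b):
    """Principal angles (radians) between the column spaces of two full-column-rank matrices."""
    q_a, _ = np.linalg.qr(basis_a)          # orthonormal basis for span(basis_a)
    q_b, _ = np.linalg.qr(basis_b)          # orthonormal basis for span(basis_b)
    cosines = np.linalg.svd(q_a.T @ q_b, compute_uv=False)
    return np.arccos(np.clip(cosines, 0.0, 1.0))


rng = np.random.default_rng(0)
features = rng.normal(size=(300, 5))
rotation, _ = np.linalg.qr(rng.normal(size=(5, 5)))     # random orthogonal matrix
rotated = 3.0 * features @ rotation                      # same geometry, rotated and rescaled
unrelated = rng.normal(size=(300, 5))

cka_same = linear_cka(features, rotated)
cka_unrelated = linear_cka(features, unrelated)

identity = np.eye(4)
span_a = identity[:, :2]
span_b = span_a @ np.array([[1.0, 1.0], [0.0, 1.0]])    # different basis, same subspace
span_c = identity[:, 2:]                                 # orthogonal complement of span_a
angles_same = principal_angles(span_a, span_b)
angles_orthogonal = principal_angles(span_a, span_c)

print("CKA, rotated copy:", cka_same, " CKA, unrelated:", cka_unrelated)
print("angles, same subspace:", angles_same, " orthogonal:", angles_orthogonal)
```

Linear CKA is invariant to rotations and uniform rescaling of either representation, which is why the rotated copy scores close to 1. For an NRC3-style check, one plausible pairing is the top right singular vectors of a layer's weight matrix against the top principal directions of that layer's input features; that pairing is inferred from the summary's wording rather than taken from the paper. SciPy's `scipy.linalg.subspace_angles` computes the same angles.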

Best for: Research Scientist, AI Researcher, AI Scientist, Deep Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.NE updates on arXiv.org.