Scaling Laws and Spectra of Shallow Neural Networks in the Feature Learning Regime

Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Mathematics & Computational Sciences · Depth: Expert, extended

Summary

This work systematically analyzes scaling laws for quadratic and diagonal neural networks in the feature learning regime, moving beyond linear models. Drawing on connections with matrix compressed sensing and LASSO, the authors derive a detailed phase diagram for the scaling exponents of the excess risk, showing crossovers and plateaus consistent with empirical observations. The analysis links these scaling regimes to the spectral properties of the trained weights, giving a theoretical account of the power-law tails that emerge in weight spectra and of their connection to generalization performance. Extensive numerical experiments further demonstrate that the approximate message passing (AMP) state evolution equations remain valid non-asymptotically, extending their predictive power beyond the traditional asymptotic assumptions.

Key takeaway

For research scientists designing or analyzing shallow neural networks, the derived phase diagram of excess risk and weight spectra is the key reference. Tune the regularization strength to the sample complexity and network architecture in order to navigate the distinct scaling regimes, avoid harmful overfitting, and achieve Bayes-optimal generalization. Consider pruning strategies as well, since they can also attain optimal error rates without manual regularization adjustments.

Key insights

A universal theoretical framework for neural scaling laws and weight spectra in shallow networks is established via sparse estimation.

Principles

Method

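The method maps training of shallow neural networks with L2 weight decay onto two sparse estimation problems: sparse vector estimation (LASSO) for diagonal networks and low-rank matrix estimation (matrix compressed sensing) for quadratic networks. It then analyzes both with Approximate Message Passing (AMP) and its state evolution equations.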
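A standard identity makes this mapping concrete. The sketch below is illustrative and is not taken from the paper: it assumes a diagonal network whose effective weight is an elementwise product w = u * v, and a matrix model whose effective weight is W = U V^T. Those parametrizations, the function name, and the grid settings are assumptions, since the summary does not spell them out. Under them, L2 weight decay on the factors acts as an L1 penalty on w (the LASSO penalty), and the balanced factorization of W attains the nuclear norm (the usual low-rank penalty).

```python
import numpy as np


def min_weight_decay_penalty(w, num_grid=200001):
    """Minimize 0.5 * (u**2 + v**2) subject to u * v = w, by grid search over u > 0.

    By the AM-GM inequality the exact minimum is |w|, attained at |u| = |v| = sqrt(|w|).
    """
    u = np.logspace(-3, 3, num_grid)  # candidate values of u (hypothetical grid)
    v = w / u                         # v is fixed by the constraint u * v = w
    return np.min(0.5 * (u**2 + v**2))


# Scalar case: the minimal weight decay on (u, v) matches the L1 penalty |w|.
for w in [0.25, -1.0, 3.0]:
    print(f"w = {w:+.2f}  min penalty = {min_weight_decay_penalty(w):.6f}  |w| = {abs(w):.6f}")

# Matrix case: the balanced factorization built from the SVD has weight decay
# equal to the nuclear norm (sum of singular values). This checks attainment
# only; it does not by itself prove that no other factorization does better.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))
A, s, Bt = np.linalg.svd(W, full_matrices=False)
U = A * np.sqrt(s)      # scale left singular vectors by sqrt of singular values
V = Bt.T * np.sqrt(s)   # scale right singular vectors the same way
balanced_penalty = 0.5 * (np.sum(U**2) + np.sum(V**2))
nuclear_norm = np.sum(s)
print(f"balanced penalty = {balanced_penalty:.6f}  nuclear norm = {nuclear_norm:.6f}")
```

The grid search only verifies the scalar identity numerically. The useful consequence is that a minimizer of the weight-decayed factorized loss can be studied through the corresponding L1 or nuclear-norm regularized problem, which is the kind of problem AMP and state evolution are designed to analyze.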

Best for: AI Scientist, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.