Deep Learning as Neural Low-Degree Filtering: A Spectral Theory of Hierarchical Feature Learning
Summary
Neural Low-Degree Filtering (Neural LoFi) is a theoretical framework that models learning in deep neural networks as an iterative spectral procedure, going beyond the lazy regime. It posits that hierarchical feature learning proceeds layer by layer, with each layer selecting the directions that have maximal low-degree correlation with the label. The result is a tractable surrogate for deep learning with a kernel-space interpretation. It explains how representations are selected, how concepts emerge at specific sample complexities, and how depth constructs new features through low-degree compositionality. Mechanistic experiments on fully connected and convolutional architectures show that Neural LoFi outperforms lazy random-feature baselines, recovers structured filters, and aligns with early gradient-descent feature discovery on real datasets such as CIFAR-10 and CelebA. The framework also provides a data-driven criterion for feature emergence based on the residual effective dimension of the current kernel.
Key takeaway
For AI Scientists and Research Scientists who want to understand or optimize the internal mechanisms of deep learning, Neural LoFi offers an interpretable lens. Its layer-wise spectral filtering and low-degree compositionality principles explain how complex features emerge and why depth is computationally advantageous. Consider its insights when designing more efficient architectures, particularly for initialization or feature pretraining, and use its emergence criterion to predict when new concepts become learnable as data increases.
Key insights
Neural LoFi explains deep learning's hierarchical feature acquisition via iterative, label-correlated spectral filtering.
Principles
- Feature learning involves a relevance–complexity trade-off.
- Depth enables low-degree compositionality by transforming high-degree input structure.
- Feature emergence is a phase transition driven by signal-to-noise ratios.
Method
Neural LoFi iteratively projects current representations onto leading eigen-directions of a label-weighted moment operator, then lifts these features via a nonlinear random feature map to form the next layer's input.
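The step described above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: the summary does not give the exact form of the label-weighted moment operator, so the sketch assumes a degree-2 version, (1/n) Σ yᵢ xᵢ xᵢᵀ with centered labels and standardized inputs, and a ReLU random-feature lift. The function name `lofi_layer` and the parameters `k` and `width` are hypothetical.

```python
import numpy as np

def lofi_layer(X, y, k, width, rng):
    """One illustrative filtering step: spectral projection, then random-feature lift.

    Assumed (not from the source): a degree-2 label-weighted moment operator
    and a ReLU random-feature map.
    """
    # Standardize inputs and center labels so the operator isolates
    # label-correlated second-order structure.
    Xc = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    yc = y - y.mean()
    n = Xc.shape[0]

    # Assumed label-weighted moment operator: (1/n) * sum_i y_i x_i x_i^T.
    M = (Xc * yc[:, None]).T @ Xc / n

    # Keep the k eigen-directions with the largest absolute eigenvalue.
    evals, evecs = np.linalg.eigh(M)
    top = np.argsort(-np.abs(evals))[:k]
    U = evecs[:, top]                      # shape (d, k)

    # Project the current representation onto the selected directions.
    Z = Xc @ U                             # shape (n, k)

    # Lift with a nonlinear random-feature map to form the next layer's input.
    W = rng.standard_normal((k, width)) / np.sqrt(k)
    b = rng.standard_normal(width)
    H = np.maximum(Z @ W + b, 0.0)         # shape (n, width)
    return H, U

# Toy usage: the label depends only on the product of the first two coordinates.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((5000, 10))
y_demo = X_demo[:, 0] * X_demo[:, 1]
H_demo, U_demo = lofi_layer(X_demo, y_demo, k=2, width=32, rng=rng)
# Squared mass of the selected directions on the two relevant coordinates
# (at most 2.0 when they span exactly those coordinates).
print(float((U_demo[:2, :] ** 2).sum()))
```

Stacking layers would mean feeding `H` back in as the next `X`. Whether the paper recomputes the operator on the lifted features in exactly this way is not stated in this summary.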
In practice
- Use Neural LoFi for efficient feature pretraining or initialization.
- Apply spectral pruning to control effective dimension in large networks.
- Diagnose feature emergence using the effective-dimension criterion.
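To make the effective-dimension diagnostic concrete, here is a hedged sketch. The summary does not define "residual effective dimension", so this assumes one plausible reading: the standard ridge effective dimension, tr(K (K + nλI)⁻¹), computed on the kernel spectrum after removing the `r` leading eigen-directions that are already captured. The function names and the parameters `r` and `lam` are hypothetical.

```python
import numpy as np

def effective_dimension(K, lam):
    """Ridge effective dimension: trace(K (K + n*lam*I)^{-1}) for a PSD kernel matrix K."""
    n = K.shape[0]
    # Clip tiny negative eigenvalues caused by floating-point error.
    evals = np.clip(np.linalg.eigvalsh(K), 0.0, None)
    return float(np.sum(evals / (evals + n * lam)))

def residual_effective_dimension(K, r, lam):
    """Effective dimension of the spectrum left after dropping the top-r eigenvalues.

    Assumed definition, used here only for illustration.
    """
    n = K.shape[0]
    evals = np.clip(np.linalg.eigvalsh(K), 0.0, None)
    # eigvalsh returns ascending order, so the top-r eigenvalues are the last r.
    rest = evals[: n - r]
    return float(np.sum(rest / (rest + n * lam)))

# Toy usage on a small diagonal kernel matrix.
K_demo = np.diag([10.0, 5.0, 1.0, 1.0])
print(effective_dimension(K_demo, lam=0.25))
print(residual_effective_dimension(K_demo, r=2, lam=0.25))
```

The same eigendecomposition could also drive spectral pruning, by keeping only the directions whose eigenvalues exceed a chosen threshold. The threshold itself would be a design choice that this summary does not specify.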
Topics
- Neural Low-Degree Filtering
- Spectral Theory
- Hierarchical Feature Learning
- Representation Learning
- Feature Emergence
Best for: AI Scientist, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.