Representation Costs in Data Science: Foundations and the Quasi-Banach Spaces of Deep Neural Networks

· Source: Machine Learning · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

The paper introduces a general framework that analyzes parametric data-fitting methods through the regularizers imposed on their parameters. It defines a representation cost for an arbitrary parametric model and identifies the native function space that this cost induces, unifying existing function-space views. Within this framework, the authors prove representer theorems for parametric methods on their native spaces and rigorously connect the parametric and nonparametric descriptions when the model is sufficiently overparameterized. The framework covers classical methods such as kernel methods (reproducing kernel Hilbert spaces) and wavelets (Besov spaces). For depth-$L$ feedforward ReLU networks, it shows that the induced native spaces are $p$-normable quasi-Banach spaces with $p = 2/L$. Because $p < 1$ whenever $L > 2$, the inductive bias of networks of depth $L > 2$ cannot be captured by a norm.
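For readers less familiar with the terminology, the block below recalls the standard textbook definitions of a quasi-norm and a $p$-norm and evaluates $p = 2/L$ for a few depths. The symbols $K$, $f$, and $g$ are notation chosen here for illustration and are not taken from the article.

```latex
% Standard definitions; K, f, g are notation assumed for this sketch.
% Quasi-norm: the triangle inequality holds only up to a constant K >= 1.
\[
  \|f + g\| \le K \bigl( \|f\| + \|g\| \bigr), \qquad K \ge 1 .
\]
% p-norm (0 < p <= 1): the p-th power of the quasi-norm is subadditive.
\[
  \|f + g\|^{p} \le \|f\|^{p} + \|g\|^{p}, \qquad 0 < p \le 1 .
\]
% Exponent p = 2/L stated in the summary, for small depths L:
\[
  L = 2 \;\Rightarrow\; p = 1, \qquad
  L = 3 \;\Rightarrow\; p = \tfrac{2}{3}, \qquad
  L = 4 \;\Rightarrow\; p = \tfrac{1}{2} .
\]
```

A $p$-norm with $p = 1$ is an ordinary norm, which matches the summary's statement that the departure from normed spaces begins at depths $L > 2$.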

Key takeaway

For AI scientists working on deep learning theory, this framework offers a rigorous foundation for understanding inductive bias. Network depth $L$ directly determines the native function space: for ReLU networks, $p = 2/L$ means that no norm captures the inductive bias once $L > 2$. Keep this in mind when designing architectures or interpreting how a model generalizes.
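To get a feel for what losing the triangle inequality means, here is a minimal numerical sketch. It uses the finite-dimensional $\ell^p$ quasi-norm as a toy stand-in for a $p$-normed space; this is an assumption made purely for illustration and is not the native space of a ReLU network described in the article. The function names `lp_quasinorm` and `check_depth` are hypothetical.

```python
import numpy as np


def lp_quasinorm(x, p):
    """Return (sum_i |x_i|^p)^(1/p); this is a norm only when p >= 1."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(np.abs(x) ** p) ** (1.0 / p))


def check_depth(L):
    """Test the triangle and p-triangle inequalities for p = 2/L (L >= 2).

    Toy example only: two unit basis vectors in R^2 under the l^p quasi-norm.
    """
    p = 2.0 / L
    e1 = np.array([1.0, 0.0])
    e2 = np.array([0.0, 1.0])
    tol = 1e-12  # absorbs floating-point rounding

    lhs = lp_quasinorm(e1 + e2, p)
    n1, n2 = lp_quasinorm(e1, p), lp_quasinorm(e2, p)

    triangle_holds = lhs <= n1 + n2 + tol
    p_triangle_holds = lhs ** p <= n1 ** p + n2 ** p + tol
    return p, triangle_holds, p_triangle_holds


for depth in (2, 3, 4):
    p, tri, p_tri = check_depth(depth)
    print(f"L={depth}: p={p:.3f}, triangle={tri}, p-triangle={p_tri}")
```

In this toy setting the ordinary triangle inequality holds at $L = 2$ and fails at $L = 3$ and $L = 4$, while the weaker $p$-triangle inequality holds at every depth tested.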

Key insights

The framework defines representation costs and native function spaces for arbitrary parametric models, and uses them to characterize the inductive bias of deep neural networks.

Principles

Method

The framework analyzes parametric data-fitting methods via parameter-space regularizers to define representation costs and identify induced native function spaces, unifying existing views.
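One common way to formalize a representation cost is sketched below. The notation ($\Theta$ for the parameter set, $f_\theta$ for the parametric model, $C$ for the parameter-space regularizer, $R$ for the cost, $\mathcal{F}$ for the native space) is assumed here for illustration, and the article's exact definition may differ in its technical details.

```latex
% A sketch under assumed notation, not the article's verbatim definition.
% Representation cost of a function f: the cheapest parameters realizing f.
\[
  R(f) \;=\; \inf \bigl\{\, C(\theta) \;:\; \theta \in \Theta,\ f_\theta = f \,\bigr\},
  \qquad \inf \emptyset = +\infty .
\]
% Induced native space: the functions with finite representation cost.
\[
  \mathcal{F} \;=\; \bigl\{\, f \;:\; R(f) < \infty \,\bigr\} .
\]
```

Under this reading, regularizing the parameters with $C$ amounts to regularizing the fitted function with $R$, which is what lets the parametric method be studied as a problem over a function space.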

Topics

Best for: Research Scientist, AI Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.