A Regularization-Sharpness Tradeoff for Linear Interpolators

2026-02-16 · Source: stat.ML updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Mathematics & Computational Sciences · Depth: Expert, extended

Summary

The paper introduces a "regularization-sharpness tradeoff" for overparameterized linear regression by extending the Interpolating Information Criterion (IIC) to interpolators with $\ell^{p}$ penalties, $p \geq 1$. Because the classical bias-variance tradeoff breaks down in overparameterized settings, new model selection principles are needed. The framework decomposes the IIC into two terms: a regularization term, which quantifies the alignment between the regularizer and the interpolator, and a geometric sharpness term, which measures the effect of local perturbations on the interpolating manifold. The study gives a general expression for the IIC under $\ell^{p}$ regularizers with $p \geq 2$, then extends it to the LASSO interpolator, whose $\ell^{1}$ regularizer induces stronger sparsity. Experiments on real-world datasets with random Fourier features and polynomial features support the theory: the two tradeoff terms distinguish well-performing linear interpolators from weaker ones, and $\ell^1$ regularization can produce a more pronounced decrease in the sharpness term.

Key takeaway

Research scientists working with overparameterized linear models should consider the Interpolating Information Criterion (IIC) and its regularization-sharpness tradeoff for model selection. The framework is built for the regime where models interpolate the training data exactly and traditional bias-variance reasoning breaks down. $\ell^1$ regularization deserves particular attention: it can yield a more favorable tradeoff through a more pronounced decrease in the sharpness term.

Key insights

A regularization-sharpness tradeoff replaces bias-variance in overparameterized linear models, decomposing model selection into alignment and local perturbation effects.

Principles

Classical information criteria fail in overparameterized settings.
IIC decomposes into regularization and sharpness terms.
Sparsity-inducing $\ell^1$ regularization can significantly reduce sharpness.

Method

The Interpolating Information Criterion (IIC) is decomposed into regularization and sharpness terms. Bayesian duality is used to approximate marginal likelihoods for $\ell^p$ regularizers ($p \geq 2$) and $\ell^1$ (LASSO) interpolators, with empirical validation on datasets using random Fourier features and polynomials.
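
For orientation, the interpolators discussed here can be written in the standard minimum-norm form. This is a common textbook formulation, not a quotation from the paper, and the symbols $X$, $y$, $\theta$, $n$, and $d$ are assumed for illustration:

$$
\hat{\theta}_p \in \arg\min_{\theta \in \mathbb{R}^{d}} \|\theta\|_{p} \quad \text{subject to} \quad X\theta = y, \qquad X \in \mathbb{R}^{n \times d},\ d > n.
$$

When $X$ has full row rank, the constraint set $\{\theta : X\theta = y\}$ is an affine subspace of dimension $d - n$, which is presumably the interpolating manifold the sharpness term refers to. For $p = 2$ the solution is $X^{+}y$, with $X^{+}$ the Moore-Penrose pseudoinverse; for $p = 1$ the problem is basis pursuit, the vanishing-penalty limit of the LASSO.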

In practice

Use IIC for model selection in overparameterized linear regression.
Consider $\ell^1$ regularization for potentially better performance.
Analyze regularization and sharpness terms to understand model generalization.
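
As a starting point for the practice items above, the following is a minimal sketch of how the two kinds of interpolator might be constructed on random Fourier features. It does not compute the IIC or its regularization and sharpness terms, whose formulas are not given in this summary. The function names, the synthetic data, and the feature map parameters are all assumptions made for illustration, and the $\ell^1$ interpolator is obtained here by a generic linear program rather than by any method attributed to the paper.

```python
import numpy as np
from scipy.optimize import linprog


def random_fourier_features(x, n_features, rng, bandwidth=1.0):
    """Map inputs of shape (n, d) to random Fourier features of shape (n, n_features)."""
    d = x.shape[1]
    w = rng.normal(scale=1.0 / bandwidth, size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(x @ w + b)


def min_l2_interpolator(features, y):
    """Minimum l2-norm solution of features @ theta = y, via the pseudoinverse."""
    return np.linalg.pinv(features) @ y


def min_l1_interpolator(features, y):
    """Minimum l1-norm solution of features @ theta = y (basis pursuit as an LP).

    Write theta = u - v with u, v >= 0 and minimise sum(u) + sum(v).
    """
    m = features.shape[1]
    cost = np.ones(2 * m)
    a_eq = np.hstack([features, -features])
    res = linprog(cost, A_eq=a_eq, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:m] - res.x[m:]


# Synthetic data (hypothetical, not from the paper).
rng = np.random.default_rng(0)
n_samples, n_features = 20, 200  # overparameterized: more features than samples
x = rng.normal(size=(n_samples, 5))
y = np.sin(x[:, 0]) + 0.1 * rng.normal(size=n_samples)

phi = random_fourier_features(x, n_features, rng)
theta_l2 = min_l2_interpolator(phi, y)
theta_l1 = min_l1_interpolator(phi, y)

# Both vectors should fit the training data; they differ in norm and sparsity.
for name, theta in [("l2", theta_l2), ("l1", theta_l1)]:
    residual = np.max(np.abs(phi @ theta - y))
    nonzeros = int(np.sum(np.abs(theta) > 1e-8))
    print(f"{name}: max residual={residual:.2e}, nonzero coefficients={nonzeros}")
```

Both solutions lie on the same set of exact interpolators, so any comparison between them has to come from a criterion other than training error, which is the role the IIC plays in the paper.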

Topics

Regularization-Sharpness Tradeoff
Interpolating Information Criterion
Overparameterized Models
$\ell^p$ Regularization
LASSO Interpolators

Best for: Research Scientist, AI Researcher, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.