Separating Geometry from Probability in the Analysis of Generalization

2026-04-22 · Source: stat.ML updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Mathematics & Computational Sciences · Depth: Expert, extended

Summary

This paper introduces a non-stochastic theory of generalization in machine learning, departing from traditional probabilistic assumptions like independent and identically distributed (i.i.d.) data. It reinterprets generalization through the lens of sensitivity analysis of optimization problems, where data acts as a perturbation parameter. The authors derive deterministic generalization bounds as variational principles, quantifying the relationship between in-sample and out-of-sample evaluations via an error term that measures data dissimilarity. This framework allows statistical assumptions to be applied ex post to characterize when this error term is small. The work explores these principles across various machine learning contexts, including minimum-norm interpolation, hard-margin support vector machines, and scenarios with quadratic growth assumptions, demonstrating how deterministic bounds can recover optimal probabilistic generalization bounds.

Key takeaway

For research scientists developing or analyzing machine learning models, this work suggests moving part of the analysis away from purely probabilistic generalization theory. Consider adding deterministic sensitivity analysis and variational principles to your model evaluation toolkit. Because the bounds hold deterministically for the data at hand, the geometric part of the argument can be checked without distributional assumptions; statistical assumptions enter only afterwards, to show when the data-dissimilarity error term is small. Decoupling geometry from probability in this way can potentially yield tighter, more robust generalization bounds.

Key insights

Generalization can be analyzed deterministically via sensitivity of optimization solutions to data perturbations.

Principles

Generalization bounds relate in-sample and out-of-sample evaluations.
Data dissimilarity quantifies the error term in generalization bounds.
Probabilistic assumptions can characterize error term magnitude ex post.

Method

The method treats the data as a perturbation parameter in the optimization problem that defines the learned model, and analyzes how the optimality conditions respond to that perturbation. This yields variational principles that link in-sample and out-of-sample performance through deterministic bounds.
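
As a minimal illustration of this kind of sensitivity analysis (not the paper's own bound or notation), the sketch below uses minimum-norm interpolation, one of the settings the summary mentions. It assumes NumPy, a fixed design matrix `X` with full row rank, and a hypothetical label perturbation `delta`. Under those assumptions the minimum-norm solution `w = pinv(X) @ y` moves by at most `||delta|| / sigma_min(X)`, a standard linear-algebra fact that holds without any probabilistic assumption on the data.

```python
import numpy as np


def min_norm_interpolant(X, y):
    """Minimum-norm solution of X w = y via the Moore-Penrose pseudoinverse."""
    return np.linalg.pinv(X) @ y


rng = np.random.default_rng(0)
n, d = 20, 50                       # overparameterized: more features than samples
X = rng.normal(size=(n, d))         # illustrative data, not from the paper
y = rng.normal(size=n)
delta = 0.01 * rng.normal(size=n)   # hypothetical perturbation of the labels

w = min_norm_interpolant(X, y)
w_pert = min_norm_interpolant(X, y + delta)

# Since w_pert - w = pinv(X) @ delta and ||pinv(X)||_2 = 1 / sigma_min(X),
# the change in the solution is bounded deterministically.
sigma_min = np.linalg.svd(X, compute_uv=False).min()
change = np.linalg.norm(w_pert - w)
bound = np.linalg.norm(delta) / sigma_min

residual = np.linalg.norm(X @ w - y)
print(f"interpolation residual: {residual:.2e}")
print(f"solution change {change:.4e} <= bound {bound:.4e}")
```

The random draws are only a convenient way to generate an example; the inequality itself is a statement about the fixed arrays `X`, `y`, and `delta`.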

In practice

Apply sensitivity analysis to understand model stability under data changes.
Use deterministic bounds to evaluate out-of-sample performance.
Characterize data dissimilarity to predict generalization error.
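
The three steps above can be sketched on a toy problem. The example below is an assumption-laden illustration, not the construction used in the paper: it takes a linear predictor with absolute loss, pairs in-sample and out-of-sample points by index, and uses the elementary inequality `| |w·x' - y'| - |w·x - y| | <= ||w|| ||x' - x|| + |y' - y|` to bound the gap between the two average losses by a data-dissimilarity term. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def mean_abs_loss(w, X, y):
    """Average absolute loss of the linear predictor x -> w @ x."""
    return np.mean(np.abs(X @ w - y))


def paired_dissimilarity(w, X_in, y_in, X_out, y_out):
    """Average of ||w|| * ||x_i - x_i'|| + |y_i - y_i'| over index-paired samples."""
    dx = np.linalg.norm(X_in - X_out, axis=1)
    dy = np.abs(y_in - y_out)
    return np.mean(np.linalg.norm(w) * dx + dy)


rng = np.random.default_rng(1)
n, d = 100, 5
w_true = rng.normal(size=d)

# Illustrative in-sample data and a slightly shifted out-of-sample set.
X_in = rng.normal(size=(n, d))
y_in = X_in @ w_true + 0.1 * rng.normal(size=n)
X_out = X_in + 0.05 * rng.normal(size=(n, d))
y_out = X_out @ w_true + 0.1 * rng.normal(size=n)

# Fit on the in-sample data only (ordinary least squares, chosen for simplicity).
w_hat, *_ = np.linalg.lstsq(X_in, y_in, rcond=None)

gap = abs(mean_abs_loss(w_hat, X_out, y_out) - mean_abs_loss(w_hat, X_in, y_in))
err = paired_dissimilarity(w_hat, X_in, y_in, X_out, y_out)

print(f"out-of-sample vs in-sample gap: {gap:.4f}")
print(f"data-dissimilarity term:        {err:.4f}")
```

The bound `gap <= err` holds for any pairing of the two datasets, so a tighter dissimilarity term could be obtained by optimizing over pairings. A statistical assumption would then be used only afterwards, to argue that this term is small.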

Topics

Non-Stochastic Generalization
Sensitivity Analysis
Parametric Programming
Variational Principles
Minimum-Norm Interpolation

Best for: Research Scientist, AI Scientist, AI Student

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.