Flatness and Generalization: Learning Multi-Index Models with Homogeneous Neural Networks

Source: Takara TLDR - Daily AI Papers · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

The paper "Flatness and Generalization: Learning Multi-Index Models with Homogeneous Neural Networks" revisits the heuristic that "flat interpolators generalize well," which has been challenged. Dinh et al. (2017) showed that network symmetries can alter flatness without changing the loss, which made the heuristic seem vacuous. This research establishes a direct link between flatness and generalization for learning unknown multi-index models with 2-layer non-convex homogeneous neural networks, focusing specifically on the "flattest" interpolators. It demonstrates that there exists a class of non-generalizing interpolators whose flatness measure cannot be minimized by applying symmetries. Furthermore, for data generated by a sum of single-index models with low approximation error and label noise, any flattest interpolator achieves small population loss and therefore generalizes. The connection holds across a range of activations and realistic data distributions.
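The symmetry argument attributed to Dinh et al. (2017) can be illustrated with a minimal NumPy sketch. It assumes a bias-free 2-layer ReLU network and uses the trace of the squared-loss Hessian as the flatness measure; both choices are illustrative assumptions, and the paper's own network class and flatness definition may differ. All function names are hypothetical.

```python
import numpy as np

def forward(W, a, X):
    """Two-layer ReLU network f(x) = sum_j a_j * relu(w_j . x) (no biases)."""
    return np.maximum(X @ W.T, 0.0) @ a

def hessian_trace_at_interpolator(W, a, X):
    """Trace of the squared-loss Hessian when all residuals are zero.

    For loss (1/2n) * sum_i (f(x_i) - y_i)^2 with zero residuals, the Hessian
    reduces to its Gauss-Newton term, whose trace is the mean squared norm of
    the gradient of the network output with respect to (W, a).
    """
    pre = X @ W.T                                      # (n, m) pre-activations
    act = np.maximum(pre, 0.0)                         # df/da_j
    gate = (pre > 0).astype(float)                     # ReLU derivative
    sq_norm_x = np.sum(X ** 2, axis=1, keepdims=True)  # (n, 1)
    grad_a_sq = np.sum(act ** 2, axis=1)
    grad_W_sq = np.sum((gate * a) ** 2 * sq_norm_x, axis=1)
    return float(np.mean(grad_a_sq + grad_W_sq))

def rescale(W, a, c):
    """Positive rescaling symmetry: (W, a) -> (c * W, a / c), with c > 0."""
    return c * W, a / c

rng = np.random.default_rng(0)
n, d, m = 50, 5, 8                  # samples, input dimension, hidden width
X = rng.normal(size=(n, d))
W = rng.normal(size=(m, d))
a = rng.normal(size=m)
y = forward(W, a, X)                # labels chosen so (W, a) interpolates exactly

W2, a2 = rescale(W, a, 10.0)
same_function = np.allclose(forward(W2, a2, X), y)   # the loss stays at zero
trace_before = hessian_trace_at_interpolator(W, a, X)
trace_after = hessian_trace_at_interpolator(W2, a2, X)
print(same_function, trace_before, trace_after)
```

Both parameter settings interpolate the same labels, yet the Hessian trace differs between them. This is why a flatness-based statement has to single out a representative that symmetries cannot move, such as the flattest interpolator discussed in the summary.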

Key takeaway

For AI scientists evaluating neural network generalization, this research shows that "flattest" interpolators in 2-layer homogeneous networks reliably generalize on multi-index models under specific data conditions. Consider the flattest-interpolator concept when analyzing model performance, especially in settings with low noise and low approximation error. The result refines the understanding of flatness as a predictor of generalization.

Key insights

Flattest interpolators in 2-layer homogeneous neural networks generalize well on multi-index models, even though network symmetries can change flatness without changing the loss.

Best for: Research Scientist, AI Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.