Every Feedforward Neural Network Definable in an o-Minimal Structure Has Finite Sample Complexity

· Source: stat.ML updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Mathematics & Computational Sciences · Depth: Expert, extended

Summary

A new study demonstrates that a broad class of feedforward neural networks, including standard MLPs, CNNs, GNNs, and transformers with fixed sequence length, possess finite sample complexity in the agnostic PAC (Probably Approximately Correct) learning model. This property holds even with unbounded parameters, challenging previous assumptions that required architecture-specific VC arguments or parameter constraints. The authors show that if a network's layers are definable within an o-minimal structure (a mathematical framework for "tame" sets), then distribution-free learnability is an automatic consequence. This unified framework covers common operations like linear projections, residual connections, attention mechanisms, pooling, and normalization layers. The findings suggest that finite-sample PAC learnability should be considered a baseline for these architectures, shifting the focus of architectural comparison towards inductive biases, symmetries, scalability, and optimization behavior.

Key takeaway

For AI scientists and research scientists evaluating neural network architectures, this work implies that the fundamental learnability (finite sample complexity) of fixed, finite feedforward models is nearly universal, provided their operations are "definable" in an o-minimal structure. You should therefore shift your focus from proving basic learnability to analyzing more nuanced architectural differences like inductive biases, geometric priors, scalability, and optimization characteristics, as these are the true differentiators for practical applications.

Key insights

O-minimal definability ensures finite PAC sample complexity for a wide range of feedforward neural networks.

Principles

Method

The method involves proving that definable gates imply definable neural networks, which then allows for the application of uniform cell decomposition to infer finite VC and pseudo-dimension bounds, leading to PAC learnability guarantees.

In practice

Topics

Best for: AI Scientist, Research Scientist

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.