Dense Neural Networks are not Universal Approximators

2026-04-17 · Source: stat.ML updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Mathematics & Computational Sciences · Depth: Expert, extended

Summary

This research shows that dense neural networks, often assumed to be universal approximators, have intrinsic limits on their expressive power. Classical universal approximation theorems show that, when weights are unrestricted, sufficiently large architectures can approximate any continuous function on a compact domain. This study shows that dense ReLU networks lose that universality under natural constraints on the weights and on the input and output dimensions. The argument takes a model compression approach: it interprets feedforward networks as message passing graph neural networks and combines this view with the weak regularity lemma. For sufficiently large input dimension, dense networks of fixed depth cannot reach universal approximation by increasing width alone, because their expressive power saturates at a fixed resolution. The result points to sparse connectivity as a necessary ingredient for universality.

Key takeaway

AI scientists and research scientists designing neural network architectures should note that widening a dense ReLU network of fixed depth does not, by itself, yield universal approximation when the input dimension is large and the weights are constrained. The research indicates that sparse connectivity is necessary for universality, so connectivity structure, and not only width, deserves attention when the goal is to overcome the expressive limits of dense architectures.

Key insights

Dense ReLU networks are not universal approximators under natural constraints on weights and input/output dimensions.

Principles

Under these constraints, the expressive power of fixed-depth dense networks saturates as width grows.
Sparse connectivity is necessary for universal approximation in this setting.

Method

The study uses a model compression approach, treating dense ReLU networks as message passing graph neural networks and applying the weak regularity lemma to bound their effective size and approximation capabilities.
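
The two ingredients named above can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's construction: the function names (dense_relu_layer, message_passing_layer, block_average) are hypothetical, and the equal-size partition is chosen arbitrarily. The first part shows that one dense ReLU layer computes the same values as one round of message passing on a complete bipartite graph whose edge weights are the entries of the weight matrix. The second part replaces each block of the weight matrix by its mean, which is the kind of step-function approximation the weak regularity lemma is about. The lemma concerns the existence of a partition whose block average is close in cut norm; this sketch makes no claim about the error of the partition it uses.

```python
import numpy as np


def dense_relu_layer(W, b, x):
    """Standard dense layer: ReLU(W x + b)."""
    return np.maximum(W @ x + b, 0.0)


def message_passing_layer(W, b, x):
    """Same layer written as message passing on a complete bipartite graph.

    Input neuron j sends the message W[i, j] * x[j] along edge (j, i);
    output neuron i sums its incoming messages, adds its bias, applies ReLU.
    """
    n_out, n_in = W.shape
    out = np.empty(n_out)
    for i in range(n_out):
        aggregated = sum(W[i, j] * x[j] for j in range(n_in))
        out[i] = max(aggregated + b[i], 0.0)
    return out


def block_average(W, row_blocks, col_blocks):
    """Replace every (row block, column block) submatrix of W by its mean.

    The partitions are passed in by the caller; here they are an
    illustrative assumption, not a partition produced by the lemma.
    """
    W_avg = np.empty_like(W, dtype=float)
    for rows in row_blocks:
        for cols in col_blocks:
            idx = np.ix_(rows, cols)
            W_avg[idx] = W[idx].mean()
    return W_avg


# Small demonstration with random bounded weights.
rng = np.random.default_rng(0)
W = rng.uniform(-1.0, 1.0, size=(6, 8))
b = rng.uniform(-1.0, 1.0, size=6)
x = rng.uniform(-1.0, 1.0, size=8)

# Arbitrary equal-size partitions of output and input neurons into 2 blocks.
row_blocks = np.array_split(np.arange(W.shape[0]), 2)
col_blocks = np.array_split(np.arange(W.shape[1]), 2)
W_avg = block_average(W, row_blocks, col_blocks)

print("max |dense - message passing|:",
      np.abs(dense_relu_layer(W, b, x) - message_passing_layer(W, b, x)).max())
print("rank of block-averaged weights:", np.linalg.matrix_rank(W_avg))
```

The block-averaged matrix is described by one number per pair of blocks, regardless of how many neurons each block contains, which gives an intuition for why a compressed description need not grow with width.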

In practice

Consider sparse architectures for complex function approximation.
Be aware of the limitations of dense networks on high-dimensional inputs.

Topics

Dense Neural Networks
Universal Approximation
Model Compression
Weak Regularity Lemma
Graph Neural Networks

Best for: AI Scientist, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.