Theory of learning of high-dimensional controlled non-linear dynamical systems (I): models and methods
Summary
Neural ordinary differential equations (Neural ODEs) are analyzed within a theoretical framework that addresses their dual dynamical nature: inference dynamics and training dynamics. This work introduces a class of solvable models for high-dimensional controlled non-linear dynamical systems, trained via online stochastic gradient descent (SGD). The authors apply dynamical mean field theory (DMFT) to solve the training dynamics in the high-dimensional limit, deriving learning curves and comparing results with numerical simulations. The framework is presented as a unifying approach for understanding various settings, including multi-layer neural networks (e.g., ResNets), autoregressive models, and generative models, offering precise characterization of feature learning and parameter optimization.
Key takeaway
For AI scientists and research scientists developing or analyzing high-dimensional neural networks, this work provides a theoretical framework in which training dynamics can be solved in the high-dimensional limit. Consider applying dynamical mean field theory (DMFT) to characterize the coupled inference and training dynamics of Neural ODEs, especially for architectures such as ResNets or autoregressive models. The approach offers a route to deriving learning curves and predicting teacher-student alignment analytically, rather than relying on empirical observation alone.
Key insights
In the high-dimensional limit, dynamical mean field theory (DMFT) solves the coupled inference and training dynamics of Neural ODEs.
Principles
- Neural ODEs exhibit dual inference and training dynamics.
- High-dimensional systems can be analyzed via DMFT.
- Teacher-student setups model learning alignment.
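The principles above can be made concrete with a toy teacher-student experiment. The sketch below is an illustration under stated assumptions, not the model from the paper: the Euler-discretized dynamics, the tanh non-linearity, the 1/sqrt(N) coupling scale, the terminal-state loss, and all function names (`rollout`, `loss_and_grad`, `online_sgd`) are hypothetical choices. It shows the two nested dynamics: the inner loop over `k` is inference, and the outer loop over fresh samples is online SGD training.

```python
import numpy as np

def rollout(W, u, dt, K):
    """Inference dynamics (assumed form): x_{k+1} = x_k + dt * (W tanh(x_k)/sqrt(N) + u)."""
    N = u.shape[0]
    xs = [np.zeros(N)]  # start from rest; the control u drives the system
    for _ in range(K):
        x = xs[-1]
        xs.append(x + dt * (W @ np.tanh(x) / np.sqrt(N) + u))
    return xs

def loss_and_grad(W, W_star, u, dt, K):
    """Terminal-state loss against the teacher, with its gradient by a backward (adjoint) pass."""
    N = u.shape[0]
    xs = rollout(W, u, dt, K)                # student trajectory
    target = rollout(W_star, u, dt, K)[-1]   # teacher's final state for the same control
    err = xs[-1] - target
    loss = 0.5 * err @ err / N
    p = err / N                              # adjoint at the final step: d(loss)/d(x_K)
    grad = np.zeros_like(W)
    for k in range(K - 1, -1, -1):
        phi = np.tanh(xs[k])
        grad += (dt / np.sqrt(N)) * np.outer(p, phi)             # uses adjoint at step k+1
        p = p + (dt / np.sqrt(N)) * (1.0 - phi**2) * (W.T @ p)   # propagate adjoint to step k
    return loss, grad

def online_sgd(N=50, steps=200, lr=1.0, dt=0.1, K=10, seed=0):
    """Training dynamics: one fresh random control per SGD step (online setting)."""
    rng = np.random.default_rng(seed)
    W_star = rng.standard_normal((N, N))  # teacher couplings (hypothetical choice)
    W = rng.standard_normal((N, N))       # student initialization
    losses = []
    for _ in range(steps):
        u = rng.standard_normal(N)
        loss, grad = loss_and_grad(W, W_star, u, dt, K)
        W = W - lr * grad
        losses.append(loss)
    return np.array(losses), W, W_star
```

Calling `online_sgd()` returns the per-step loss, which serves as a numerical learning curve, along with the student and teacher matrices, whose normalized overlap is one possible measure of alignment. The learning rate and system size here are arbitrary and would need tuning before any comparison with theory.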
Method
The method casts online SGD in a Lagrangian formulation, derives the corresponding Euler-Lagrange equations, and solves the training dynamics with dynamical mean field theory (DMFT). A path integral representation reduces the high-dimensional system to self-consistent stochastic processes, from which the learning curves are obtained.
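To illustrate what a Lagrangian formulation of gradient-based training looks like, here is the generic adjoint construction for a controlled ODE with a terminal loss. This is a textbook sketch, not the paper's exact Lagrangian: the symbols $f$, $u$, $\theta$, $p$, $\ell$, and $\eta$ are assumptions introduced for illustration, and the paper's formulation may differ in its loss, constraints, and scaling.

```latex
\begin{align}
  \dot{x}(t) &= f\big(x(t), u(t); \theta\big), \qquad x(0) = x_0
    && \text{(inference dynamics)} \\
  \mathcal{L} &= \ell\big(x(T)\big)
    + \int_0^T p(t)^\top \Big[ f\big(x(t), u(t); \theta\big) - \dot{x}(t) \Big] \, dt
    && \text{(Lagrangian with multiplier } p\text{)} \\
  \dot{p}(t) &= -\left( \frac{\partial f}{\partial x} \right)^{\!\top} p(t),
    \qquad p(T) = \frac{\partial \ell}{\partial x(T)}
    && \text{(Euler-Lagrange equation for } x\text{)} \\
  \nabla_\theta \mathcal{L} &= \int_0^T \left( \frac{\partial f}{\partial \theta} \right)^{\!\top} p(t) \, dt,
    \qquad \theta \leftarrow \theta - \eta \, \nabla_\theta \mathcal{L}
    && \text{(SGD step on a fresh sample)}
\end{align}
```

Stationarity with respect to $p$ recovers the forward dynamics, while stationarity with respect to $x$ gives the backward equation for $p$. Each SGD step therefore requires one forward and one backward integration, which is the coupling between inference and training that the DMFT analysis addresses.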
In practice
- Apply DMFT to analyze deep network training.
- Extend framework to autoregressive models.
- Study generative models via teacher-student setup.
Topics
- Neural Ordinary Differential Equations
- Dynamical Mean Field Theory
- High-Dimensional Systems
- Stochastic Gradient Descent
- Training Dynamics
- Generative Models
Best for: AI Scientist, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.