Zero-Shot Size Transfer for Neural ODEs on Sparse Random Graphs: Graphon Limits and Adjoint Convergence
Summary
A new study develops a quantitative theory of zero-shot size transfer for Graph Neural Differential Equations (GNDEs) on sparse random graphs. GNDEs model continuous-time graph dynamics with Neural ODEs, and because their filters are local, a model trained on small graphs can in principle be deployed on larger, similar graphs without retraining. The study introduces Graphon Neural Differential Equations (Graphon-NDEs) as the infinite-node limits of GNDEs and proves that they are well-posed. It shows that GNDE solutions converge trajectory-wise to Graphon-NDE solutions at a rate of O((α_n n)^{-1/2}) for an n-node random graph with sparsity α_n, and it gives uniform-in-time convergence bounds for the adjoint systems that govern gradients. It also analyzes discretize-then-optimize (DTO) and optimize-then-discretize (OTD) training under explicit Euler discretization with M steps, showing that the two are asymptotically consistent, with hidden-state and parameter-gradient discrepancies of orders O(1/M) and O(1/M^2), respectively. Experiments on HSBM and tent graphons, together with zero-shot transfer across four graphon classes, validate the theoretical rates and deployment accuracy.
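To make the notation concrete, the block below writes out one common way to pose a GNDE and its graphon limit. It is a sketch under stated assumptions, not the study's exact formulation: the single-layer vector field with activation σ and parameters θ_0, θ_1, the normalization S_n = A_n / (n α_n), and the unspecified norm in the last line are all illustrative choices. Only the rate O((α_n n)^{-1/2}) comes from the summary above.

```latex
% Assumed GNDE on an n-node graph with adjacency A_n and sparsity \alpha_n
\dot{X}_n(t) = \sigma\!\bigl(\theta_0 X_n(t) + \theta_1 S_n X_n(t)\bigr),
\qquad S_n = \frac{A_n}{n\,\alpha_n}

% Assumed Graphon-NDE: the matrix product becomes an integral against the graphon W
\partial_t X(u,t) = \sigma\!\Bigl(\theta_0 X(u,t) + \theta_1 \int_0^1 W(u,v)\,X(v,t)\,dv\Bigr),
\qquad u \in [0,1]

% Rate stated in the summary (norm and mode of convergence as defined in the original paper)
\lVert X_n - X \rVert = O\!\bigl((\alpha_n n)^{-1/2}\bigr)

% Worked instance: with \alpha_n = n^{-1/2}, \alpha_n n = n^{1/2}, so the rate is O(n^{-1/4})
```

The bound tends to zero only when α_n n grows without bound, that is, when the expected degree increases with n.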
Key takeaway
For research scientists developing Graph Neural Differential Equations, this work gives theoretical and empirical support for zero-shot size transfer: a GNDE trained on smaller graphs can be deployed on significantly larger, similar graphs without retraining, which reduces computational cost and speeds up deployment. Account for the convergence rate O((α_n n)^{-1/2}) and for DTO/OTD consistency when designing training and deployment strategies for scalable graph models.
Key insights
Graph Neural Differential Equations (GNDEs) exhibit zero-shot size transfer, enabling training on small graphs for accurate deployment on larger, similar graphs.
Principles
- GNDEs' local filters support zero-shot size transfer.
- Graphon-NDEs define infinite-node limits for GNDE systems.
- DTO and OTD training show asymptotic consistency.
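The first two principles can be illustrated with a small simulation. The sketch below is not the study's experimental setup. It assumes a hypothetical graphon W(u, v) = 1 - |u - v|, a one-feature vector field tanh(θ_0 x + θ_1 S x), and a Riemann-sum discretization of the graphon operator as the reference solution. The same parameters θ are reused at every graph size, which is the zero-shot setting.

```python
import math
import random


def graphon(u, v):
    # Hypothetical graphon chosen only for illustration (not taken from the study).
    return 1.0 - abs(u - v)


def sample_neighbors(u, alpha, rng):
    """Sample an undirected graph: edge {i, j} appears with probability alpha * W(u_i, u_j)."""
    n = len(u)
    nbrs = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < alpha * graphon(u[i], u[j]):
                nbrs[i].append(j)
                nbrs[j].append(i)
    return nbrs


def euler_solve(x0, shift, theta, T=1.0, M=20):
    """Explicit Euler for dx/dt = tanh(theta[0] * x + theta[1] * shift(x))."""
    x = list(x0)
    h = T / M
    for _ in range(M):
        sx = shift(x)
        x = [xi + h * math.tanh(theta[0] * xi + theta[1] * si)
             for xi, si in zip(x, sx)]
    return x


def transfer_error(n, alpha, theta=(-0.5, 1.0), seed=0):
    """RMS gap between the sampled-graph solution and a graphon-based reference."""
    rng = random.Random(seed)
    u = [(i + 0.5) / n for i in range(n)]            # latent node positions in [0, 1]
    x0 = [math.cos(2.0 * math.pi * ui) for ui in u]  # same initial signal at every size
    nbrs = sample_neighbors(u, alpha, rng)
    w = [[graphon(ui, uj) for uj in u] for ui in u]

    def sparse_shift(x):
        # Adjacency scaled by 1 / (n * alpha) so it approximates the graphon operator.
        return [sum(x[j] for j in nbrs[i]) / (n * alpha) for i in range(n)]

    def graphon_shift(x):
        # Riemann-sum discretization of the integral of W(u_i, v) x(v) dv.
        return [sum(wij * xj for wij, xj in zip(w[i], x)) / n for i in range(n)]

    x_graph = euler_solve(x0, sparse_shift, theta)
    x_ref = euler_solve(x0, graphon_shift, theta)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_graph, x_ref)) / n)


if __name__ == "__main__":
    # The parameters theta are fixed; only the graph size changes.
    for n in (50, 100, 200):
        print(n, transfer_error(n, alpha=0.5))
```

Running the loop prints the RMS gap at each size. With a fixed α the gap would be expected to shrink as n grows, though a single random draw per size is noisy and does not by itself confirm a rate.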
In practice
- Deploy trained GNDEs on larger, similar graphs without retraining.
- Model continuous-time graph dynamics with GNDEs.
- Evaluate DTO and OTD for GNDE training.
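One way to compare DTO and OTD before committing to either is to compute both gradients on a toy problem. The sketch below uses a scalar ODE dx/dt = tanh(θ x) with loss 0.5 x(T)^2, both of which are assumptions made for illustration. The OTD variant shown applies explicit Euler to the continuous adjoint backward in time and reuses the stored forward states. This is one possible discretization and may differ from the scheme analyzed in the study, so the printed gaps should not be read as reproducing its rates.

```python
import math


def f(x, theta):
    return math.tanh(theta * x)


def df_dx(x, theta):
    return theta * (1.0 - math.tanh(theta * x) ** 2)


def df_dtheta(x, theta):
    return x * (1.0 - math.tanh(theta * x) ** 2)


def euler_path(theta, x0, T, M):
    """Explicit Euler states x_0, ..., x_M for dx/dt = f(x, theta)."""
    h = T / M
    xs = [x0]
    for _ in range(M):
        xs.append(xs[-1] + h * f(xs[-1], theta))
    return xs


def loss(theta, x0=1.0, T=1.0, M=10):
    return 0.5 * euler_path(theta, x0, T, M)[-1] ** 2


def grad_dto(theta, x0=1.0, T=1.0, M=10):
    """Discretize-then-optimize: exact gradient of the Euler-discretized loss."""
    h = T / M
    xs = euler_path(theta, x0, T, M)
    a = xs[-1]  # dL/dx_M for L = 0.5 * x_M**2
    g = 0.0
    for k in range(M - 1, -1, -1):
        g += a * h * df_dtheta(xs[k], theta)  # through x_{k+1} = x_k + h * f(x_k, theta)
        a *= 1.0 + h * df_dx(xs[k], theta)    # a is now dL/dx_k
    return g


def grad_otd(theta, x0=1.0, T=1.0, M=10):
    """Optimize-then-discretize: explicit Euler on the continuous adjoint, backward in time."""
    h = T / M
    xs = euler_path(theta, x0, T, M)
    lam = xs[-1]  # terminal condition lambda(T) = dL/dx(T)
    g = 0.0
    for k in range(M, 0, -1):
        g += h * lam * df_dtheta(xs[k], theta)  # quadrature of lambda * df/dtheta
        lam += h * lam * df_dx(xs[k], theta)    # from d(lambda)/dt = -lambda * df/dx
    return g


if __name__ == "__main__":
    for M in (10, 100, 1000):
        print(M, abs(grad_dto(0.8, M=M) - grad_otd(0.8, M=M)))
```

The DTO gradient matches a finite-difference derivative of the discretized loss, while the OTD gradient differs from it by an amount that shrinks as M increases.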
Topics
- Graph Neural Differential Equations
- Zero-Shot Size Transfer
- Graphons
- Sparse Random Graphs
- Neural ODEs
- DTO/OTD Training
Best for: AI Scientist, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.