When Do Local Score Models Extrapolate Across Size? A Diagnostic Theory and Benchmark
Summary
This paper addresses size transfer in scientific generative modeling, that is, extrapolating a score model to systems larger than those it was fit on. Translation-invariant architectures make it possible to evaluate a model on larger systems, but the work shows that stable extrapolation is determined primarily by the quasi-locality of the Gaussian-smoothed score, not by architectural locality alone. The theory is formalized as a size-uniform comparison theorem for local marginals under reverse diffusion, and it explains how distant perturbations can affect local score components through the posterior covariance. Its central condition is that a local model succeeds only if its receptive field covers the response range of the smoothed score. To test this, the paper introduces Finite-Depth Local Flow (FDLF), a white-box benchmark with exact scores, exact densities, and controllable response ranges. Empirically, extrapolation is stable when spatial mixing keeps the smoothed score quasi-local relative to the receptive field, and it fails when spatial mixing is weakened.
Key takeaway
If you develop generative models for scientific applications, size transfer deserves explicit attention. Architectural locality alone is not enough: stable extrapolation hinges on the quasi-locality of the Gaussian-smoothed score. Design models whose receptive field covers the smoothed score's response range, and be especially careful when spatial mixing is weak. The Finite-Depth Local Flow (FDLF) benchmark can be used to diagnose and validate a model's size extrapolation.
Key insights
Stable size extrapolation in generative models depends on the Gaussian-smoothed score's quasi-locality relative to the model's receptive field.
Principles
- Architectural locality alone does not guarantee stable size extrapolation.
- Smoothed score quasi-locality governs stable size extrapolation.
- Receptive field must cover the smoothed score's response range.
Method
The paper formalizes a size-uniform comparison theorem for local marginals under reverse diffusion. It introduces Finite-Depth Local Flow (FDLF), a white-box diagnostic benchmark with exact scores, densities, and controllable response ranges.
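The role of posterior covariance can be illustrated with a standard identity for Gaussian smoothing. This is a sketch under an assumed variance-exploding convention, $x = x_0 + \sigma\varepsilon$ with $\varepsilon \sim \mathcal{N}(0, I)$; the symbols $s_\sigma$, $p_\sigma$, and the site indices $i, j$ are notation chosen here and may not match the paper's.

```latex
% Gaussian-smoothed density and its score (Tweedie's formula)
p_\sigma(x) = \int p(x_0)\, \mathcal{N}(x;\, x_0,\, \sigma^2 I)\, dx_0,
\qquad
s_\sigma(x) = \nabla_x \log p_\sigma(x)
            = \frac{\mathbb{E}[x_0 \mid x] - x}{\sigma^2}.

% Response of score component i to a perturbation at site j
\frac{\partial s_{\sigma,i}(x)}{\partial x_j}
  = \frac{\operatorname{Cov}\!\left(x_{0,i},\, x_{0,j} \mid x\right)}{\sigma^4}
  - \frac{\delta_{ij}}{\sigma^2}.
```

For $i \neq j$ the response is exactly the posterior covariance between the two sites, scaled by $\sigma^{-4}$. A local model can therefore only reproduce $s_{\sigma,i}$ if that covariance is negligible for sites $j$ outside its receptive field.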
In practice
- Evaluate model receptive field against smoothed score response.
- Use FDLF benchmark for diagnostic testing of size transfer.
- Ensure sufficient spatial mixing for stable extrapolation.
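The first check in the list above can be sketched on a toy problem. The example below is not FDLF and is not taken from the paper: it assumes a hypothetical 1D Gaussian chain with a tridiagonal precision matrix, for which the Gaussian-smoothed score is linear and its Jacobian is available in closed form. The function names, the `coupling` parameter, and the tolerance `tol` are all illustrative choices; a coupling closer to 0.5 gives slower correlation decay and is used here as a stand-in for weakened spatial mixing.

```python
import numpy as np


def chain_precision(n, coupling):
    """Tridiagonal precision matrix Q = I - coupling * (nearest-neighbour adjacency).

    Hypothetical toy model; positive definite for 0 <= coupling < 0.5.
    """
    off = -coupling * np.ones(n - 1)
    return np.eye(n) + np.diag(off, 1) + np.diag(off, -1)


def smoothed_score_jacobian(Q, sigma):
    """Jacobian of the Gaussian-smoothed score for x0 ~ N(0, Q^{-1}).

    The smoothed density is N(0, Q^{-1} + sigma^2 I), so the score is
    s(x) = -(Q^{-1} + sigma^2 I)^{-1} x = -(I + sigma^2 Q)^{-1} Q x.
    """
    n = Q.shape[0]
    return -np.linalg.solve(np.eye(n) + sigma**2 * Q, Q)


def response_range(J, tol=1e-3):
    """Largest distance d at which |J[c, c +/- d]| exceeds tol * |J[c, c]|.

    Measured at the central site c to avoid boundary effects.
    """
    n = J.shape[0]
    c = n // 2
    rel = np.abs(J[c]) / np.abs(J[c, c])
    dist = np.abs(np.arange(n) - c)
    return int(dist[rel > tol].max())


def receptive_field_covers(radius, J, tol=1e-3):
    """True if a local model with this receptive-field radius spans the response range."""
    return radius >= response_range(J, tol)


if __name__ == "__main__":
    n, sigma, radius = 101, 1.0, 3  # illustrative values only
    for coupling in (0.3, 0.48):
        J = smoothed_score_jacobian(chain_precision(n, coupling), sigma)
        print(
            f"coupling={coupling}: response range={response_range(J)}, "
            f"radius {radius} covers it: {receptive_field_covers(radius, J)}"
        )
```

At `sigma = 0` the Jacobian is just `-Q`, so the response range equals the nearest-neighbour range of the unsmoothed model. With `sigma > 0` the Jacobian is no longer banded, and its decay length grows with the coupling, which is the toy analogue of the gap between architectural locality and smoothed-score quasi-locality.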
Topics
- Generative Modeling
- Size Transfer
- Score-based Models
- Gaussian-smoothed Score
- Receptive Fields
- Finite-Depth Local Flow
- Spatial Mixing
Best for: AI Scientist, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.