Gaussian Mean Field Variational Inference can Overestimate Predictive Variance
Summary
James Odgers, Ben Riegler, Siddharth Swaroop, and Vincent Fortuin demonstrate that Mean Field Variational Inference (MFVI), commonly believed to underestimate posterior variance, can paradoxically overestimate predictive variance. Through an analysis of conjugate Bayesian Linear Regression (BLR), the authors reveal that while MFVI indeed underestimates variance in parameter space, it can yield higher predictive variances than the exact posterior. This overestimation is particularly pronounced in directions where training data is concentrated. A surprising result is that for test points drawn from the training distribution, MFVI's expected predictive variance surpasses that of the exact posterior. The paper illustrates a pathological scenario where MFVI fails to reduce predictive variance compared to the prior for in-distribution data. The authors link these findings to the Cold Posterior Effect, proposing that adjusting the temperature can mitigate this overestimation, bringing predictions closer to the exact posterior, and validate their theory on synthetic and real-world regression tasks.
Key takeaway
Machine learning engineers who rely on Mean Field Variational Inference (MFVI) for uncertainty quantification should be aware that it can overestimate predictive variance, particularly for in-distribution data. Test predictive variance specifically in regions dense with training data, and consider adjusting the posterior temperature, which may correct the overestimation and bring uncertainty estimates closer to those of the exact posterior.
Key insights
MFVI, despite underestimating parameter variance, can overestimate predictive variance, especially where training data concentrates.
Principles
- The claim that MFVI underestimates variance holds in parameter space but does not carry over to predictive variance.
- Overestimation occurs where training data concentrates.
- Adjusting the temperature can mitigate MFVI's predictive variance overestimation.
Method
The paper analyzes conjugate Bayesian Linear Regression (BLR), where the exact posterior is available in closed form, to demonstrate MFVI's predictive variance overestimation. It then connects this to the Cold Posterior Effect and validates the findings on synthetic and real-world regression tasks.
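The effect can be reproduced in a few lines. The sketch below is an illustration and not the authors' code. It assumes a zero-mean isotropic Gaussian prior with precision `alpha`, a known noise precision `beta`, and a synthetic design with two correlated features; all names and values are hypothetical. It uses the standard closed-form result that the best factorized Gaussian fit to a Gaussian posterior under KL(q || p) keeps only the diagonal of the posterior precision matrix. The training inputs stand in for test points drawn from the training distribution.

```python
import numpy as np


def blr_covariances(X, alpha=1.0, beta=1.0):
    """Exact and mean-field posterior covariances for conjugate BLR.

    Assumes prior w ~ N(0, I / alpha) and Gaussian noise with precision beta.
    """
    d = X.shape[1]
    precision = alpha * np.eye(d) + beta * X.T @ X
    exact_cov = np.linalg.inv(precision)
    # Optimal factorized Gaussian under KL(q || p): variance_i = 1 / precision[i, i].
    mfvi_cov = np.diag(1.0 / np.diag(precision))
    return exact_cov, mfvi_cov


def mean_predictive_variance(X_eval, cov, beta=1.0):
    """Average of x^T cov x + 1 / beta over the rows of X_eval."""
    epistemic = np.einsum("ni,ij,nj->n", X_eval, cov, X_eval)
    return float(np.mean(epistemic + 1.0 / beta))


rng = np.random.default_rng(0)
# Hypothetical design: the second feature is strongly correlated with the first.
mixing = np.array([[1.0, 0.0], [0.9, 0.3]])
X = rng.normal(size=(200, 2)) @ mixing.T

exact_cov, mfvi_cov = blr_covariances(X)
param_var_exact = np.diag(exact_cov)
param_var_mfvi = np.diag(mfvi_cov)
pv_exact = mean_predictive_variance(X, exact_cov)
pv_mfvi = mean_predictive_variance(X, mfvi_cov)

print("parameter variances  exact:", param_var_exact, " MFVI:", param_var_mfvi)
print("mean predictive variance on training inputs  exact:", pv_exact, " MFVI:", pv_mfvi)
```

Under these assumptions the MFVI parameter variances come out no larger than the exact ones, while the MFVI predictive variance averaged over the training inputs comes out no smaller, which is the pattern the summary describes.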
In practice
- Analyze MFVI predictive variance in data-dense regions.
- Consider temperature adjustments for MFVI predictions.
- Validate MFVI uncertainty on in-distribution data.
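One way to try the temperature suggestion on a toy problem is sketched below. It is a simplified illustration and not necessarily how the paper parameterizes temperature: it assumes that a temperature `T` simply rescales the MFVI covariance to `T` times its untempered value, and it picks the `T` that matches the exact posterior's average epistemic variance on the training inputs. That matching requires the exact posterior, so it is only feasible in conjugate settings like this one. The prior precision `alpha`, noise precision `beta`, the synthetic data, and the name `T_match` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, beta = 1.0, 1.0  # assumed prior and noise precisions

# Hypothetical training inputs with two correlated features.
X = rng.normal(size=(200, 2)) @ np.array([[1.0, 0.9], [0.0, 0.3]])

precision = alpha * np.eye(2) + beta * X.T @ X
exact_cov = np.linalg.inv(precision)
mfvi_var = 1.0 / np.diag(precision)  # untempered mean-field variances


def mean_epistemic_variance(X_eval, cov):
    """Average of x^T cov x over the rows of X_eval (noise term excluded)."""
    return float(np.mean(np.einsum("ni,ij,nj->n", X_eval, cov, X_eval)))


target = mean_epistemic_variance(X, exact_cov)
untempered = mean_epistemic_variance(X, np.diag(mfvi_var))

# Assumption: temperature T scales the MFVI covariance linearly,
# so the matching temperature is a simple ratio.
T_match = target / untempered
tempered = mean_epistemic_variance(X, np.diag(T_match * mfvi_var))

print("matching temperature:", T_match)
print("exact:", target, " MFVI:", untempered, " tempered MFVI:", tempered)
```

In this setup the matching temperature is at most 1, so the correction corresponds to a cold posterior, consistent with the link the authors draw to the Cold Posterior Effect.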
Topics
- Mean Field Variational Inference
- Predictive Variance
- Bayesian Linear Regression
- Cold Posterior Effect
- Uncertainty Quantification
Best for: Research Scientist, AI Scientist, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.