The surrogate Gibbs-posterior of a corrected stochastic MALA: Towards uncertainty quantification for neural networks

Source: JMLR · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Mathematics & Computational Sciences · Depth: Expert, quick

Summary

Sebastian Bieringer, Gregor Kasieczka, Maximilian F. Steffen, and Mathias Trabs introduce a corrected stochastic MALA (csMALA), a variant of the Metropolis-adjusted Langevin algorithm, to improve uncertainty quantification in neural networks. Standard stochastic MALA (sMALA) scales to large datasets, but it changes the target distribution: instead of the Gibbs-posterior, it samples from a surrogate posterior that exploits only a reduced sample size. csMALA adds a simple correction term so that the distance between its surrogate posterior and the original Gibbs-posterior decreases as the full sample size grows, while retaining scalability. The authors prove a PAC-Bayes oracle inequality for the surrogate posterior in a nonparametric regression model and apply it to Bayesian neural networks: they analyze the diameter and coverage of credible balls for shallow networks and show optimal contraction rates for deep networks. A simulation study in a high-dimensional parameter space supports these advantages in practice.

Key takeaway

For research scientists developing Bayesian neural networks, csMALA offers a scalable way to quantify uncertainty more accurately, because its surrogate posterior moves closer to the Gibbs-posterior as the full sample size grows. Consider it when you need credible sets with analyzed diameter and coverage, or posterior contraction guarantees, on large datasets where the approximation error of standard sMALA becomes a concern.

Key insights

Corrected stochastic MALA (csMALA) improves Gibbs-posterior approximation for scalable uncertainty quantification in neural networks.

Principles

Method

csMALA introduces a simple correction term to standard sMALA, reducing the distance between the surrogate posterior and the Gibbs-posterior as the full sample size increases, thereby improving approximation quality while retaining scalability.
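
To make the setting concrete, here is a minimal sketch of the uncorrected baseline only: a generic minibatch MALA chain targeting a Gibbs-posterior of the form exp(-lam * risk) times a Gaussian prior. It is not the authors' implementation. The correction term that distinguishes csMALA is not specified in this summary and is deliberately left out. The toy linear model (a stand-in for a network), the batch-mean normalization of the risk, and all names (`stochastic_mala`, `rho`, `lam`, `prior_var`) are assumptions made for illustration.

```python
import numpy as np

def log_target_and_grad(theta, X, y, lam, prior_var):
    """Log Gibbs-posterior density (up to a constant) and its gradient.

    Assumed setup: squared-error risk of a linear model, Gaussian prior.
    """
    resid = X @ theta - y
    risk = np.mean(resid ** 2)
    risk_grad = 2.0 * X.T @ resid / len(y)
    logp = -lam * risk - 0.5 * theta @ theta / prior_var
    grad = -lam * risk_grad - theta / prior_var
    return logp, grad

def log_q(to, frm, grad_frm, step):
    """Log density (up to a constant) of the Langevin proposal frm -> to."""
    mean = frm + step * grad_frm
    return -np.sum((to - mean) ** 2) / (4.0 * step)

def stochastic_mala(X, y, n_iter=2000, step=5e-4, lam=200.0,
                    rho=0.1, prior_var=10.0, seed=0):
    """Minibatch MALA: every iteration sees only a random subsample."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    samples = np.empty((n_iter, d))
    accepted = 0
    for t in range(n_iter):
        # Keep each observation independently with probability rho.
        mask = rng.random(n) < rho
        if not mask.any():
            mask[rng.integers(n)] = True  # guard against an empty batch
        Xb, yb = X[mask], y[mask]

        # Langevin proposal driven by the batch gradient.
        logp, grad = log_target_and_grad(theta, Xb, yb, lam, prior_var)
        prop = theta + step * grad + np.sqrt(2.0 * step) * rng.standard_normal(d)

        # Metropolis-Hastings test evaluated on the same batch. This is
        # why the chain targets a surrogate rather than the full-data
        # Gibbs-posterior.
        logp_prop, grad_prop = log_target_and_grad(prop, Xb, yb, lam, prior_var)
        log_alpha = (logp_prop - logp
                     + log_q(theta, prop, grad_prop, step)
                     - log_q(prop, theta, grad, step))
        if np.log(rng.random()) < log_alpha:
            theta = prop
            accepted += 1
        samples[t] = theta
    return samples, accepted / n_iter
```

Calling `stochastic_mala(X, y)` on a design matrix `X` and response vector `y` returns the chain and its acceptance rate; discarding an initial burn-in portion before summarizing the samples is the usual practice. Because both the drift and the accept/reject test use the batch risk, the stationary distribution differs from the full-data Gibbs-posterior, which is the gap csMALA's correction term is described as narrowing.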

Best for: Research Scientist, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by JMLR.