Safe, Scalable, and Accurate Bayes Posterior Sampling for Large-Data Generalized Linear Mixed Models
Summary
This paper introduces a novel stochastic mirror Langevin dynamics (SMLD) algorithm designed for safe, scalable, and accurate Bayesian posterior sampling in large-data generalized linear mixed models (GLMMs). Traditional stochastic gradient Langevin dynamics (SGLD) methods, when applied to re-parameterized constrained parameters like covariance matrices in GLMMs, often lead to divergent Markov chains. The SMLD algorithm addresses this by transferring mirror Langevin dynamics to hierarchical GLMs, ensuring ergodic chains for common GLMM likelihoods. The authors provide a rigorous error analysis, demonstrating that SMLD's squared distance to the target posterior decays like O(n^-δ) for step sizes ε=O(n^-(1+δ)). Furthermore, they propose a non-intrusive post-processing step that corrects the posterior variance estimation bias due to subsampling, yielding an asymptotically order-wise correct estimate. Empirical validation includes simulated experiments and a longitudinal study of pain trajectories in breast cancer survivors, highlighting the method's accuracy and computational efficiency compared to conventional MCMC.
Key takeaway
For AI scientists and research scientists fitting large-scale Bayesian GLMMs, SMLD offers a safer route to posterior sampling than standard SGLD, which can diverge when constrained parameters such as covariance matrices are re-parameterized. Pair it with the proposed post-processing step to correct subsampling bias, so that posterior variance estimates are asymptotically order-wise correct and can be used to calibrate Bayesian p-values, for example in biomedical or other longitudinal studies.
Key insights
SMLD offers a safe, scalable, and accurate Bayesian sampling method for large-data GLMMs with constrained parameters.
Principles
- Smooth re-parameterization of constrained parameters can cause stochastic gradient MCMC (SGMCMC) chains to diverge.
- Mirror Langevin dynamics ensures ergodic chains for constrained parameters.
- Post-processing can correct subsampling-induced variance bias.
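To make the second principle concrete, here is a minimal sketch of a stochastic mirror Langevin update for a single positive parameter. It is not the paper's algorithm: the toy model (exponential likelihood with a Gamma prior on the rate), the entropic mirror map phi(x) = x log x - x, and the function name `smld_positive_rate` are all assumptions chosen for illustration. The point is only that the update is taken in the mirror coordinate log(lam) and mapped back with exp, so the chain cannot leave the constraint set lam > 0.

```python
import math
import random


def smld_positive_rate(data, a0=1.0, b0=1.0, batch=100, steps=5000,
                       delta=0.2, seed=0):
    """Toy stochastic mirror Langevin sampler for a rate lam > 0.

    Assumed model (illustration only): y_i ~ Exponential(lam) and
    lam ~ Gamma(a0, b0), so the exact posterior is
    Gamma(a0 + n, b0 + sum(y)).
    Assumed mirror map: phi(x) = x*log(x) - x, which gives
    grad phi(x) = log(x), hess phi(x) = 1/x, inverse map exp.
    """
    rng = random.Random(seed)
    n = len(data)
    eps = n ** -(1.0 + delta)  # step size of order n^-(1+delta)
    lam = 1.0
    samples = []
    for _ in range(steps):
        mb = rng.sample(data, batch)  # minibatch without replacement
        # Unbiased minibatch estimate of the gradient of the
        # negative log posterior with respect to lam.
        grad = (-(a0 - 1.0) / lam + b0
                + (n / batch) * sum(y - 1.0 / lam for y in mb))
        # Langevin step in the mirror coordinate log(lam); the noise
        # is scaled by the square root of hess phi(lam) = 1/lam.
        dual = (math.log(lam) - eps * grad
                + math.sqrt(2.0 * eps / lam) * rng.gauss(0.0, 1.0))
        lam = math.exp(dual)  # map back; lam stays positive
        samples.append(lam)
    return samples


# Synthetic data with a true rate of 2.0 (hypothetical example values).
rng = random.Random(1)
data = [rng.expovariate(2.0) for _ in range(1000)]
draws = smld_positive_rate(data)[1000:]  # drop burn-in
post_mean = (1.0 + len(data)) / (1.0 + sum(data))
print(sum(draws) / len(draws), post_mean)
```

The sample mean should land close to the exact posterior mean, while the spread of the draws may be somewhat wider than the exact posterior because the minibatch gradient injects extra noise. That inflation is the kind of bias the third principle refers to.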
Method
The SMLD algorithm uses a mirror map and stochastic gradients with data subsampling. A post-processing step, based on solving a Lyapunov equation, re-scales samples to correct posterior variance estimates.
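This summary does not give the paper's equations, so the following is only a generic illustration of how a Lyapunov equation can drive a variance-correcting rescaling. It assumes a hypothetical linearized model in which, near the mode, the chain behaves like d theta = -H (theta - mode) dt + sqrt(2 I + eps C) dW, where H is the posterior curvature and C is the covariance of the minibatch gradient noise. Under that assumption the stationary covariance solves H S + S H^T = 2 I + eps C, and a linear map can shrink the draws back to the noise-free covariance. The names `solve_lyapunov`, `sym_sqrt`, and `rescale_samples` are made up for this sketch.

```python
import numpy as np


def solve_lyapunov(A, Q):
    """Solve A X + X A^T = Q for X using the Kronecker (vectorized) form."""
    d = A.shape[0]
    I = np.eye(d)
    K = np.kron(A, I) + np.kron(I, A)
    return np.linalg.solve(K, Q.reshape(-1)).reshape(d, d)


def sym_sqrt(S):
    """Symmetric square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(w)) @ V.T


def rescale_samples(samples, H, C, eps):
    """Linearly rescale draws toward the noise-free stationary covariance.

    Assumption (not taken from the paper): the linearized dynamics
    described above hold, with symmetric positive definite H.
    """
    d = H.shape[0]
    # Stationary covariance with minibatch noise (inflated).
    sigma_noisy = solve_lyapunov(H, 2.0 * np.eye(d) + eps * C)
    # Stationary covariance without it; equals inv(H) for symmetric H.
    sigma_target = solve_lyapunov(H, 2.0 * np.eye(d))
    # T maps covariance sigma_noisy to sigma_target.
    T = sym_sqrt(sigma_target) @ np.linalg.inv(sym_sqrt(sigma_noisy))
    center = samples.mean(axis=0)
    return center + (samples - center) @ T.T


# Hypothetical curvature, noise covariance, and step size.
H = np.array([[2.0, 0.5], [0.5, 1.0]])
C = np.array([[4.0, 1.0], [1.0, 3.0]])
eps = 0.1
raw = np.random.default_rng(0).normal(size=(500, 2))
corrected = rescale_samples(raw, H, C, eps)
```

The rescaling is non-intrusive in the sense that it operates on stored draws only and leaves the sampler untouched; in practice H and C would have to be estimated, which this sketch does not attempt.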
In practice
- Apply SMLD for Bayesian inference in large GLMMs.
- Use the proposed post-processing for accurate posterior variance.
- Consider R=1,000 MCMC samples for stochastic gradient estimation.
Topics
- Generalized Linear Mixed Models
- Bayesian Posterior Sampling
- Stochastic Mirror Langevin Dynamics
- Constrained Parameter Inference
- Posterior Variance Correction
Best for: AI Scientist, Research Scientist, Data Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.