Decentralized Proximal Stochastic Gradient Langevin Dynamics
Summary
This paper introduces Decentralized Proximal Stochastic Gradient Langevin Dynamics (DE-PSGLD), a Markov chain Monte Carlo (MCMC) algorithm for sampling from log-concave probability distributions supported on a convex constrained domain. Unlike existing decentralized Langevin methods, which operate in unconstrained settings, DE-PSGLD enforces the constraints through a shared proximal regularization based on the Moreau–Yosida envelope. This allows each agent to take unconstrained updates while staying consistent with the target posterior. The authors provide non-asymptotic convergence guarantees in 2-Wasserstein distance for both the individual agent iterates and their network average, and they quantify the bias introduced by the proximal approximation. Numerical experiments on synthetic and real datasets, covering Bayesian linear and logistic regression, show efficient sampling, fast posterior concentration, and high predictive accuracy across fully connected, circular, star, and disconnected network topologies. The code for these experiments is publicly available.
Key takeaway
For machine learning engineers building distributed Bayesian inference systems, DE-PSGLD offers a way to handle constrained parameter spaces without centralizing data. The reported experiments show efficient and accurate sampling from constrained log-concave posteriors, including on non-ideal network topologies such as star and disconnected graphs. Consider DE-PSGLD when privacy or communication-bandwidth limits preclude centralized computation, and track the 2-Wasserstein distance to the target distribution to check convergence.
Key insights
DE-PSGLD enables decentralized MCMC sampling from constrained log-concave distributions using proximal regularization.
Principles
- Moreau–Yosida envelope enables unconstrained updates for constrained sampling.
- Stronger network connections accelerate agent agreement and convergence.
- Proximal approximation introduces quantifiable bias in the Gibbs distribution.
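For reference, the standard Moreau–Yosida envelope of the indicator function of a closed convex set can be written as follows. The symbols (K for the constraint set, P_K for the Euclidean projection onto K, λ > 0 for the smoothing parameter, f for the unconstrained negative log-density) are notation assumed here, not taken from the paper, and the paper's exact formulation may differ.

```latex
% Indicator of the constraint set K and its Moreau--Yosida envelope
\iota_K(x) =
  \begin{cases}
    0       & x \in K, \\
    +\infty & x \notin K,
  \end{cases}
\qquad
\iota_K^{\lambda}(x)
  = \min_{y} \Big\{ \iota_K(y) + \tfrac{1}{2\lambda}\,\|x - y\|^2 \Big\}
  = \tfrac{1}{2\lambda}\,\|x - P_K(x)\|^2 .

% The envelope is differentiable everywhere, which permits unconstrained gradient steps
\nabla \iota_K^{\lambda}(x) = \tfrac{1}{\lambda}\,\big(x - P_K(x)\big).

% Smoothed (biased) target that replaces the constrained density
\pi^{\lambda}(x) \;\propto\; \exp\!\Big( -f(x) - \tfrac{1}{2\lambda}\,\|x - P_K(x)\|^2 \Big).
```

Smaller values of λ penalize constraint violations more heavily, so the smoothed target moves closer to the constrained one at the cost of a stiffer gradient.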
Method
At each iteration, every agent in DE-PSGLD takes a weighted average of its neighbors' local variables, applies a stochastic gradient step, and adds a proximal term that accounts for the convex constraints. Agents repeat this update while communicating only with their neighbors.
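The update described above might be sketched as follows. This is an illustrative reading, not the authors' implementation: the exact update rule, the way the penalty is shared across agents, the L2-ball constraint, the toy quadratic potentials, and all names (`project_l2_ball`, `de_psgld_step`, `grad_fn`, `lam`) are assumptions made for the example.

```python
import numpy as np

def project_l2_ball(x, radius):
    """Euclidean projection of each row of x onto the L2 ball of the given radius."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    scale = np.minimum(1.0, radius / np.maximum(norms, 1e-12))
    return x * scale

def de_psgld_step(X, W, grad_fn, step, lam, radius, rng):
    """One illustrative update for all agents (assumed form, see lead-in).

    X: (n_agents, dim) current iterates, one row per agent.
    W: (n_agents, n_agents) doubly stochastic mixing matrix.
    grad_fn: hypothetical callable returning local stochastic gradients,
             shape (n_agents, dim).
    """
    mixed = W @ X  # weighted averaging with neighbors
    # Gradient of the Moreau-Yosida envelope of the ball's indicator function.
    penalty_grad = (X - project_l2_ball(X, radius)) / lam
    noise = rng.standard_normal(X.shape)  # Langevin noise
    return mixed - step * (grad_fn(X) + penalty_grad) + np.sqrt(2.0 * step) * noise

rng = np.random.default_rng(0)
n_agents, dim = 4, 2

# Circular (ring) network: each agent averages itself and its two neighbors.
W = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    W[i, i] = W[i, (i - 1) % n_agents] = W[i, (i + 1) % n_agents] = 1.0 / 3.0

# Toy local potentials f_i(x) = ||x||^2 / (2 * n_agents), an assumption for the demo.
grad_fn = lambda X: X / n_agents

X = rng.standard_normal((n_agents, dim))
for _ in range(1000):
    X = de_psgld_step(X, W, grad_fn, step=0.01, lam=0.05, radius=1.0, rng=rng)
```

Because the penalty is a smooth function rather than a hard projection, iterates can leave the ball slightly. That is the source of the bias the paper quantifies.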
In practice
- Apply DE-PSGLD for Bayesian linear regression with L2-ball constraints.
- Use DE-PSGLD for Bayesian logistic regression on real-world datasets.
- Measure convergence of each agent's iterates and of the network average using 2-Wasserstein distance.
Topics
- Decentralized Proximal SGLD
- Constrained Sampling
- Moreau–Yosida Regularization
- 2-Wasserstein Distance
- Bayesian Regression
Best for: AI Scientist, Machine Learning Engineer, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.