Scalable Mean-Field Variational Inference via Preconditioned Primal-Dual Optimization

Source: stat.ML updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, extended

Summary

This work introduces Primal-Dual Variational Inference (PD-VI) and its block-preconditioned variant, P2D-VI, for large-scale mean-field variational inference (MFVI) in mini-batch settings. The authors reformulate MFVI as a constrained finite-sum problem, which admits a primal-dual algorithm that jointly updates global and local variational parameters. P2D-VI adds block preconditioning, which adapts the updates to the heterogeneous loss geometry of different parameter blocks and improves efficiency and numerical robustness. The authors prove convergence with constant step sizes: an $\mathcal{O}(1/T)$ rate to a stationary point in general settings and linear convergence under strong convexity, without relying on conjugacy assumptions or explicit bounded-variance conditions. In numerical experiments on synthetic Gaussian mixture models and a real-world large-scale spatial transcriptomics dataset (MOSTA; 150,000 locations, 20,000 genes), PD-VI and P2D-VI consistently outperform existing stochastic variational inference (SVI) approaches in convergence speed and solution quality, particularly on spatial domain detection tasks.

Key takeaway

Research scientists working on large-scale Bayesian inference problems, especially those involving non-conjugate posteriors or high-dimensional latent variables such as those in spatial transcriptomics, should consider the P2D-VI framework. Its reported gains in convergence speed and solution quality, obtained with constant step sizes and non-i.i.d. mini-batches, offer a clear advantage over traditional stochastic variational inference (SVI) methods and could accelerate research and improve model accuracy in complex biological or statistical applications.

Key insights

Primal-dual optimization with preconditioning makes large-scale mean-field variational inference scalable and numerically robust, and it converges faster than existing stochastic variational inference approaches.

Method

PD-VI and P2D-VI use a mini-batch primal-dual algorithm based on an augmented Lagrangian formulation, jointly updating global and local variational parameters. P2D-VI adds block-preconditioning to rescale updates for different parameter blocks.
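
The following is a minimal sketch of the general pattern, not the authors' algorithm. It assumes a consensus-style constraint $x_i = z$ between local copies $x_i$ and a shared variable $z$, and it replaces the variational objective with a toy quadratic $f_i(x) = \tfrac{1}{2}\sum_k c_k (x_k - a_{ik})^2$ so that the answer (the mean of the $a_i$) is known. The function name `primal_dual_consensus`, its arguments, and the diagonal `precond` vector (a stand-in for block preconditioning) are all hypothetical. Each mini-batch step takes a preconditioned gradient step on the augmented Lagrangian for the sampled local variables, a dual ascent step on their constraints, and then an exact update of the shared variable.

```python
import numpy as np

def primal_dual_consensus(a, curv, rho, precond, n_epochs=2000, batch_size=10, seed=0):
    """Toy mini-batch primal-dual loop (illustrative, not the paper's method) for
        min  sum_i f_i(x_i)   subject to   x_i = z,
    with f_i(x) = 0.5 * sum_k curv[k] * (x[k] - a[i, k])**2.
    """
    rng = np.random.default_rng(seed)
    n, d = a.shape
    x = np.zeros((n, d))    # local primal copies, one per data term
    lam = np.zeros((n, d))  # dual variables for the constraints x_i = z
    z = np.zeros(d)         # shared (global) variable

    for _ in range(n_epochs):
        order = rng.permutation(n)  # each index is visited once per epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            # Gradient of the augmented Lagrangian w.r.t. the sampled local copies.
            grad = curv * (x[idx] - a[idx]) + lam[idx] + rho * (x[idx] - z)
            # Preconditioned primal step: each coordinate is divided by precond.
            x[idx] -= grad / precond
            # Dual ascent on the sampled constraints, using the updated x.
            lam[idx] += rho * (x[idx] - z)
            # Global update: exact minimiser of the augmented Lagrangian in z.
            z = np.mean(x + lam / rho, axis=0)
    return z, x

# Coordinates with very different curvature (heterogeneous geometry).
curv = np.array([1.0, 10.0, 100.0])
a = np.random.default_rng(1).normal(size=(50, 3))
rho = curv.max()  # conservative penalty choice for this toy problem (assumption)

# Diagonal preconditioner equal to the curvature of the local subproblem.
z, x = primal_dual_consensus(a, curv, rho, precond=curv + rho)
print("max |z - mean(a)|:", np.abs(z - a.mean(axis=0)).max())
```

With `precond = curv + rho` the primal step solves each quadratic local subproblem exactly, which is why a unit step is safe here. Passing a vector of ones instead gives a plain, unpreconditioned gradient step, which would generally need an explicit, smaller step size to remain stable on the high-curvature coordinate.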

Best for: Research Scientist, AI Researcher, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.