Wasserstein bounds for denoising diffusion probabilistic models via the Föllmer process

2026-05-19 · Source: stat.ML updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Mathematics & Computational Sciences · Depth: Expert, extended

Summary

This paper establishes new sampling error bounds for Denoising Diffusion Probabilistic Models (DDPMs) in the 2-Wasserstein distance, a metric gaining attention for its statistical relevance. It makes three contributions. First, it derives sharp upper bounds on the 2-Wasserstein error under general Lipschitz-type conditions on the score function and for a broad range of variance schedules, including the cosine schedule; these bounds recover existing sharp error bounds. Second, it shows that these Lipschitz-type conditions imply a logarithmic Sobolev inequality and a quadratic transportation cost inequality for DDPMs, so that optimal Wasserstein bounds can be derived from Kullback–Leibler divergence bounds. Third, it shows that for general log-concave target distributions, optimal Wasserstein error bounds are achievable even without a quadratic transportation cost inequality for the target. The analysis reinterprets the DDPM sampler as a discretization of the Föllmer process, which offers technical advantages over the conventional view based on the reverse Ornstein–Uhlenbeck process.
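
For reference, the two standard objects behind the second contribution can be written as follows. The notation ($\mu$, $\nu$, $\Pi$, $C$) is the common textbook one and is assumed here, not taken from the paper.

```latex
% 2-Wasserstein distance: infimum over couplings \pi of \mu and \nu
W_2(\mu,\nu) = \Bigl( \inf_{\pi \in \Pi(\mu,\nu)} \int \|x - y\|^2 \, d\pi(x,y) \Bigr)^{1/2}

% Quadratic transportation cost (Talagrand T_2) inequality for \mu with constant C
W_2^2(\nu,\mu) \le 2C \, \mathrm{KL}(\nu \,\|\, \mu) \quad \text{for all } \nu
```

When a measure $\mu$ satisfies the second inequality, a bound $\mathrm{KL}(\nu \,\|\, \mu) \le \varepsilon^2$ converts directly into $W_2(\nu,\mu) \le \sqrt{2C}\,\varepsilon$. This is the mechanism by which Kullback–Leibler bounds can yield Wasserstein bounds.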

Key takeaway

For research scientists developing or analyzing generative models, these bounds describe how DDPM sampling error in the 2-Wasserstein distance depends on the variance schedule, the initialization, and the discretization. The Föllmer process perspective offers technical advantages for this error analysis, particularly for the initialization and discretization terms. In practice, the results favor variance schedules such as the cosine schedule and a carefully chosen initial mean, which together improve sampling accuracy and computational efficiency, especially for log-concave target distributions.

Key insights

Reinterpreting the DDPM sampler as a discretization of the Föllmer process yields bounds on its sampling error in the 2-Wasserstein distance.

Principles

Lipschitz-type conditions on the score function imply a logarithmic Sobolev inequality and a quadratic transportation cost inequality for DDPMs.
Föllmer process view simplifies bias-variance decomposition.
Appropriate initialization reduces DDPM sampling errors.

Method

The DDPM sampler is analyzed as a discretization of the Föllmer process, which simplifies error decomposition and allows for broader variance schedules and more precise initialization error analysis.
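
For orientation, the Föllmer process in its standard form is sketched below. The parametrization and notation ($\gamma_d$, $f$, $P_s$) are assumptions based on the usual definition; the paper's own time scaling and conventions may differ.

```latex
% Föllmer process targeting \mu on \mathbb{R}^d, started at the origin
dX_t = \nabla \log P_{1-t} f (X_t) \, dt + dB_t, \qquad X_0 = 0, \quad t \in [0,1]

% f is the density of \mu with respect to the standard Gaussian \gamma_d = N(0, I_d),
% and P_s is the heat semigroup
f = \frac{d\mu}{d\gamma_d}, \qquad P_s f(x) = \mathbb{E}\bigl[ f(x + \sqrt{s}\, Z) \bigr], \quad Z \sim \gamma_d

% At the terminal time the process has law \mu
X_1 \sim \mu
```

The process starts from a fixed point and reaches the target exactly at time 1, whereas the reversed Ornstein–Uhlenbeck process starts from an approximately Gaussian law.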

In practice

Use the cosine schedule to obtain the optimal $O(N^{-1})$ error rate.
Initialize DDPM mean close to target mean for error reduction.
Consider log-concave targets for optimal bounds without strong assumptions.
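
The points above can be explored numerically with a minimal sketch, given here under strong simplifying assumptions: a one-dimensional Gaussian target, an exact score, and sampler noise variance equal to `beta_t`. Under these assumptions every iterate of the DDPM ancestral sampler stays Gaussian, so the 2-Wasserstein error can be tracked in closed form. The function names `cosine_betas` and `ddpm_w2_gaussian` are hypothetical. The schedule uses a commonly used cosine form (offset `s = 0.008`, clipping at 0.999), which may not match the paper's parametrization. The sketch does not reproduce the paper's bounds.

```python
import math


def cosine_betas(num_steps, s=0.008, max_beta=0.999):
    """Cosine variance schedule: beta_t = 1 - f(t)/f(t-1), clipped at max_beta."""
    def f(t):
        return math.cos((t / num_steps + s) / (1 + s) * math.pi / 2) ** 2
    return [min(1 - f(t) / f(t - 1), max_beta) for t in range(1, num_steps + 1)]


def ddpm_w2_gaussian(betas, target_mean, target_std, init_mean=0.0):
    """Exact W2 error of DDPM ancestral sampling for a 1-D Gaussian target.

    Assumes the exact score and sampler variance sigma_t^2 = beta_t, so each
    iterate is Gaussian and only its mean and variance need to be tracked.
    """
    # Cumulative products alpha_bar_t = prod_{k <= t} (1 - beta_k).
    alpha_bars, prod = [], 1.0
    for b in betas:
        prod *= 1 - b
        alpha_bars.append(prod)

    mean, var = init_mean, 1.0  # sampler starts from N(init_mean, 1)
    for t in range(len(betas), 0, -1):
        beta, abar = betas[t - 1], alpha_bars[t - 1]
        # Noised marginal of the target at step t is N(c, v).
        c = math.sqrt(abar) * target_mean
        v = abar * target_std ** 2 + 1 - abar
        # Update: x <- (x + beta * score(x)) / sqrt(1 - beta) + sqrt(beta) * z,
        # with score(x) = -(x - c) / v; no noise is added on the final step.
        shrink = (1 - beta / v) / math.sqrt(1 - beta)
        shift = (beta * c / v) / math.sqrt(1 - beta)
        noise_var = beta if t > 1 else 0.0
        mean = shrink * mean + shift
        var = shrink ** 2 * var + noise_var

    # W2 between 1-D Gaussians: sqrt((m1 - m2)^2 + (s1 - s2)^2).
    return math.hypot(mean - target_mean, math.sqrt(var) - target_std)


if __name__ == "__main__":
    # Illustrative target N(2, 0.5^2); vary the step count and the initial mean.
    for n in (10, 100, 1000):
        betas = cosine_betas(n)
        print(n, ddpm_w2_gaussian(betas, target_mean=2.0, target_std=0.5),
              ddpm_w2_gaussian(betas, target_mean=2.0, target_std=0.5, init_mean=2.0))
```

Varying `num_steps` and `init_mean` gives a rough, toy-scale feel for how discretization and initialization enter the error. Non-Gaussian log-concave targets would need a sample-based sampler and an empirical Wasserstein estimate instead.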

Topics

Denoising Diffusion Probabilistic Models
Wasserstein Distance
Föllmer Process
Sampling Error Bounds
Log-concave Distributions

Best for: Research Scientist, AI Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.