FVD: Inference-Time Alignment of Diffusion Models via Fleming-Viot Resampling

2026-03-15 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, extended

Summary

Fleming-Viot Diffusion (FVD) is an inference-time alignment method for diffusion models that addresses diversity collapse in Sequential Monte Carlo (SMC)-based samplers. In place of multinomial resampling, FVD uses a Fleming-Viot birth-death mechanism that combines independent reward-based survival decisions with stochastic rebirth noise. This preserves broader trajectory support and explores reward-tilted distributions without value function approximation or costly rollouts. FVD is fully parallelizable and scales efficiently with inference compute. Empirically, it achieves a 7% improvement in ImageReward on DrawBench, improves FID by 14–20% on class-conditional tasks over strong baselines, and runs up to 66x faster than value-based methods such as DTS. The method also adapts alignment strength during sampling with a Robbins-Monro update.

Key takeaway

For research scientists and computer vision engineers developing or deploying reward-aligned diffusion models, FVD's Fleming-Viot resampling and adaptive control mechanism mitigate diversity collapse and over-optimization, yielding higher-quality and more diverse samples without expensive fine-tuning or value function learning. Consider FVD for tasks such as class-conditional posterior sampling and text-to-image generation, especially when balancing reward maximization against sample diversity is critical.

Key insights

FVD uses a Fleming-Viot birth-death mechanism to prevent diversity collapse in diffusion model inference, improving sample quality and efficiency.

Principles

Decouple selection from replication to preserve diversity.
Independent Bernoulli deaths reduce offspring variance to O(1) per particle.
Adaptive control of alignment strength improves robustness.
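
To see why the variance principle is plausible, compare the two schemes for a single particle $i$ among $N$. This is a back-of-the-envelope illustration, not the paper's derivation; the symbols $w_i$ (normalized weight), $p_i$ (survival probability), $C_i$ (multinomial copy count), and $B_i$ (survival indicator) are notation assumed here.

$$
\text{multinomial: } C_i \sim \mathrm{Binomial}(N, w_i), \qquad \mathrm{Var}(C_i) = N\, w_i (1 - w_i)
$$

$$
\text{Bernoulli death: } B_i \sim \mathrm{Bernoulli}(p_i), \qquad \mathrm{Var}(B_i) = p_i (1 - p_i) \le \tfrac{1}{4}
$$

When weights concentrate so that $w_i$ stays of constant order, the multinomial variance grows linearly in $N$, while the variance of the survival indicator is bounded regardless of $N$.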

Method

FVD replaces multinomial resampling with a Fleming-Viot birth-death mechanism, where particles survive based on reward-based probabilities and dead particles are stochastically reborn from survivors. An adaptive Robbins-Monro controller adjusts alignment strength based on particle absorption rate.
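
The birth-death step described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `fleming_viot_step`, the parameters `lam` and `noise_scale`, the exponential survival rule, the uniform choice of parent, and the Gaussian rebirth noise are all hypothetical choices made for this example, since the summary does not specify them.

```python
import numpy as np


def fleming_viot_step(particles, rewards, lam, noise_scale, rng):
    """One birth-death resampling step (illustrative sketch only).

    particles: array of shape (N, ...) holding the current particle states.
    rewards:   array of shape (N,) with one reward per particle.
    lam:       assumed alignment strength; larger values kill more particles.
    """
    # Assumed survival rule: probability exp(-lam * gap to the best reward).
    # The best particle has gap 0, so it always survives and the set of
    # survivors is never empty.
    gap = rewards.max() - rewards
    survive_prob = np.exp(-lam * gap)

    # Independent Bernoulli survival decision for every particle.
    alive = rng.random(len(particles)) < survive_prob
    alive_idx = np.flatnonzero(alive)
    dead_idx = np.flatnonzero(~alive)

    # Rebirth: each dead particle copies a uniformly chosen survivor
    # (assumption) and adds Gaussian noise (assumption).
    parents = rng.choice(alive_idx, size=dead_idx.size)
    noise = rng.standard_normal((dead_idx.size,) + particles.shape[1:])
    out = particles.copy()
    out[dead_idx] = particles[parents] + noise_scale * noise

    # Fraction of particles that died in this step.
    absorption_rate = dead_idx.size / len(particles)
    return out, absorption_rate


rng = np.random.default_rng(0)
particles = rng.standard_normal((64, 4))
rewards = particles[:, 0]  # toy reward: the first coordinate
new_particles, rate = fleming_viot_step(particles, rewards, 2.0, 0.1, rng)
```

Every operation acts on whole arrays with no per-particle rollout, which is consistent with the parallelism the summary attributes to FVD.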

In practice

Use FVD for reward-aligned image generation tasks.
Set the target absorption rate $\alpha^{*}$ to balance reward and diversity.
Leverage FVD's parallelism for faster inference than value-based methods.
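
The role of the target absorption rate can be illustrated with a generic Robbins-Monro update. This is a hedged sketch rather than the paper's controller: the function name `robbins_monro_update`, the alignment-strength variable `lam`, the step schedule `base_step / (step_index + 1)`, the lower clamp, and the sign convention (a larger `lam` is assumed to kill more particles) are all assumptions made for this example.

```python
def robbins_monro_update(lam, observed_rate, target_rate, step_index,
                         base_step=1.0, lam_min=0.0):
    """Nudge the alignment strength toward a target absorption rate.

    Assumes that increasing lam increases the absorption rate, so an
    observed rate above the target calls for a smaller lam.
    """
    # Decaying step size c / (k + 1): the steps sum to infinity while their
    # squares sum to a finite value, the classic Robbins-Monro conditions.
    step = base_step / (step_index + 1)
    lam_new = lam - step * (observed_rate - target_rate)
    # Keep the strength non-negative (assumed constraint).
    return max(lam_new, lam_min)


# Toy usage: too many particles died at step 0, so lam is reduced.
lam = robbins_monro_update(lam=1.0, observed_rate=0.5, target_rate=0.3,
                           step_index=0)
```

Under these assumptions, a lower target rate keeps more particles alive at each step, which favors diversity, while a higher target rate applies stronger selection toward high reward.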

Topics

Fleming-Viot Diffusion
Diffusion Models
Inference-Time Alignment
Sequential Monte Carlo
Diversity Collapse Mitigation

Best for: Computer Vision Engineer, Research Scientist, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.