FVD: Inference-Time Alignment of Diffusion Models via Fleming-Viot Resampling
Summary
Fleming-Viot Diffusion (FVD) is an inference-time alignment method for diffusion models that addresses diversity collapse in Sequential Monte Carlo (SMC)-based samplers. In place of multinomial resampling, FVD uses a Fleming-Viot birth-death mechanism that combines independent, reward-based survival decisions with stochastic rebirth noise. This preserves broader trajectory support and explores the reward-tilted distribution without value function approximation or costly rollouts. FVD is fully parallelizable and scales efficiently with inference compute. Empirically, it improves ImageReward by 7% on DrawBench, improves FID by 14–20% on class-conditional tasks relative to strong baselines, and runs up to 66x faster than value-based methods such as DTS. The method also controls alignment strength adaptively through a Robbins-Monro update.
Key takeaway
For research scientists and computer vision engineers who develop or deploy reward-aligned diffusion models, FVD mitigates diversity collapse and over-optimization through Fleming-Viot resampling and adaptive control of alignment strength. The result is higher-quality, more diverse samples without expensive fine-tuning or value function learning. Consider integrating FVD for tasks such as class-conditional posterior sampling and text-to-image generation, especially when balancing reward maximization against sample diversity is critical.
Key insights
FVD uses a Fleming-Viot birth-death mechanism to prevent diversity collapse in diffusion model inference, improving sample quality and efficiency.
Principles
- Decouple selection from replication to preserve diversity.
- Independent Bernoulli deaths reduce offspring variance to $O(1)$ per particle.
- Adaptive control of alignment strength improves robustness.
Method
FVD replaces multinomial resampling with a Fleming-Viot birth-death mechanism: each particle survives with a reward-dependent probability, and dead particles are stochastically reborn from survivors. An adaptive Robbins-Monro controller adjusts the alignment strength based on the particle absorption rate.
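The birth-death step described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: the survival rule `exp(lam * (r - max r))`, the uniform choice of parent among survivors, the Gaussian rebirth noise, and the names `fleming_viot_step`, `lam`, and `noise_scale` are all hypothetical choices made for this example.

```python
import numpy as np

def fleming_viot_step(particles, rewards, lam, noise_scale, rng):
    """One illustrative birth-death resampling step.

    particles: array of shape (n, ...) holding the current particle states.
    rewards:   array of shape (n,) with one reward per particle.
    lam:       alignment strength (assumed to scale the survival rule).
    """
    n = particles.shape[0]

    # Assumed survival rule: the best particle survives with probability 1,
    # the others with probability exp(lam * (r_i - r_max)) <= 1.
    survive_prob = np.exp(lam * (rewards - rewards.max()))

    # Independent Bernoulli survival decision for every particle.
    alive = rng.random(n) < survive_prob
    if not alive.any():  # defensive guard; keep the best particle
        alive[np.argmax(rewards)] = True

    survivors = np.flatnonzero(alive)
    dead = np.flatnonzero(~alive)

    # Rebirth: each dead particle copies a uniformly chosen survivor
    # and is perturbed with Gaussian noise (assumed noise model).
    parents = rng.choice(survivors, size=dead.size)
    noise = rng.standard_normal((dead.size,) + particles.shape[1:])
    out = particles.copy()
    out[dead] = particles[parents] + noise_scale * noise

    # Fraction of particles that died in this step.
    absorption_rate = dead.size / n
    return out, alive, absorption_rate


rng = np.random.default_rng(0)
particles = rng.standard_normal((8, 4))
rewards = np.arange(8.0)
new_particles, alive, rate = fleming_viot_step(particles, rewards, 1.0, 0.1, rng)
```

Because every survival decision is an independent draw, the step vectorizes over particles with no sequential dependency, which is consistent with the parallelism the summary highlights.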
In practice
- Use FVD for reward-aligned image generation tasks.
- Set the target absorption rate $\alpha^{*}$ to balance reward and diversity.
- Leverage FVD's parallelism for faster inference than value-based methods.
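To show how a target absorption rate $\alpha^{*}$ might steer alignment strength, here is a rough sketch of a Robbins-Monro style update. The source states only that a Robbins-Monro update is driven by the absorption rate. The sign convention (lower the strength when too many particles die), the `1 / (k + 1)` step schedule, the lower clamp, and the names `robbins_monro_update`, `lam`, and `base_step` are assumptions made for illustration.

```python
def robbins_monro_update(lam, observed_rate, target_rate, step_index,
                         base_step=1.0, lam_min=0.0):
    """Illustrative stochastic-approximation update of alignment strength.

    lam:           current alignment strength (hypothetical parameter).
    observed_rate: fraction of particles that died at this step.
    target_rate:   desired absorption rate (alpha* in the text).
    step_index:    0-based count of updates performed so far.
    """
    # Decaying step size, a common Robbins-Monro choice (assumed here).
    step = base_step / (step_index + 1)

    # Assumed direction: too many deaths -> weaken alignment,
    # too few deaths -> strengthen it.
    lam_new = lam - step * (observed_rate - target_rate)

    # Keep the strength non-negative.
    return max(lam_min, lam_new)


lam = 1.0
lam = robbins_monro_update(lam, observed_rate=0.5, target_rate=0.2, step_index=0)
```

Under these assumptions, a higher target rate tolerates more particle deaths per step and therefore stronger reward pressure, while a lower target keeps more of the original trajectories alive.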
Topics
- Fleming-Viot Diffusion
- Diffusion Models
- Inference-Time Alignment
- Sequential Monte Carlo
- Diversity Collapse Mitigation
Best for: Computer Vision Engineer, Research Scientist, AI Scientist, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.