The FID Lottery: Quantifying Hidden Randomness in Generative-Model Evaluation

Source: Computer Vision and Pattern Recognition · Field: Technology & Digital, Artificial Intelligence & Machine Learning, Computer Vision · Depth: Expert, quick

Summary

A study of the Fréchet Inception Distance (FID), the standard metric for image generation, finds substantial hidden randomness in reported scores. The researchers treated FID as a random variable over training and generation seeds and measured its variance across hundreds of SiT networks trained on class-conditional ImageNet 256x256. Retraining a model with a different seed shifts FID 3.2x more than resampling from a fixed network. The variability comes from random initialization, data ordering, and Gaussian noise in the flow-matching loss. Increasing compute or model size barely helps: the FID coefficient of variation (CoV) stays within a 1-2% band. Per-cell classifier-free-guidance tuning can halve this spread, yet a "lucky" training seed can still reach the same FID with as little as half the compute of an "unlucky" one.
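
For reference, the two quantities the summary relies on can be written out. The FID expression is the standard definition of the metric; the symbols are notation assumed here rather than taken from the study: (mu_r, Sigma_r) and (mu_g, Sigma_g) are the mean and covariance of Inception features for real and generated images, and the CoV is the standard deviation of FID across seeds divided by its mean.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right),
\qquad
\mathrm{CoV} = \frac{\operatorname{std}(\mathrm{FID})}{\operatorname{mean}(\mathrm{FID})}
```

Read this way, a CoV of 1-2% means the seed-to-seed standard deviation of FID is 1-2% of its mean value.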

Key takeaway

For machine learning engineers evaluating generative models: if you report FID from a single training run, your comparisons may be misleading because of this hidden randomness. Adopt the study's protocol instead: evaluate under per-cell optimal guidance, treat any FID gap below the ~1.3% CoV as inconclusive, and always report an error bar over several training seeds. This makes benchmark comparisons more robust and reproducible and guards against misreading differences in model performance.
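
The reporting rule above could be sketched roughly as follows. This is a minimal illustration, not the study's code: the function names are hypothetical, and reading "gap below ~1.3% CoV" as a relative gap (the difference divided by the average of the two scores) is an assumption.

```python
from statistics import mean, stdev


def summarize_fid(fids_per_seed):
    """Return (mean, sample std, CoV) of FID over several training seeds."""
    if len(fids_per_seed) < 2:
        raise ValueError("need at least two training seeds for an error bar")
    m = mean(fids_per_seed)
    s = stdev(fids_per_seed)  # sample standard deviation (n - 1 denominator)
    return m, s, s / m


def gap_is_inconclusive(fid_a, fid_b, rel_threshold=0.013):
    """Flag a gap between two FID scores as inconclusive.

    Assumption: the gap is measured relative to the average of the two
    scores and compared with the ~1.3% CoV quoted in the summary.
    """
    rel_gap = abs(fid_a - fid_b) / ((fid_a + fid_b) / 2)
    return rel_gap < rel_threshold
```

With illustrative (made-up) numbers, a call such as gap_is_inconclusive(2.00, 2.02) flags a roughly 1% gap as inconclusive, while summarize_fid applied to one FID per training seed gives the mean and error bar to report.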

Key insights

FID scores vary with training randomness (a CoV of roughly 1-2%), so a single reported number is unreliable and comparisons need an error bar over training seeds.

Principles

Method

Researchers treated FID as a random variable on a two-axis panel of training and generation seeds, directly measuring its variance across hundreds of SiT networks on ImageNet 256x256.
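
One way to separate the two sources of variance on such a panel is a one-way random-effects decomposition, sketched below. This is an assumed estimator chosen for illustration, not necessarily the one the study used; the function name and the panel layout (one row per training seed, one column per generation seed) are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def decompose_fid_variance(panel):
    """Split FID variance into training-seed and generation-seed parts.

    panel[i][j] is the FID for training seed i and generation seed j.
    """
    n_gen = len(panel[0])
    if len(panel) < 2 or n_gen < 2 or any(len(row) != n_gen for row in panel):
        raise ValueError("need a rectangular panel with >= 2 seeds per axis")

    # Generation-seed variance: spread when resampling from a fixed network,
    # averaged over the trained networks.
    gen_var = mean([variance(row) for row in panel])

    # The variance of per-network mean FID still contains gen_var / n_gen of
    # sampling noise, so subtract it to estimate the training-seed component.
    row_means = [mean(row) for row in panel]
    train_var = max(variance(row_means) - gen_var / n_gen, 0.0)

    train_std, gen_std = sqrt(train_var), sqrt(gen_var)
    ratio = train_std / gen_std if gen_std > 0 else float("inf")
    return {"train_std": train_std, "gen_std": gen_std, "ratio": ratio}
```

The "ratio" entry compares the two standard deviations; whether the 3.2x figure in the summary refers to this ratio or to another measure of shift is not stated in the source.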

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.