WFM: 3D Wavelet Flow Matching for Ultrafast Multi-Modal MRI Synthesis

2024-01-29 · Source: cs.CV updates on arXiv.org · Field: Science & Research — Health & Medical Research, Artificial Intelligence & Machine Learning · Depth: Expert, extended

Summary

WFM (Wavelet Flow Matching) is a method for ultrafast multi-modal MRI synthesis that targets the high inference cost of diffusion models. Diffusion models start from pure noise; WFM instead learns a direct flow from an informed prior, the mean of the conditioning MRI modalities in wavelet space, to the target distribution. Because the prior is already close to the target, accurate synthesis takes just 1-2 integration steps, a 250-1000x speedup over diffusion baselines such as cWDM (0.16-0.64s vs. 160s per volume). A single 82M-parameter WFM model with class conditioning synthesizes all four BraTS modalities (T1, T1c, T2, FLAIR), replacing four separate diffusion models totaling 326M parameters. On the BraTS 2024 validation set, WFM achieves 26.8 dB PSNR and 0.94 SSIM, quality competitive with diffusion models at a fraction of the inference time, which makes real-time MRI synthesis practical for clinical workflows.
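The speedup and parameter figures quoted above follow directly from the reported timings and model sizes. The short check below only reproduces that arithmetic; the variable names are illustrative and not taken from the paper or its code.

```python
# Figures as reported in the summary above.
diffusion_seconds = 160.0          # cWDM, per volume
wfm_seconds = (0.16, 0.64)         # WFM, fastest and slowest reported settings
diffusion_params_m = 326           # four separate diffusion models, in millions
wfm_params_m = 82                  # one unified WFM model, in millions

# Speedup is baseline time divided by WFM time.
speedups = [round(diffusion_seconds / s) for s in wfm_seconds]

# Parameter reduction of the unified model relative to four separate models.
param_ratio = diffusion_params_m / wfm_params_m

print(speedups, round(param_ratio, 2))
```

The unified model is therefore roughly four times smaller than the four diffusion models combined, in addition to being 250 to 1000 times faster per volume.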

Key takeaway

For computer vision engineers building medical imaging solutions, WFM shows that starting MRI synthesis from an informed prior rather than from noise dramatically reduces inference time. Consider combining flow matching, wavelet-domain processing, and a single unified architecture to reach sub-second synthesis for latency-critical clinical applications, even at the cost of a marginal drop in PSNR.

Key insights

Informed priors in wavelet space enable ultrafast, unified multi-modal MRI synthesis with flow matching.

Principles

Start synthesis from an informed prior, not noise.
Share anatomical representations across synthesis tasks.
Use the wavelet transform for efficiency: it is invertible, so it reduces spatial resolution without losing information.
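To make the lossless claim concrete, here is a minimal NumPy sketch of a one-level 3D Haar transform and its inverse. It assumes an orthonormal Haar filter and even volume dimensions; the function names (haar_split, haar_merge, dwt3d, idwt3d) are hypothetical and are not taken from the WFM repository, which may implement the transform differently.

```python
import numpy as np

def haar_split(x, axis):
    """Split one axis into orthonormal Haar low-pass and high-pass halves."""
    even = np.take(x, np.arange(0, x.shape[axis], 2), axis=axis)
    odd = np.take(x, np.arange(1, x.shape[axis], 2), axis=axis)
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def haar_merge(low, high, axis):
    """Invert haar_split by recovering and re-interleaving even/odd samples."""
    even = (low + high) / np.sqrt(2.0)
    odd = (low - high) / np.sqrt(2.0)
    stacked = np.stack([even, odd], axis=axis + 1)
    shape = list(low.shape)
    shape[axis] *= 2
    return stacked.reshape(shape)

def dwt3d(vol):
    """One-level 3D Haar transform: (D, H, W) -> (8, D/2, H/2, W/2)."""
    bands = [vol]
    for axis in range(3):
        bands = [half for b in bands for half in haar_split(b, axis)]
    return np.stack(bands)

def idwt3d(coeffs):
    """Inverse of dwt3d: (8, D/2, H/2, W/2) -> (D, H, W)."""
    bands = list(coeffs)
    for axis in (2, 1, 0):
        bands = [haar_merge(bands[i], bands[i + 1], axis)
                 for i in range(0, len(bands), 2)]
    return bands[0]

rng = np.random.default_rng(0)
vol = rng.standard_normal((8, 8, 8))
coeffs = dwt3d(vol)
recon = idwt3d(coeffs)
print(coeffs.shape, np.allclose(recon, vol))
```

The eight sub-bands each have half the resolution along every axis, so a network operating on them sees an 8x smaller spatial grid with 8x more channels, while the original volume stays exactly recoverable up to floating-point error.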

Method

WFM constructs an informed prior by averaging conditioning MRI modalities in wavelet space, then learns a direct velocity field to the target contrast using a single 82M-parameter 3D U-Net with class conditioning, integrated in 1-2 steps.
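The method description can be sketched as a training-target construction. The snippet below assumes the standard linear-interpolation form of conditional flow matching (a straight path from prior to target with a constant velocity and an MSE loss); the summary does not state which path WFM uses, so treat that choice, the array shapes, and all function names as assumptions for illustration.

```python
import numpy as np

def informed_prior(cond_coeffs):
    """Mean over conditioning modalities; input shape (M, C, D, H, W)."""
    return cond_coeffs.mean(axis=0)

def flow_matching_pair(x0, x1, t):
    """Point on the straight path from prior x0 to target x1, and its velocity.

    Assumes the linear interpolant x_t = (1 - t) * x0 + t * x1, whose
    time derivative is the constant x1 - x0.
    """
    x_t = (1.0 - t) * x0 + t * x1
    v_target = x1 - x0
    return x_t, v_target

def flow_matching_loss(v_pred, v_target):
    """Mean squared error between predicted and target velocity."""
    return float(np.mean((v_pred - v_target) ** 2))

rng = np.random.default_rng(0)
# Three conditioning modalities, each with 8 wavelet sub-bands of size 4x4x4.
cond = rng.standard_normal((3, 8, 4, 4, 4))
target = rng.standard_normal((8, 4, 4, 4))

x0 = informed_prior(cond)
x_start, v = flow_matching_pair(x0, target, t=0.0)
x_end, _ = flow_matching_pair(x0, target, t=1.0)
print(x0.shape, flow_matching_loss(v, v))
```

In an actual training loop, v_pred would come from the class-conditioned 3D U-Net evaluated at x_t, t, and the target-modality label; here the loss is only evaluated on a perfect prediction to show the interface.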

In practice

Use a 3D Haar wavelet transform for memory-efficient processing.
Employ class conditioning for unified multi-task architectures.
Consider Euler or Heun ODE solvers for 1-2 step inference.
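The solver recommendation above can be illustrated with a minimal sketch of fixed-step Euler and Heun integration from t = 0 to t = 1. The velocity callable stands in for the trained network and is hypothetical; the toy field used here is the ideal constant velocity of a straight path, which is an assumption chosen so the result can be checked by hand.

```python
import numpy as np

def euler_integrate(velocity, x0, steps):
    """Fixed-step Euler: one velocity evaluation per step."""
    x, dt = x0.copy(), 1.0 / steps
    for k in range(steps):
        x = x + dt * velocity(x, k * dt)
    return x

def heun_integrate(velocity, x0, steps):
    """Fixed-step Heun (second order): two velocity evaluations per step."""
    x, dt = x0.copy(), 1.0 / steps
    for k in range(steps):
        t = k * dt
        v1 = velocity(x, t)
        v2 = velocity(x + dt * v1, t + dt)
        x = x + 0.5 * dt * (v1 + v2)
    return x

rng = np.random.default_rng(0)
prior = rng.standard_normal((8, 4, 4, 4))    # stand-in for the informed prior
target = rng.standard_normal((8, 4, 4, 4))   # stand-in for the target coefficients

def toy_velocity(x, t):
    # Ideal straight-path field; a real model would predict this from (x, t, class).
    return target - prior

one_step_euler = euler_integrate(toy_velocity, prior, steps=1)
two_step_heun = heun_integrate(toy_velocity, prior, steps=2)
print(np.allclose(one_step_euler, target), np.allclose(two_step_heun, target))
```

When the learned field is close to a straight path, a single Euler step already lands near the target, which is why so few steps can suffice. Heun doubles the number of network evaluations per step in exchange for second-order accuracy when the field curves.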

Topics

Wavelet Flow Matching
Multi-modal MRI Synthesis
Diffusion Models
Flow Matching
Wavelet Transform

Code references

yalcintur/WFM

Best for: Computer Vision Engineer, AI Scientist, Research Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.