DASH: Dual-Branch Score Distillation for Guidance-Calibrated Compact Diffusion Models

· Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

DASH is a dual-branch distillation framework for compressing class-conditional diffusion models. Existing distillation methods often leave the unconditional score branch unsupervised. Because classifier-free guidance depends on the difference between the conditional and unconditional predictions, this leaves the guidance gap underdetermined in the student, which can produce ineffective guidance and degenerate predictions. DASH supervises both score branches independently, applies distinct branch constraints to each training sample, and adds an anchor term that regularizes conditional predictions toward the ground-truth noise. The framework also includes TIRT Transfer, which copies the teacher's per-timestep importance curriculum to the student so the student does not have to relearn it. On CIFAR-10 and CIFAR-100, DASH achieves 5.9x model compression while staying within 4 FID points of the teacher under 50-step DDIM sampling, outperforming models trained from scratch. Ablation studies attribute over 60% of the total distillation gain to unconditional supervision, with curriculum transfer and anchor regularization providing complementary benefits.

Key takeaway

Machine learning engineers compressing class-conditional diffusion models should consider DASH's dual-branch distillation framework. Current compression techniques often leave the unconditional score branch unsupervised, which compromises classifier-free guidance in the student. In the reported CIFAR-10 and CIFAR-100 experiments, supervising both score branches independently and applying TIRT Transfer gave 5.9x compression while preserving guidance fidelity and keeping quality within 4 FID points of the teacher, outperforming training from scratch.

Key insights

Leaving the unconditional score branch unsupervised during diffusion model distillation causes guidance failure; dual-branch supervision is essential for effective compression.
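
A minimal sketch of why the unconditional branch matters, assuming the common classifier-free guidance convention in which the guided prediction is the unconditional prediction plus a scaled difference between the two branches. The function name `cfg_combine`, the toy numbers, and the "collapsed" student are illustrative assumptions, not taken from the DASH paper; plain Python lists stand in for tensors.

```python
def cfg_combine(eps_cond, eps_uncond, w):
    """Classifier-free guidance under one common convention:
    guided = eps_uncond + w * (eps_cond - eps_uncond).
    With this convention w = 1 returns the conditional prediction."""
    return [u + w * (c - u) for c, u in zip(eps_cond, eps_uncond)]


# Toy noise predictions at a single timestep (hypothetical values).
teacher_cond = [0.8, -0.2, 0.5]
teacher_uncond = [0.3, 0.1, 0.4]

# A student that matches the teacher's conditional branch only. Nothing
# constrains its unconditional branch; here we assume it has collapsed
# onto the conditional one, which is one possible degenerate outcome.
student_cond = list(teacher_cond)
collapsed_uncond = list(student_cond)

w = 3.0
guided_teacher = cfg_combine(teacher_cond, teacher_uncond, w)
guided_collapsed = cfg_combine(student_cond, collapsed_uncond, w)

# The collapsed student has a zero guidance gap, so the guidance scale
# has no effect: its guided output equals its conditional output.
print(guided_collapsed == student_cond)
```

In this toy case the student reproduces the teacher's conditional prediction exactly, yet its guided output differs from the teacher's for any `w` other than 1, because the guidance direction `eps_cond - eps_uncond` was never supervised.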

Method

DASH independently supervises both score branches using distinct constraints and an anchor term, while also transferring the teacher's per-timestep importance curriculum.
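
The description above can be sketched as a single per-sample loss. This is only an illustration under stated assumptions: the summary does not give the exact form of the branch constraints, so mean squared error is used for every term, and the names `dual_branch_loss`, `lambda_uncond`, `lambda_anchor`, and `timestep_weights` are hypothetical. The transferred curriculum is modeled here as a fixed lookup table of per-timestep weights copied from the teacher, which is also an assumption.

```python
def mse(a, b):
    """Mean squared error between two equal-length lists of floats."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def dual_branch_loss(student_cond, student_uncond,
                     teacher_cond, teacher_uncond,
                     true_noise, t, timestep_weights,
                     lambda_uncond=1.0, lambda_anchor=0.1):
    """Hypothetical dual-branch distillation loss for one training sample.

    cond_term   : student conditional branch vs. teacher conditional branch
    uncond_term : student unconditional branch vs. teacher unconditional branch
    anchor_term : student conditional branch vs. ground-truth noise
    The whole sum is scaled by a per-timestep weight copied from the teacher
    (an assumed stand-in for the transferred importance curriculum).
    """
    cond_term = mse(student_cond, teacher_cond)
    uncond_term = mse(student_uncond, teacher_uncond)
    anchor_term = mse(student_cond, true_noise)
    weight = timestep_weights[t]
    return weight * (cond_term
                     + lambda_uncond * uncond_term
                     + lambda_anchor * anchor_term)


# Usage with toy two-element "predictions" (hypothetical values).
weights_from_teacher = [0.5, 2.0]  # one weight per timestep
loss = dual_branch_loss(
    student_cond=[1.0, 0.0], student_uncond=[0.0, 0.0],
    teacher_cond=[0.0, 0.0], teacher_uncond=[0.0, 2.0],
    true_noise=[1.0, 0.0], t=1, timestep_weights=weights_from_teacher,
)
print(loss)
```

Keeping the unconditional term separate from the conditional one means a student cannot reduce the loss by fitting the conditional branch alone, which is the failure mode the method is meant to avoid. How the real method balances the three terms, and whether the timestep weight applies to the anchor term, is not stated in this summary.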

Best for: Research Scientist, AI Engineer, Computer Vision Engineer, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.