TMP: Tree-structured Mixed-policy Pruning for Large-scale Image Generation and Editing
Summary
Tree-structured Mixed-policy Pruning (TMP) is a pruning framework aimed at the growing parameter and computation demands of modern image generation models. TMP generalizes across prevalent image tasks like Text-to-Image (T2I) and Image-to-Image (TI2I), and across architectures including Mixture-of-Experts (MoE) and Diffusion Transformers (DiT). In the reported experiments, TMP compresses HunyuanImage-3.0 from 80 billion to 20 billion parameters, a 75% reduction, with limited loss in quality. The pruned 20B model can run inference on a single 24 GB RTX 4090 GPU. TMP also compresses Z-Image turbo from 6 billion to 4 billion parameters (a 33% reduction) with negligible degradation.
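The reduction percentages quoted above follow directly from the before and after parameter counts. A minimal check, where the function name `reduction_percent` is chosen for illustration only:

```python
def reduction_percent(before_b: float, after_b: float) -> float:
    """Percentage of parameters removed, given counts in billions."""
    return (before_b - after_b) / before_b * 100.0

# Parameter counts (in billions) as reported in the summary.
hunyuan = reduction_percent(80, 20)
z_image = reduction_percent(6, 4)

print(f"HunyuanImage-3.0: {hunyuan:.0f}% of parameters removed")
print(f"Z-Image turbo: {z_image:.0f}% of parameters removed")
```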
Key takeaway
For AI engineers deploying large image generation models, TMP offers a viable path to significantly reduce parameter count and GPU memory footprint. The reported experiments compress HunyuanImage-3.0 by 75% (80B to 20B) and run the result on a single 24 GB RTX 4090 GPU, which makes high-fidelity models more accessible and cost-effective for production environments. TMP is also reported to work on step-distilled models, so it may be worth evaluating on models you have already distilled.
Key insights
TMP is a tree-structured mixed-policy pruning framework for compressing large image generation models across various architectures and tasks.
Principles
- Pruning can cut parameter counts substantially: 33% to 75% in the reported experiments.
- Mixed-policy pruning generalizes across MoE and DiT architectures.
- Step-distilled models can still be compressed further by pruning.
Method
TMP prunes large image generation models with a tree-structured, mixed-policy scheme. It generalizes across T2I and TI2I tasks and across MoE and DiT architectures, and it also applies to step-distilled models.
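This summary gives no algorithmic detail for TMP, so the sketch below is not TMP itself. It only illustrates one generic pruning policy that can be applied to an MoE layer: keep the experts the router selects most often and drop the rest. The function `prune_experts` and the toy routing log are hypothetical, made up for this example.

```python
from collections import Counter

def prune_experts(routing_log, num_experts, keep):
    """Return the sorted ids of the `keep` most frequently routed experts.

    Generic usage-based expert pruning; NOT the TMP algorithm.
    """
    counts = Counter(routing_log)
    # Experts that never appear in the log get a count of 0.
    # Ties are broken in favour of the lower expert id.
    ranked = sorted(range(num_experts), key=lambda e: (-counts[e], e))
    return sorted(ranked[:keep])

# Toy log: which expert the router picked for each of 12 tokens.
routing_log = [0, 2, 2, 3, 2, 0, 1, 2, 3, 3, 0, 2]
kept = prune_experts(routing_log, num_experts=4, keep=2)
print("experts kept:", kept)
```

A real pipeline would gather routing statistics on calibration data and would typically fine-tune after pruning; both steps are omitted here.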
In practice
- Compress HunyuanImage-3.0 from 80B to 20B parameters.
- Run the pruned 20B model on a single 24 GB RTX 4090 GPU.
- Reduce Z-Image turbo from 6B to 4B parameters.
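To put the single-GPU figure in context, a back-of-the-envelope estimate of weight memory for a 20B-parameter model can help. The summary does not state which numeric precision or offloading strategy was used, so the precisions below are assumptions, and the estimate covers weights only (no activations, text encoder, or VAE).

```python
# Assumed storage cost per parameter for common weight formats.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_gb(params_billion: float, precision: str) -> float:
    """Weight-only footprint in GB, taking 1 GB as 1e9 bytes."""
    return params_billion * BYTES_PER_PARAM[precision]

def fits(params_billion: float, precision: str, gpu_gb: float) -> bool:
    """True if the weights alone are smaller than the GPU memory."""
    return weight_gb(params_billion, precision) < gpu_gb

estimates = {p: weight_gb(20, p) for p in BYTES_PER_PARAM}
for precision, gb in estimates.items():
    verdict = "fits" if fits(20, precision, 24) else "does not fit"
    print(f"20B @ {precision}: {gb:.0f} GB of weights, {verdict} in 24 GB")
```

Under these assumptions, 16-bit weights alone would exceed 24 GB, so the reported setup presumably relies on lower-precision weights or some form of offloading that the summary does not describe.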
Topics
- Image Generation
- Model Pruning
- Diffusion Transformers
- Mixture-of-Experts
- HunyuanImage-3.0
- GPU Optimization
Best for: Computer Vision Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.