Domain-Adaptive Model Merging Across Disconnected Modes
Summary
DMM (Domain-Adaptive Model Merging) is a data-free framework for merging highly divergent machine learning models when data privacy and heterogeneity rule out centralized training. It works in three steps: train domain-specific models independently, merge the similar models with standard techniques, then synthesize pseudo-data from normalization statistics and use it to distill knowledge from the more divergent models into the merged model. This lightweight refinement, guided by synthetic samples, preserves rare but critical knowledge while keeping the merged model stable. Experiments on unimodal (CIFAR-10, CIFAR-100) and multimodal (CrisisMMD) benchmarks show that DMM achieves state-of-the-art performance, particularly in highly non-IID settings (e.g., $\alpha=0.01$), outperforming federated learning and model merging baselines such as FedAvg, Cat-Merge, and Git Re-Basin.
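To give a feel for how skewed a setting like $\alpha=0.01$ is, here is a minimal sketch of a label partition across clients. It assumes $\alpha$ is the concentration parameter of a Dirichlet distribution over clients, which is the common convention in federated learning benchmarks; the summary does not state this explicitly. The function name `dirichlet_partition` and its parameters are hypothetical, chosen for illustration only.

```python
import random

def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients, class by class.

    Assumption: alpha is a Dirichlet concentration parameter. Small alpha
    tends to push each class onto very few clients (highly non-IID).
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    clients = [[] for _ in range(num_clients)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        # A Dirichlet sample is a vector of Gamma draws divided by their sum.
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(draws)
        if total == 0.0:
            # Tiny alpha can underflow every draw to 0.0; fall back to
            # giving the whole class to one randomly chosen client.
            draws = [0.0] * num_clients
            draws[rng.randrange(num_clients)] = 1.0
            total = 1.0
        # Turn the proportions into contiguous slices of the shuffled indices.
        start, cumulative = 0, 0.0
        for c in range(num_clients):
            cumulative += draws[c] / total
            if c == num_clients - 1:
                end = len(idxs)  # last client takes any remainder
            else:
                end = min(len(idxs), max(start, int(round(cumulative * len(idxs)))))
            clients[c].extend(idxs[start:end])
            start = end
    return clients

# Toy usage: 1000 samples, 10 classes, 5 clients.
labels = [i % 10 for i in range(1000)]
parts = dirichlet_partition(labels, num_clients=5, alpha=0.01)
```

Under this assumption, each client typically ends up holding only a few classes at small $\alpha$, which is the kind of divergence between domain models that the framework targets.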
Key takeaway
For machine learning engineers building unified models in privacy-sensitive or data-fragmented environments, DMM offers a way to consolidate specialized models without access to the original training data: it synthesizes pseudo-data and uses it to distill knowledge from the divergent models into the merged one. This approach significantly improves performance in heterogeneous settings, so you can build more general and stable models while adhering to data privacy constraints.
Key insights
DMM merges divergent models without access to the original training data by synthesizing pseudo-data and distilling knowledge, outperforming existing federated learning and model merging methods.
Principles
- Preserve rare knowledge from divergent models.
- Utilize normalization statistics for data-free pseudo-data generation.
- Address data heterogeneity without sharing original data.
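As a rough illustration of the second principle, the sketch below draws pseudo-features from stored per-channel normalization statistics and measures how far a batch drifts from them. This is a deliberately simplified stand-in, not the paper's procedure: it assumes each channel can be approximated by an independent Gaussian with the stored running mean and variance, whereas data-free methods in the literature often optimize inputs so that a network's activations match these statistics. The names `sample_pseudo_features` and `stat_gap` are hypothetical.

```python
import random

def sample_pseudo_features(running_mean, running_var, num_samples, seed=0):
    """Draw pseudo-features channel by channel.

    Assumption: each channel is an independent Gaussian described by the
    stored running mean and variance of a normalization layer.
    """
    rng = random.Random(seed)
    return [
        [rng.gauss(m, v ** 0.5) for m, v in zip(running_mean, running_var)]
        for _ in range(num_samples)
    ]

def stat_gap(batch, running_mean, running_var):
    """Squared distance between batch statistics and stored statistics.

    Uses the biased (divide-by-n) variance. A value near zero means the
    batch is consistent with the stored normalization statistics.
    """
    n = len(batch)
    gap = 0.0
    for c, (m, v) in enumerate(zip(running_mean, running_var)):
        column = [row[c] for row in batch]
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / n
        gap += (mean - m) ** 2 + (var - v) ** 2
    return gap

# Toy usage with made-up statistics for three channels.
pseudo = sample_pseudo_features([0.0, 1.0, -0.5], [1.0, 0.25, 4.0], num_samples=256)
gap = stat_gap(pseudo, [0.0, 1.0, -0.5], [1.0, 0.25, 4.0])
```

No original samples are involved: only the per-channel statistics leave the domain model, which is what makes the generation step data-free.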
Method
DMM trains models independently, merges similar ones, then synthesizes pseudo-data from normalization statistics to distill knowledge from divergent models into the merged model via lightweight fine-tuning.
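The flow described above can be sketched on toy linear models, where each model is just a weight vector. Everything here is an assumption made for illustration: cosine similarity to the element-wise mean as the similarity test, plain averaging as the "standard" merge, a squared-error output match as the distillation loss, and all function names. The summary does not specify which of these choices the paper makes.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0

def average(models):
    """Element-wise mean of weight vectors (a stand-in for a standard merge)."""
    return [sum(ws) / len(models) for ws in zip(*models)]

def split_by_similarity(models, threshold):
    """Group models by cosine similarity to the element-wise mean (assumed criterion)."""
    center = average(models)
    similar, divergent = [], []
    for m in models:
        (similar if cosine(m, center) >= threshold else divergent).append(m)
    return similar, divergent

def output_gap(student, teacher, pseudo_batch):
    """Mean squared difference between student and teacher outputs."""
    return sum((dot(student, x) - dot(teacher, x)) ** 2 for x in pseudo_batch) / len(pseudo_batch)

def distill_step(student, teacher, pseudo_batch, lr):
    """One gradient-descent step on output_gap with respect to the student weights."""
    grad = [0.0] * len(student)
    for x in pseudo_batch:
        err = dot(student, x) - dot(teacher, x)
        for i, xi in enumerate(x):
            grad[i] += 2.0 * err * xi / len(pseudo_batch)
    return [w - lr * g for w, g in zip(student, grad)]

# Step 1 (assumed already done): independently trained domain models.
domain_models = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.5]]

# Step 2: merge the models that agree with each other.
similar, divergent = split_by_similarity(domain_models, threshold=0.5)
merged = average(similar)

# Step 3: refine the merged model toward each divergent model on pseudo-data.
pseudo_batch = [[1.0, 0.0], [0.0, 1.0]]  # placeholder synthetic inputs
refined = merged
for teacher in divergent:
    refined = distill_step(refined, teacher, pseudo_batch, lr=0.1)
```

Keeping the learning rate small and the number of steps low is one way to read "lightweight": the refined weights move toward the divergent model's behavior without drifting far from the merged starting point.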
In practice
- Apply DMM for privacy-sensitive model consolidation.
- Use DMM to merge models from highly heterogeneous domains.
- Integrate DMM with existing federated learning pipelines.
Topics
- DMM Framework
- Model Merging
- Data-Free Knowledge Distillation
- Domain Adaptation
- Normalization Statistics
Best for: AI Scientist, Machine Learning Engineer, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.