Domain-Adaptive Model Merging Across Disconnected Modes

Source: cs.AI updates on arXiv.org · Field: Technology & Digital, Artificial Intelligence & Machine Learning · Depth: Expert, long

Summary

DMM (Domain-Adaptive Model Merging) is a new data-free framework designed to merge highly divergent machine learning models, addressing challenges like data privacy and heterogeneity that prevent centralized training. The framework operates in three steps: independently training domain-specific models, merging similar models using standard techniques, and then synthesizing pseudo-data from normalization statistics to distill knowledge from the more divergent models into the merged model. This lightweight refinement process, guided by synthetic samples, preserves rare but critical knowledge while maintaining stability. Extensive experiments on unimodal (CIFAR-10, CIFAR-100) and multimodal (CrisisMMD) benchmarks demonstrate that DMM achieves state-of-the-art performance, particularly in highly Non-IID settings (e.g., $\alpha=0.01$), outperforming existing federated learning and model merging methods like FedAvg, Cat-Merge, and Git Re-Basin.

Key takeaway

For machine learning engineers building unified models in privacy-sensitive or data-fragmented environments, DMM merges highly divergent models without access to the original training data. It synthesizes pseudo-data from normalization statistics and uses it to distill knowledge from the divergent models into the merged one. The reported gains are largest in highly heterogeneous (Non-IID) settings, so specialized models can be consolidated into a more general and stable model without pooling the raw data.

Key insights

DMM merges divergent models without access to training data by synthesizing pseudo-data and distilling knowledge, and it is reported to outperform existing methods such as FedAvg, Cat-Merge, and Git Re-Basin.

Principles

Method

DMM trains models independently, merges similar ones, then synthesizes pseudo-data from normalization statistics to distill knowledge from divergent models into the merged model via lightweight fine-tuning.
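The three steps can be illustrated with a toy sketch. Everything in it is an assumption made for illustration, not the paper's implementation: models are small dictionaries holding per-feature normalization statistics and a linear head, "standard merging" is plain weight averaging, pseudo-data is drawn as Gaussian samples from the divergent model's stored mean and variance, and the refinement is a few gradient steps on a squared-error distillation loss. The helper names (`merge_average`, `synthesize`, `refine`) are hypothetical.

```python
import random

def normalize(model, x):
    # Standardize each feature with the model's stored statistics.
    return [(xi - m) / (v + 1e-5) ** 0.5
            for xi, m, v in zip(x, model["mean"], model["var"])]

def predict(model, x):
    # Normalization followed by a linear head.
    z = normalize(model, x)
    return sum(w * zi for w, zi in zip(model["w"], z)) + model["b"]

def merge_average(models):
    # Step 2 stand-in: element-wise averaging of similar models.
    n = len(models)
    merged = {k: [sum(m[k][i] for m in models) / n
                  for i in range(len(models[0][k]))]
              for k in ("mean", "var", "w")}
    merged["b"] = sum(m["b"] for m in models) / n
    return merged

def synthesize(model, n, rng):
    # Step 3a stand-in: sample pseudo-inputs from the normalization statistics.
    return [[rng.gauss(m, v ** 0.5) for m, v in zip(model["mean"], model["var"])]
            for _ in range(n)]

def distill_loss(student, teacher, xs):
    # Mean squared gap between student and teacher outputs on pseudo-data.
    return sum((predict(student, x) - predict(teacher, x)) ** 2 for x in xs) / len(xs)

def refine(student, teacher, xs, lr=0.01, steps=50):
    # Step 3b stand-in: a few gradient steps on the linear head only.
    s = {"mean": list(student["mean"]), "var": list(student["var"]),
         "w": list(student["w"]), "b": student["b"]}
    for _ in range(steps):
        gw, gb = [0.0] * len(s["w"]), 0.0
        for x in xs:
            err = predict(s, x) - predict(teacher, x)
            z = normalize(s, x)
            gw = [g + 2 * err * zi / len(xs) for g, zi in zip(gw, z)]
            gb += 2 * err / len(xs)
        s["w"] = [w - lr * g for w, g in zip(s["w"], gw)]
        s["b"] -= lr * gb
    return s

rng = random.Random(0)
# Step 1 stand-in: independently trained models (values made up for the demo).
similar = [
    {"mean": [0.0, 0.1], "var": [1.0, 1.1], "w": [0.5, -0.2], "b": 0.0},
    {"mean": [0.1, 0.0], "var": [0.9, 1.0], "w": [0.6, -0.1], "b": 0.1},
]
divergent = {"mean": [2.0, -1.5], "var": [0.5, 2.0], "w": [-1.0, 0.8], "b": 0.5}

merged = merge_average(similar)
pseudo = synthesize(divergent, 64, rng)
before = distill_loss(merged, divergent, pseudo)
refined = refine(merged, divergent, pseudo)
after = distill_loss(refined, divergent, pseudo)
print(f"distillation loss on pseudo-data: {before:.4f} -> {after:.4f}")
```

The sketch only shows that the merged model moves toward the divergent one on the synthetic samples. It does not model the stability side of the method, namely how the refinement avoids overwriting what the merged model already knows, which the summary attributes to keeping the refinement lightweight.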

Best for: AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.