Closed-Form Spectral Regularization for Multi-Task Model Merging

Source: Computer Vision and Pattern Recognition · Field: Technology & Digital, Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

This paper addresses multi-task model merging, which combines independently fine-tuned expert models into a single multi-task model without training data, reducing storage and serving costs. Current state-of-the-art merging methods frame the problem as layer-wise quadratic interference minimization, yet they rely on hundreds of gradient descent iterations even though the problem admits an exact closed-form solution. The authors argue that these iterative solvers act less as optimizers than as implicit spectral regularizers of an ill-posed normal equation. They propose SWUDI, a closed-form spectral filtering estimator that combines a soft exponential filter with a hard top-K truncation to suppress noise. Its adaptive variant, SWUDI-A, adds per-layer rank rules for robustness. Both SWUDI and SWUDI-A require only a single symmetric eigendecomposition per linear layer and no training data. The spectral solvers match or outperform state-of-the-art methods across four general benchmarks and a multimodal benchmark (VQA, Geometry, Chart, OCR, Grounding, modality merging), while reducing wall-clock time by 28-72x and peak GPU memory by up to 50%.

Key takeaway

For machine learning engineers deploying large foundation models, the practical implication is that closed-form spectral solvers such as SWUDI or SWUDI-A can replace iterative gradient descent in multi-task model merging. The paper reports 28-72x lower merging wall-clock time and up to 50% lower peak GPU memory, with accuracy that matches or exceeds state-of-the-art methods on the evaluated benchmarks.

Key insights

Iterative model merging acts as spectral regularization, which can be replaced by efficient closed-form spectral filtering.
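
The link between an iterative solver and a spectral filter can be checked numerically on a toy quadratic. The sketch below is not taken from the paper: it assumes a generic objective 0.5 * w^T G w - b^T w with a symmetric positive semi-definite G, and the function names gd_solution and gd_spectral_filter are hypothetical. It relies only on the standard result that gradient descent started from zero and run for t steps with step size s scales each eigen-direction of G by (1 - (1 - s*lambda)^t) / lambda.

```python
import numpy as np


def gd_solution(G, b, step, iters):
    """Plain gradient descent on 0.5 * w^T G w - b^T w, starting from zero."""
    w = np.zeros_like(b)
    for _ in range(iters):
        w = w - step * (G @ w - b)  # gradient of the quadratic is G w - b
    return w


def gd_spectral_filter(G, b, step, iters):
    """The same iterate written in closed form with one eigendecomposition."""
    lam, V = np.linalg.eigh(G)  # G = V diag(lam) V^T
    nonzero = lam > 1e-12
    safe = np.where(nonzero, lam, 1.0)  # avoid dividing by ~0
    # Per-eigenvalue filter; its limit as lam -> 0 is step * iters.
    filt = np.where(
        nonzero,
        (1.0 - (1.0 - step * lam) ** iters) / safe,
        step * iters,
    )
    return V @ (filt * (V.T @ b))
```

Calling both functions with the same G, b, step and iteration count should give the same vector up to floating-point error. With few iterations the filter damps the small-eigenvalue directions, which is the regularizing effect; with many iterations it approaches the unregularized inverse 1 / lambda.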

Method

Formalize merging as a noisy linear inverse problem, then apply a spectral filtering estimator (SWUDI or SWUDI-A) that needs a single symmetric eigendecomposition per linear layer.
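
A minimal sketch of what such an estimator could look like, under stated assumptions, follows. This is not the authors' implementation, and the summary does not give their exact formulas. The sketch assumes the layer-wise normal equation has the generic form W @ G = B with G symmetric positive semi-definite, takes the soft exponential filter to be (1 - exp(-tau * lambda)) / lambda, and applies the hard truncation by keeping the top_k largest eigenvalues. The function name spectral_filter_solve and the parameters tau and top_k are hypothetical.

```python
import numpy as np


def spectral_filter_solve(G, B, tau, top_k):
    """Filtered solution of W @ G = B (illustrative, assumed problem form).

    G: (d, d) symmetric PSD matrix; B: (m, d) right-hand side.
    tau: strength of the soft exponential filter (assumed parameterization).
    top_k: number of leading eigen-directions kept by the hard truncation.
    """
    lam, V = np.linalg.eigh(G)        # single symmetric eigendecomposition
    lam = np.clip(lam, 0.0, None)     # remove tiny negative round-off values
    keep = np.argsort(lam)[::-1][:top_k]  # indices of the top_k eigenvalues
    lam_k, V_k = lam[keep], V[:, keep]

    nonzero = lam_k > 1e-12
    safe = np.where(nonzero, lam_k, 1.0)
    # Soft filter (1 - exp(-tau * lam)) / lam; its limit as lam -> 0 is tau.
    filt = np.where(nonzero, -np.expm1(-tau * lam_k) / safe, tau)

    # Project B onto the kept directions, rescale, and map back.
    return ((B @ V_k) * filt) @ V_k.T
```

With a large tau and top_k equal to the full dimension, the result approaches the plain inverse solution B @ inv(G) for a well-conditioned G. Smaller values of tau or top_k suppress the low-eigenvalue directions where noise is amplified most.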

Best for: MLOps Engineer, AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.