Learning to Recover Task Experts from a Multi-Task Merged Model
Summary
The ReTeX (Recover Task eXpert) framework addresses parameter interference in multi-task model merging, where consolidating task-specific experts into one model degrades each expert's performance. ReTeX models this interference as a perturbation of the expert's parameters, approximates it as an additive offset, and predicts that offset to recover over 95% of individual-expert performance from a single merged checkpoint across vision and NLP domains. When the task identity is unknown, a router-free task identifier selects the expert: SVD subspace signatures are computed offline, and at inference the task whose subspace gives the smallest projection residual is chosen. ReTeX also shows emergent adaptive interpolation of expert knowledge, which improves generalization to unseen and out-of-distribution tasks.
Key takeaway
For Machine Learning Engineers developing multi-task models, ReTeX offers a way to mitigate parameter interference after merging and to improve generalization. Consider combining its offset prediction with SVD-based task identification to recover individual expert performance from a single merged checkpoint and to improve adaptability to unseen or out-of-distribution tasks.
Key insights
ReTeX recovers individual task-expert performance from merged multi-task models by predicting and undoing parameter interference.
Principles
- Parameter interference can be modeled as affine transformations.
- Affine transformations can be approximated as additive offsets.
- SVD subspace signatures enable router-free task identification.
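One way to read the first two principles is sketched below. The notation is an assumption made for illustration and does not come from the source: $\theta_t$ is the parameter vector of expert $t$, $\theta_m$ the merged parameters, and $A_t$, $b_t$ a hypothetical affine map describing how merging perturbs expert $t$.

```latex
% Assumed notation, not taken from the source.
% Interference modeled as an affine map from expert to merged parameters:
\theta_m \approx A_t\,\theta_t + b_t
% If A_t is close to the identity, the map reduces to a shift, so the expert
% is recovered from the merged model by an additive offset \delta_t = -b_t:
\theta_t \approx \theta_m + \delta_t
```

Under this reading, recovering an expert amounts to predicting $\delta_t$ for the task at hand.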
Method
ReTeX predicts additive offsets to reverse parameter perturbations in merged models. It uses SVD subspace signatures, computed offline, to identify tasks at inference by selecting the subspace with the smallest projection residual.
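The task-identification step can be illustrated with a minimal NumPy sketch. It assumes that each signature is an orthonormal basis obtained from the SVD of a matrix of per-task feature vectors. The source does not say what the SVD is computed on, so that choice, the synthetic data, and the helper names `fit_signature`, `projection_residual`, and `identify_task` are all hypothetical.

```python
import numpy as np

def fit_signature(features, rank):
    """Offline step: keep the top-`rank` right singular vectors of a task's
    feature matrix (n_samples x dim) as an orthonormal basis (dim x rank)."""
    _, _, vt = np.linalg.svd(features, full_matrices=False)
    return vt[:rank].T

def projection_residual(x, basis):
    """Norm of the part of x that lies outside the subspace spanned by basis."""
    projection = basis @ (basis.T @ x)
    return float(np.linalg.norm(x - projection))

def identify_task(x, signatures):
    """Inference step: pick the task whose subspace leaves the smallest residual."""
    residuals = {name: projection_residual(x, b) for name, b in signatures.items()}
    return min(residuals, key=residuals.get)

# Synthetic data (an assumption): each task's features lie near its own
# random low-dimensional subspace, plus a little noise.
rng = np.random.default_rng(0)
dim, rank = 16, 3
true_bases = {
    name: np.linalg.qr(rng.normal(size=(dim, rank)))[0]
    for name in ("task_a", "task_b")
}

def sample(name, n):
    coeffs = rng.normal(size=(n, rank))
    return coeffs @ true_bases[name].T + 0.01 * rng.normal(size=(n, dim))

signatures = {name: fit_signature(sample(name, 200), rank) for name in true_bases}
query = sample("task_b", 1)[0]
predicted = identify_task(query, signatures)
print(predicted)
```

No router is trained here: the only learned objects are the per-task bases, which are computed once offline and reused for every query.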
In practice
- Recover individual expert performance from merged models.
- Improve generalization to unseen and out-of-distribution (OOD) tasks.
- Implement router-free task identification using SVD.
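To make the first item concrete, the toy sketch below shows what an additive offset has to achieve. It is not ReTeX's predictor: the offset is computed exactly from a known expert (an "oracle"), the merge is a plain parameter average, and all names (`oracle_offset`, `apply_offset`, the parameter keys) are hypothetical. In the framework described above, the offset would be predicted rather than read off a stored expert.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pretrained parameters and two experts fine-tuned from them.
pretrained = {"w": rng.normal(size=(4, 4)), "b": rng.normal(size=4)}
experts = {
    task: {k: v + 0.1 * rng.normal(size=v.shape) for k, v in pretrained.items()}
    for task in ("task_a", "task_b")
}

# Assumed merge rule: simple parameter averaging across experts.
merged = {
    k: np.mean([experts[task][k] for task in experts], axis=0)
    for k in pretrained
}

def oracle_offset(task):
    """Exact offset from merged to expert parameters. A stand-in for the
    predicted offset, used only to show the target of the prediction."""
    return {k: experts[task][k] - merged[k] for k in merged}

def apply_offset(params, offset):
    """Recover task-specific parameters by adding the offset to the merged ones."""
    return {k: params[k] + offset[k] for k in params}

recovered = apply_offset(merged, oracle_offset("task_a"))

# Gap between merged and expert parameters before and after applying the offset.
gap_before = max(np.abs(merged[k] - experts["task_a"][k]).max() for k in merged)
gap_after = max(np.abs(recovered[k] - experts["task_a"][k]).max() for k in merged)
print(gap_before, gap_after)
```

With an oracle offset the recovery is exact up to floating-point error. A predicted offset would only approximate it, which is consistent with the source reporting recovery of over 95% rather than all of the expert performance.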
Topics
- Multi-task Learning
- Model Merging
- Parameter Interference
- Task Experts
- ReTeX Framework
- SVD Subspace Signatures
- Generalization
Best for: Research Scientist, AI Engineer, NLP Engineer, AI Scientist, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.