Learning to Recover Task Experts from a Multi-Task Merged Model

2026-06-25 · Source: Takara TLDR - Daily AI Papers · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, medium

Summary

The ReTeX (Recover Task eXpert) framework addresses parameter interference in multi-task model merging: consolidating task-specific experts into one unified model tends to degrade performance on the individual tasks. Unlike dynamic merging methods, which incur high storage and loading costs for redundant components, ReTeX operates from a single merged checkpoint. It models parameter interference as additive offsets that result from affine transformations during merging, then predicts these offsets to restore task-expert performance. A router-free task identifier, based on SVD subspace signatures computed offline, selects the appropriate expert at inference by choosing the task with the smallest projection residual for a given input. The approach is reported to recover over 95% of individual-expert performance across both vision and NLP domains and to improve generalization to unseen tasks, with adaptive interpolation between experts emerging in out-of-distribution scenarios.

Key takeaway

If you are a machine learning engineer whose merged multi-task checkpoint underperforms its individual experts because of parameter interference, ReTeX offers a way to recover task-expert performance. It is reported to restore over 95% of original expert performance from a single merged model, avoiding the storage overhead of dynamic merging. Consider integrating its offset prediction and SVD-based task identification to improve generalization to unseen tasks and to handle out-of-distribution inputs adaptively.

Key insights

ReTeX recovers individual task expert performance from a single merged model by predicting and undoing parameter interference via additive offsets.

Principles

Parameter interference in merged models can be modeled as affine transformations.
Additive offsets can approximate parameter perturbations for expert recovery.
SVD subspace signatures enable router-free task identification.
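
One way to read these three principles side by side is sketched below. The notation is assumed here for illustration and is not taken from the paper: \theta_m stands for the merged weights, \theta_t for the weights of expert t, A_t and b_t for a hypothetical affine map that merging applies to expert t, \hat{\delta}_t for the predicted offset, and U_t for an orthonormal basis of task t's subspace signature.

```latex
% Assumed notation, for illustration only (not the paper's own formulation).
\begin{align}
\theta_m &= A_t\,\theta_t + b_t
  && \text{merging seen as an affine map of expert } t \\
\delta_t &= \theta_t - \theta_m = (I - A_t)\,\theta_t - b_t
  && \text{the interference, written as an additive offset} \\
\hat{\theta}_t &= \theta_m + \hat{\delta}_t
  && \text{recovered expert from a predicted offset} \\
\hat{t}(x) &= \arg\min_t \bigl\lVert x - U_t U_t^{\top} x \bigr\rVert_2
  && \text{task chosen by smallest projection residual}
\end{align}
```

Under this reading, recovery is only as good as the offset prediction: when \hat{\delta}_t equals \delta_t, the second and third lines give \hat{\theta}_t = \theta_t exactly.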

Method

ReTeX predicts additive offsets to reverse the parameter perturbations introduced by merging. A task identifier built from SVD subspace signatures, which are computed offline, selects the task expert at inference by choosing the task whose subspace yields the smallest projection residual for the input.
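
The two steps described above can be sketched in NumPy on synthetic data. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the use of raw feature vectors as inputs, the fixed rank, and the stored offset (standing in for the offset that ReTeX would predict) are all hypothetical.

```python
import numpy as np

def subspace_signature(features, rank):
    """Offline step: orthonormal basis (dim x rank) spanning the top-`rank`
    right singular directions of a task's feature matrix (n_samples x dim)."""
    _, _, vt = np.linalg.svd(features, full_matrices=False)
    return vt[:rank].T

def projection_residual(x, basis):
    """Norm of the component of x that lies outside span(basis)."""
    return float(np.linalg.norm(x - basis @ (basis.T @ x)))

def identify_task(x, signatures):
    """Router-free selection: the task whose subspace leaves the smallest residual."""
    return min(signatures, key=lambda name: projection_residual(x, signatures[name]))

def recover_expert(merged, offset):
    """Add a per-parameter offset to the merged weights."""
    return {name: merged[name] + offset[name] for name in merged}

# Synthetic setup (assumption): each task's features live in its own
# 4-dimensional coordinate block of a 16-dimensional space, plus small noise.
rng = np.random.default_rng(0)
dim, rank, n_samples = 16, 4, 50

def fake_features(active):
    feats = 0.01 * rng.normal(size=(n_samples, dim))
    feats[:, active] += rng.normal(size=(n_samples, rank))
    return feats

signatures = {
    "task_a": subspace_signature(fake_features(slice(0, 4)), rank),
    "task_b": subspace_signature(fake_features(slice(8, 12)), rank),
}

query_a = np.zeros(dim)
query_a[0:4] = [1.0, -2.0, 0.5, 3.0]
query_b = np.zeros(dim)
query_b[8:12] = [2.0, 1.0, -1.5, 0.5]

# Stand-in for the learned offset predictor (assumption): the true offset
# between a toy expert and a toy merged model is simply stored per task.
expert_a = {"w": rng.normal(size=(3, 3))}
offset_a = {"w": 0.1 * rng.normal(size=(3, 3))}
merged = {"w": expert_a["w"] - offset_a["w"]}
offsets = {"task_a": offset_a}

chosen = identify_task(query_a, signatures)
recovered = recover_expert(merged, offsets[chosen])
```

In this toy setting the identifier needs only the stored bases, so nothing beyond the merged weights and the small signature matrices has to be loaded at inference. How ReTeX actually extracts the features and parameterizes the offset predictor is not specified in this summary.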

In practice

Recover >95% individual expert performance.
Improve generalization to unseen tasks.
Adaptively interpolate knowledge for OOD tasks.

Topics

Multi-task Learning
Model Merging
Parameter Interference
Task Experts
SVD Subspace Signatures
Out-of-Distribution Generalization

Best for: Research Scientist, AI Engineer, NLP Engineer, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.