MIT's new fine-tuning method lets LLMs learn new skills without losing old ones

2026-02-11 · Source: VentureBeat · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Intermediate, medium

Summary

Researchers from MIT, the Improbable AI Lab, and ETH Zurich have developed a fine-tuning method called Self-Distillation Fine-Tuning (SDFT) that lets large language models (LLMs) acquire new skills and knowledge without catastrophic forgetting of earlier capabilities. Traditional supervised fine-tuning (SFT) often causes performance to regress on older tasks, while reinforcement learning (RL) depends on reward functions that are hard to define for complex enterprise scenarios and is poorly suited to injecting entirely new information. SDFT addresses both limitations by using the LLM's own in-context learning abilities to create an on-policy learning loop, in which a frozen "teacher" model provides feedback to a "student" version. In experiments with the Qwen 2.5 model, SDFT outperformed SFT on science Q&A (70.2% accuracy vs. 66.2%), preserved the model's original knowledge, and succeeded at sequential learning across science, tool use, and medical reasoning tasks, offering a path to consolidating multiple skills into a single model.

Key takeaway

For AI scientists and NLP engineers building adaptive enterprise LLMs, SDFT offers a practical way to avoid catastrophic forgetting. Models can be fine-tuned sequentially on new, proprietary knowledge and skills without degrading existing capabilities, potentially reducing the need for "model zoos" and lowering inference costs. Consider SDFT, which is available on GitHub with integration into Hugging Face's TRL library in progress, especially for models with strong in-context learning (e.g., Qwen 3 models with 4B or more parameters) and for tasks where defining RL reward functions is impractical.

Key insights

SDFT enables LLMs to learn new skills continually without forgetting old ones, using self-distillation and in-context learning.

Principles

On-policy learning prevents catastrophic forgetting.
In-context learning can create self-supervision.
Single models can accumulate diverse skills.

Method

SDFT gives a frozen "teacher" version of the LLM expert demonstrations in context and uses it to guide a trainable "student" version. Distilling the teacher's feedback into the student creates an on-policy learning loop that needs no explicit reward function.
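
The teacher-student loop can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a three-token vocabulary, simulates in-context conditioning by adding a hypothetical demo_bias to the base model's logits, and assumes the student is trained by minimizing KL(student || teacher) computed exactly rather than from sampled generations. All function and variable names are invented for illustration.

```python
import math

def softmax(logits):
    """Convert logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reverse_kl(student_probs, teacher_probs):
    """KL(student || teacher): an expectation under the student's own distribution."""
    return sum(p * (math.log(p) - math.log(q))
               for p, q in zip(student_probs, teacher_probs))

def distill(student_logits, teacher_probs, steps=500, lr=1.0):
    """Gradient descent on the student's logits; the teacher stays frozen."""
    z = list(student_logits)
    for _ in range(steps):
        p = softmax(z)
        kl = reverse_kl(p, teacher_probs)
        # Exact gradient of KL(p || q) with respect to logit j:
        # p_j * ((log p_j - log q_j) - KL)
        grad = [pj * ((math.log(pj) - math.log(qj)) - kl)
                for pj, qj in zip(p, teacher_probs)]
        z = [zj - lr * gj for zj, gj in zip(z, grad)]
    return z

# Assumption: the base model is indifferent between the three tokens.
base_logits = [0.0, 0.0, 0.0]
# Assumption: seeing an expert demonstration in context shifts the logits.
demo_bias = [2.0, 0.0, -1.0]

# Frozen teacher = base model conditioned on the demonstration.
teacher_probs = softmax([b + d for b, d in zip(base_logits, demo_bias)])

# Student = same base model without the demonstration, then trained.
initial_kl = reverse_kl(softmax(base_logits), teacher_probs)
student_logits = distill(base_logits, teacher_probs)
final_kl = reverse_kl(softmax(student_logits), teacher_probs)

print(f"KL before distillation: {initial_kl:.4f}")
print(f"KL after distillation:  {final_kl:.6f}")
```

After training, the student reproduces the demonstration-conditioned behavior without the demonstration in its prompt, and no reward function appears anywhere in the loop. A real LLM would estimate the same objective from the student's own sampled generations, token by token.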

In practice

Consolidate multiple LLM skills into one model.
Reduce inference costs by hosting fewer models.
Apply to domains lacking clear reward functions.

Topics

Self-Distillation Fine-Tuning
Continual Learning
Large Language Models
Catastrophic Forgetting
In-Context Learning

Best for: AI Scientist, Research Scientist, NLP Engineer, Machine Learning Engineer, AI Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by VentureBeat.