MIT's new fine-tuning method lets LLMs learn new skills without losing old ones

· Source: VentureBeat · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Intermediate, medium

Summary

Researchers from MIT, the Improbable AI Lab, and ETH Zurich have developed Self-Distillation Fine-Tuning (SDFT), a fine-tuning method that lets large language models (LLMs) acquire new skills and knowledge without catastrophic forgetting of earlier capabilities. Traditional supervised fine-tuning (SFT) often causes performance to regress on older tasks, while reinforcement learning (RL) depends on reward functions that are hard to define for complex enterprise scenarios and struggles to inject entirely new information. SDFT addresses both limitations by using the model's own in-context learning ability to create an on-policy learning loop, in which a frozen "teacher" version of the model, given expert demonstrations, provides feedback to a "student" version. In experiments with the Qwen 2.5 model, SDFT outperformed SFT on science Q&A (70.2% accuracy vs. 66.2%), preserved the model's original knowledge, and supported sequential learning across science, tool use, and medical reasoning tasks, pointing to a way to consolidate multiple skills into a single model.

Key takeaway

For AI scientists and NLP engineers building adaptive enterprise LLMs, SDFT offers a way to mitigate catastrophic forgetting. Models can be fine-tuned sequentially on new, proprietary knowledge and skills without degrading existing capabilities, potentially reducing the need for "model zoos" and lowering inference costs. Consider SDFT, whose code is available on GitHub and whose integration into Hugging Face's TRL library is in progress, especially for models with strong in-context learning (e.g., Qwen 3 models with 4B or more parameters) and for tasks where defining RL reward functions is impractical.

Key insights

SDFT enables LLMs to learn new skills continually without forgetting old ones, using self-distillation and in-context learning.

Method

SDFT pairs a frozen "teacher" version of the LLM, which is given expert demonstrations in its context, with a "student" version that is being trained. The student learns by distillation from the teacher's in-context behavior, which creates an on-policy learning loop without explicit reward functions.
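
The loop described above can be illustrated with a minimal, dependency-free sketch. It is not the researchers' implementation: the function names, the toy models, and the choice of reverse KL divergence as the distillation signal are all assumptions made for illustration, since the summary does not specify the loss. The sketch shows only the core idea: the response is sampled from the student, and at each step the student's next-token distribution is compared with that of a teacher that also sees a demonstration.

```python
import math
import random


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reverse_kl(student_logits, teacher_logits):
    """KL(student || teacher) for one next-token distribution.

    Assumption: reverse KL is used here as one common choice for
    on-policy distillation; the source does not name the divergence.
    """
    p = softmax(student_logits)
    q = softmax(teacher_logits)
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q) if pi > 0)


def sdft_style_loss(student_step, teacher_step, query, demonstration, max_tokens, rng):
    """Average per-token divergence along a response sampled from the student.

    student_step(query, prefix) -> logits            (hypothetical; sees the query only)
    teacher_step(query, demonstration, prefix) -> logits
        (hypothetical frozen teacher; also sees the expert demonstration in context)
    """
    prefix = []
    total = 0.0
    for _ in range(max_tokens):
        s_logits = student_step(query, prefix)
        t_logits = teacher_step(query, demonstration, prefix)
        total += reverse_kl(s_logits, t_logits)
        # On-policy: the next token is sampled from the student itself,
        # not copied from the demonstration.
        probs = softmax(s_logits)
        token = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        prefix.append(token)
    return total / max_tokens


# Toy stand-ins over a 3-token vocabulary (purely illustrative).
def toy_student(query, prefix):
    return [0.0, 0.0, 0.0]  # uniform: the student has not learned the skill yet


def toy_teacher(query, demonstration, prefix):
    logits = [0.0, 0.0, 0.0]
    # The demonstration in context nudges the teacher toward the expert's token.
    logits[demonstration[len(prefix) % len(demonstration)]] = 2.0
    return logits


loss = sdft_style_loss(toy_student, toy_teacher, "q", [2, 0], 4, random.Random(0))
print(f"toy distillation loss: {loss:.4f}")
```

In a real setup the two step functions would be forward passes of the same LLM, with gradients flowing only through the student; no reward function appears anywhere in the loop, which is the property the method description emphasizes.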

Best for: AI Scientist, Research Scientist, NLP Engineer, Machine Learning Engineer, AI Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by VentureBeat.