Modeling LLM Unlearning as an Asymmetric Two-Task Learning Problem
Summary
A new framework models large language model (LLM) unlearning as an asymmetric two-task problem in which knowledge retention is the primary objective and forgetting is auxiliary. The approach, called retention-prioritized gradient synthesis, decouples task-specific gradient extraction from conflict-aware combination. Two methods instantiate the framework: an adaptation of PCGrad and a new technique named SAGO. Both guarantee that the synthesized update has non-negative cosine similarity with the retain gradient, and SAGO achieves tighter alignment through constructive sign-constrained synthesis. On the WMDP Bio/Cyber and RWKU benchmarks, SAGO raises the recovery of the target model's MMLU performance on WMDP Bio (SimNPO+GD) from 44.6% for the naive baseline to 96.0%, while maintaining strong forgetting. The results suggest that reshaping gradient geometry matters more than re-balancing losses for managing the trade-off between unlearning and retention.
Key takeaway
AI engineers and research scientists developing unlearning mechanisms for LLMs should consider a retention-prioritized gradient synthesis approach. According to the reported results, this approach, particularly with SAGO, substantially improves the recovery of general model performance (e.g., MMLU) while keeping forgetting effective. To mitigate the unlearning-retention trade-off, focus on reshaping gradient geometry rather than only re-balancing loss terms.
Key insights
LLM unlearning can be reframed as an asymmetric two-task problem prioritizing retention over forgetting.
Principles
- Retention is primary, forgetting auxiliary.
- Reshape gradient geometry, not loss balance.
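A short worked derivation may help show why geometry matters more than loss weights. The notation is an assumption introduced for illustration and is not taken from the original article: g_r is the retain gradient, g_f the forget gradient, and \lambda > 0 a loss weight.

```latex
% Loss re-balancing only rescales the forget gradient:
g_{\lambda} = g_r + \lambda\, g_f ,
\qquad
\langle g_{\lambda},\, g_r \rangle
  = \|g_r\|^2 + \lambda\, \langle g_f,\, g_r \rangle .
% With conflicting gradients (\langle g_f, g_r \rangle < 0) this inner product
% becomes negative as soon as \langle g_f, g_r \rangle < -\|g_r\|^2 / \lambda .

% Removing the conflicting component of g_f instead (a PCGrad-style projection,
% applied only when \langle g_f, g_r \rangle < 0):
\tilde{g}_f = g_f - \frac{\langle g_f,\, g_r \rangle}{\|g_r\|^2}\, g_r ,
\qquad
\langle g_r + \tilde{g}_f,\, g_r \rangle = \|g_r\|^2 \;\ge\; 0 .
```

Under these assumptions, no positive choice of \lambda rules out an update that opposes the retain gradient, whereas the projected combination cannot oppose it.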
Method
The retention-prioritized gradient synthesis framework decouples task-specific gradient extraction from conflict-aware combination. The conflict-resolution step can be instantiated with an adapted PCGrad or with SAGO.
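The conflict-aware combination step can be sketched as a one-sided PCGrad-style projection. This is a minimal illustration that assumes the two gradients are already extracted and flattened into plain lists. The function name `retain_prioritized_pcgrad` and the toy vectors are hypothetical, and the paper's exact adaptation may differ.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is zero."""
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na > 0 and nb > 0 else 0.0

def retain_prioritized_pcgrad(g_retain, g_forget):
    """Hypothetical asymmetric PCGrad-style combination (an assumption,
    not the paper's code): only the forget gradient is ever modified."""
    overlap = dot(g_forget, g_retain)
    norm_sq = dot(g_retain, g_retain)
    if overlap < 0 and norm_sq > 0:
        # Conflict: project the forget gradient onto the plane
        # orthogonal to the retain gradient.
        scale = overlap / norm_sq
        g_forget = [f - scale * r for f, r in zip(g_forget, g_retain)]
    # The retain gradient is kept untouched and summed with the result.
    return [r + f for r, f in zip(g_retain, g_forget)]

# Toy example with conflicting gradients.
g_retain = [1.0, 0.0]
g_forget = [-2.0, 1.0]
naive = [r + f for r, f in zip(g_retain, g_forget)]
combined = retain_prioritized_pcgrad(g_retain, g_forget)
print(cosine(naive, g_retain), cosine(combined, g_retain))
```

In this toy case the naive sum points against the retain gradient, while the projected combination keeps a positive cosine similarity with it.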
In practice
- Use SAGO for tighter gradient alignment.
- Prioritize retention in LLM unlearning tasks.
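This summary does not spell out SAGO's algorithm, so the sketch below is not SAGO. It is only one hypothetical reading of what a sign-constrained combination could look like: a forget-gradient coordinate is added only where its sign agrees with the retain gradient. The function name `sign_constrained_combine` and the toy vectors are assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is zero."""
    d = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return d / (na * nb) if na > 0 and nb > 0 else 0.0

def sign_constrained_combine(g_retain, g_forget):
    """Hypothetical sign-constrained synthesis (an assumption, not the
    paper's SAGO): keep a forget coordinate only if it shares its sign
    with the matching retain coordinate."""
    combined = []
    for r, f in zip(g_retain, g_forget):
        agrees = r * f > 0  # strictly the same sign, both non-zero
        combined.append(r + f if agrees else r)
    return combined

# Toy example: only the first coordinate of g_forget agrees in sign.
g_retain = [1.0, -0.5, 0.2]
g_forget = [0.5, 0.4, -1.0]
combined = sign_constrained_combine(g_retain, g_forget)
print(combined, cosine(combined, g_retain))
```

Because every coordinate of the result shares its sign with the retain gradient, the inner product with the retain gradient is at least the retain gradient's squared norm in this sketch, so the cosine similarity cannot be negative.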
Topics
- LLM Unlearning
- Asymmetric Two-Task Learning
- Gradient Synthesis
- SAGO Method
- PCGrad
Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.