Modeling LLM Unlearning as an Asymmetric Two-Task Learning Problem

· Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

A new framework models large language model (LLM) unlearning as an asymmetric two-task problem: knowledge retention is the primary objective and forgetting is an auxiliary one. The approach, called retention-prioritized gradient synthesis, decouples task-specific gradient extraction from conflict-aware combination. Two methods instantiate the framework: an adaptation of PCGrad and a novel technique named SAGO. Both ensure non-negative cosine similarity with the retain gradient; SAGO achieves tighter alignment through constructive sign-constrained synthesis. On the WMDP Bio/Cyber and RWKU benchmarks, SAGO raises the target model's MMLU performance recovery from 44.6% (naive) to 96.0% on WMDP Bio (SimNPO+GD) while maintaining strong forgetting. The results suggest that reshaping gradient geometry matters more than loss re-balancing for managing the unlearning-retention trade-off.
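The summary does not spell out SAGO's update rule, so the following is not SAGO. It is only a minimal sketch of why a sign constraint can guarantee non-negative similarity with the retain gradient, using a hypothetical element-wise rule: keep a forget-gradient component only where its sign agrees with the retain gradient. The function name `sign_agreeing_part` and the toy vectors are assumptions made for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def sign_agreeing_part(g_retain, g_forget):
    """Hypothetical rule (not the paper's): zero out forget components
    whose sign disagrees with the matching retain component."""
    return [f if f * r >= 0 else 0.0 for r, f in zip(g_retain, g_forget)]

# Toy gradients (assumed values, not from the paper).
g_retain = [1.0, -1.0, 0.5]
g_forget = [-3.0, 0.5, 1.0]

# Naive sum of the two task gradients conflicts with retention.
naive = [r + f for r, f in zip(g_retain, g_forget)]

# Sign-constrained sum: every kept product f_i * r_i is >= 0,
# so the inner product with g_retain cannot be negative.
masked = sign_agreeing_part(g_retain, g_forget)
combined = [r + f for r, f in zip(g_retain, masked)]

print(dot(naive, g_retain))     # negative: the naive update hurts retention
print(dot(combined, g_retain))  # positive: the constrained update does not
```

Because the cosine similarity has the same sign as the inner product, a non-negative inner product is enough to show the property the summary describes.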

Key takeaway

AI engineers and research scientists developing unlearning mechanisms for LLMs should consider a retention-prioritized gradient synthesis approach. With SAGO in particular, it substantially improves recovery of general model performance (MMLU recovery on WMDP Bio rises from 44.6% to 96.0%) while maintaining effective forgetting. To mitigate the unlearning-retention trade-off, focus on reshaping gradient geometry rather than merely re-balancing loss functions.

Key insights

LLM unlearning can be reframed as an asymmetric two-task problem prioritizing retention over forgetting.

Principles

Method

A retention-prioritized gradient synthesis framework decouples task-specific gradient extraction from conflict-aware combination, using either an adaptation of PCGrad or SAGO to resolve conflicts between the retain and forget gradients.
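To make the combination step concrete, here is a minimal sketch of a one-sided PCGrad-style projection. It assumes the two task gradients have already been extracted as flat vectors, and it applies the standard PCGrad projection only to the forget gradient, leaving the retain gradient untouched. The exact adaptation used in the paper may differ; `retain_prioritized_combine` and the toy vectors are hypothetical.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    """Cosine similarity; assumes neither vector is all zeros."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def retain_prioritized_combine(g_retain, g_forget):
    """Asymmetric projection (an assumed adaptation, not the paper's code).

    If the forget gradient conflicts with the retain gradient, remove its
    component along the retain gradient. The retain gradient is never
    modified. Assumes g_retain is nonzero.
    """
    overlap = dot(g_forget, g_retain)
    if overlap < 0:
        scale = overlap / dot(g_retain, g_retain)
        g_forget = [f - scale * r for f, r in zip(g_forget, g_retain)]
    return [r + f for r, f in zip(g_retain, g_forget)]

# Toy gradients (assumed values): the forget gradient opposes retention.
g_retain = [1.0, 0.0]
g_forget = [-2.0, 1.0]

naive = [r + f for r, f in zip(g_retain, g_forget)]
combined = retain_prioritized_combine(g_retain, g_forget)

print(cosine(naive, g_retain))     # negative: naive sum moves against retention
print(cosine(combined, g_retain))  # non-negative after the projection
```

In this sketch the projected forget gradient is orthogonal to the retain gradient whenever the two conflict, so the combined update always has a non-negative cosine similarity with the retain gradient. Symmetric PCGrad would also project the retain gradient; dropping that half is what makes the treatment of the two tasks asymmetric here.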

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.