From Anchors to Supervision: Memory-Graph Guided Corpus-Free Unlearning for Large Language Models
Summary
MAGE, a Memory-grAph Guided Erasure framework, addresses the privacy and legal concerns raised when Large Language Models (LLMs) memorize sensitive content, and does so through corpus-free unlearning. Unlike existing methods that rely on user-provided "forget sets," MAGE requires only a minimal user anchor (e.g., an entity name). It then probes the target LLM to recover related memorization, organizes what it recovers into a weighted local memory graph, and synthesizes scoped supervision for unlearning. MAGE is model-agnostic and does not need access to the original training corpus. Experiments on the TOFU and RWKU benchmarks show that MAGE's self-generated supervision achieves unlearning performance comparable to methods that use external reference data, while preserving overall model utility. Together, these properties support a more practical and auditable unlearning workflow.
Key takeaway
For NLP engineers and research scientists developing or deploying LLMs, MAGE offers a way to implement the "right to be forgotten" without asking users to hand over the sensitive content they want removed. Because it generates unlearning supervision from a minimal anchor rather than a user-provided forget set, it reduces the risk of secondary data leakage and malicious abuse. Consider its corpus-free approach for LLM unlearning pipelines where security and compliance matter, particularly for entity-level knowledge removal.
Key insights
MAGE enables auditable, corpus-free LLM unlearning using minimal anchors and self-generated supervision.
Principles
- Minimize user input for unlearning requests.
- Recover memorization from LLM parameters directly.
- Structure recovered memory as a weighted graph.
Method
MAGE iteratively expands from a minimal anchor to build a weighted local memory graph, then samples strength-weighted paths to synthesize scoped forget and neighbor sets for unlearning.
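The expand-then-sample idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `probe` function and `TOY_MEMORY` table are hypothetical stand-ins for querying the target LLM, and the depth limit, strength threshold, and random-walk sampler are simple choices, not the paper's actual procedure.

```python
import random
from collections import defaultdict

# Hypothetical stand-in for probing the target LLM: maps an entity to
# (related entity, memorization strength in [0, 1]) pairs. A real probe
# would query the model; this toy table only makes the sketch runnable.
TOY_MEMORY = {
    "Alice Example": [("Novel A", 0.9), ("City B", 0.4)],
    "Novel A": [("Award C", 0.7)],
    "City B": [],
    "Award C": [],
}

def probe(entity):
    return TOY_MEMORY.get(entity, [])

def build_memory_graph(anchor, max_depth=2, min_strength=0.3):
    """Breadth-first expansion from the anchor, keeping edges above a threshold."""
    graph = defaultdict(dict)
    frontier = [anchor]
    seen = {anchor}
    for _ in range(max_depth):
        next_frontier = []
        for node in frontier:
            for neighbor, strength in probe(node):
                if strength < min_strength:
                    continue  # drop weakly memorized associations
                graph[node][neighbor] = strength
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return dict(graph)

def sample_path(graph, anchor, max_len=3, rng=random):
    """Random walk from the anchor; each step picks an outgoing edge
    with probability proportional to its strength."""
    path = [anchor]
    node = anchor
    while len(path) < max_len and graph.get(node):
        neighbors = list(graph[node])
        weights = [graph[node][n] for n in neighbors]
        node = rng.choices(neighbors, weights=weights, k=1)[0]
        path.append(node)
    return path

graph = build_memory_graph("Alice Example")
print(sample_path(graph, "Alice Example"))
```

In this toy setup, strongly memorized associations are visited more often, so any supervision synthesized from the sampled paths would concentrate on what the model appears to remember best.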
In practice
- Use minimal anchors for unlearning requests.
- Employ graph-based memory recovery for targeted forgetting.
- Generate neighbor sets to preserve model utility.
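One way to picture how a forget set and a neighbor set might work together is a generic two-term objective of the kind often used in unlearning work: push the loss up on forget examples while keeping it low on neighbor examples. The summary does not state MAGE's actual training objective, so the function below, its name, and the `neighbor_weight` parameter are illustrative assumptions only.

```python
def combined_unlearning_loss(forget_nll, neighbor_nll, neighbor_weight=1.0):
    """Generic objective (an assumption, not MAGE's stated loss).

    forget_nll and neighbor_nll are lists of per-example negative
    log-likelihoods computed on the forget and neighbor sets.
    """
    forget_term = sum(forget_nll) / len(forget_nll)
    neighbor_term = sum(neighbor_nll) / len(neighbor_nll)
    # Minimizing this value raises the loss on forget examples and
    # lowers it on neighbor examples, which helps preserve utility.
    return -forget_term + neighbor_weight * neighbor_term

print(combined_unlearning_loss([2.0, 4.0], [1.0, 3.0], neighbor_weight=0.5))
```

A larger `neighbor_weight` favors utility preservation over forgetting strength; the right balance would have to be tuned empirically.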
Topics
- Memory-Graph Guided Erasure
- Corpus-Free LLM Unlearning
- Data Memorization Risks
- Entity-Level Unlearning
- Scoped Unlearning Supervision
Best for: CTO, NLP Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, AI Security Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.