From Anchors to Supervision: Memory-Graph Guided Corpus-Free Unlearning for Large Language Models

2025-07-25 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy · Depth: Expert, extended

Summary

MAGE, a Memory-grAph Guided Erasure framework, addresses the privacy and legal concerns of Large Language Models (LLMs) memorizing sensitive content by enabling corpus-free unlearning. Unlike existing methods that rely on user-provided "forget sets," MAGE requires only a minimal user anchor (e.g., an entity name). It then probes the target LLM to recover related memorization, organizes this into a weighted local memory graph, and synthesizes scoped supervision for unlearning. MAGE is model-agnostic and does not need access to the original training corpus. Experiments on the TOFU and RWKU benchmarks demonstrate that MAGE's self-generated supervision achieves unlearning performance comparable to methods using external reference data, while effectively preserving overall model utility. This supports a more practical and auditable unlearning workflow.

Key takeaway

For NLP engineers and research scientists developing or deploying LLMs, MAGE offers a way to implement the "right to be forgotten" without asking users to hand over the sensitive content they want removed. Because it generates unlearning supervision from a minimal anchor rather than from a user-supplied forget set, it reduces the risks of secondary data leakage and malicious abuse that such sets carry. Consider MAGE's corpus-free approach when hardening the security and compliance of your LLM unlearning pipelines, particularly for entity-level knowledge removal.

Key insights

MAGE enables auditable, corpus-free LLM unlearning using minimal anchors and self-generated supervision.

Principles

Minimize user input for unlearning requests.
Recover memorization from LLM parameters directly.
Structure recovered memory as a weighted graph.

Method

MAGE iteratively expands from a minimal anchor to build a weighted local memory graph, then samples strength-weighted paths to synthesize scoped forget and neighbor sets for unlearning.
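The expand-then-sample loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: `probe` is a hypothetical callable standing in for querying the target LLM, the `(relation, neighbor, strength)` triple format, the `min_strength` threshold, and the depth limit are all invented for the example, and the toy entities are fictitious.

```python
import random
from collections import defaultdict


def build_memory_graph(anchor, probe, max_depth=2, min_strength=0.2):
    """Breadth-first expansion from the anchor into a weighted local graph.

    `probe(entity)` is a hypothetical stand-in for querying the target LLM;
    it is assumed to return (relation, neighbor, strength) triples.
    """
    graph = defaultdict(list)  # entity -> [(relation, neighbor, strength)]
    frontier, seen = [anchor], {anchor}
    for _ in range(max_depth):
        next_frontier = []
        for entity in frontier:
            for relation, neighbor, strength in probe(entity):
                if strength < min_strength:
                    continue  # drop edges the model recalls only weakly
                graph[entity].append((relation, neighbor, strength))
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return dict(graph)


def sample_path(graph, anchor, max_len=3, rng=random):
    """Walk outward from the anchor, picking edges in proportion to strength."""
    path, node, visited = [], anchor, {anchor}
    for _ in range(max_len):
        edges = [e for e in graph.get(node, []) if e[1] not in visited]
        if not edges:
            break
        weights = [e[2] for e in edges]
        relation, neighbor, _ = rng.choices(edges, weights=weights, k=1)[0]
        path.append((node, relation, neighbor))
        visited.add(neighbor)
        node = neighbor
    return path


# Toy stand-in for the model's memory (assumed data, fictitious entities).
TOY_MEMORY = {
    "Jane Doe": [
        ("wrote", "The Glass Harbor", 0.9),
        ("born_in", "Lisbon", 0.6),
        ("likes", "tea", 0.1),
    ],
    "The Glass Harbor": [("published_in", "2011", 0.7)],
}


def toy_probe(entity):
    return TOY_MEMORY.get(entity, [])


graph = build_memory_graph("Jane Doe", toy_probe)
path = sample_path(graph, "Jane Doe", rng=random.Random(0))
```

Each sampled path is a chain of triples that could then be turned into question-answer style supervision. How MAGE actually scores edge strength and verbalizes paths is not specified in this summary.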

In practice

Use minimal anchors for unlearning requests.
Employ graph-based memory recovery for targeted forgetting.
Generate neighbor sets to preserve model utility.
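To make the role of the neighbor set concrete, the sketch below combines a forget term and a neighbor term in the style of the gradient-difference baseline common in the unlearning literature. This summary does not state which objective MAGE uses, so the function name, the `retain_weight` parameter, and the loss values are illustrative assumptions only.

```python
from statistics import mean


def gradient_difference_objective(forget_losses, neighbor_losses, retain_weight=1.0):
    """Scalar to minimize: raise loss on the forget set, keep neighbor loss low.

    Assumed form (gradient-difference style), not confirmed as MAGE's objective.
    """
    return -mean(forget_losses) + retain_weight * mean(neighbor_losses)


# Made-up per-example losses, not results from the paper.
before = gradient_difference_objective([0.5, 0.7], [0.4, 0.6])
after = gradient_difference_objective([2.5, 2.7], [0.4, 0.6])
```

In this toy case the objective falls from `before` to `after` because the forget-set loss rose while the neighbor-set loss stayed flat, which is the behavior a neighbor set is meant to encourage.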

Topics

Memory-Graph Guided Erasure
Corpus-Free LLM Unlearning
Data Memorization Risks
Entity-Level Unlearning
Scoped Unlearning Supervision

Best for: CTO, NLP Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, AI Security Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.