ICED: Concept-level Machine Unlearning via Interpretable Concept Decomposition

Source: cs.CV updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy · Depth: Expert, extended

Summary

ICED (Interpretable ConcEpt Decomposition) is a machine unlearning framework for Vision-Language Models (VLMs) such as CLIP. It targets a limitation of instance-level unlearning, which often removes unrelated semantics along with the target. ICED uses a multimodal large language model (MLLM) to construct a compact, task-specific concept vocabulary from the forgetting set. It then decomposes visual representations into sparse, non-negative combinations of these semantic concepts, enabling fine-grained knowledge manipulation. Unlearning is formulated as a concept-level optimization problem that selectively suppresses target concepts while preserving intra-instance non-target semantics and global cross-modal knowledge. In experiments on CIFAR-10 and ImageNet-1K with CLIP RN50 and RN101 backbones, the authors report that ICED forgets the target more completely and better preserves non-target knowledge and model utility than existing VLM unlearning methods, with a higher average score across the reported benchmarks.

Key takeaway

For research scientists and engineers developing or deploying Vision-Language Models, ICED offers a more precise approach to machine unlearning. If you need to remove specific concepts (e.g., sensitive data, copyrighted content) from a VLM without degrading its general utility or unrelated knowledge, consider concept-level decomposition and optimization. According to the reported experiments, this approach causes less collateral damage than instance-level unlearning, which helps when a model must honor data removal requests.

Key insights

Concept-level unlearning in VLMs precisely removes target knowledge by decomposing visual representations into semantic concepts.
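
To make the decomposition idea concrete, here is a minimal NumPy sketch of expressing one visual feature as a sparse, non-negative combination of concept embeddings. It is an illustration under stated assumptions, not the paper's procedure: the solver (projected proximal gradient), the function name `decompose`, the penalty weight `lam`, and the random stand-in embeddings are all hypothetical choices made for this example.

```python
import numpy as np

def decompose(feature, concepts, lam=0.05, steps=500):
    """Sparse, non-negative coefficients w such that feature ~ concepts.T @ w.

    Solves  min_w 0.5 * ||feature - concepts.T @ w||^2 + lam * sum(w),  w >= 0
    with projected proximal gradient descent. The solver is an assumption
    made for this sketch, not necessarily the one ICED uses.
    """
    gram = concepts @ concepts.T              # (k, k) concept-concept products
    corr = concepts @ feature                 # (k,) concept-feature products
    step = 1.0 / np.linalg.norm(gram, 2)      # 1 / Lipschitz constant of the gradient
    w = np.zeros(concepts.shape[0])
    for _ in range(steps):
        grad = gram @ w - corr                # gradient of the squared-error term
        # Gradient step, L1 shrinkage, and projection onto w >= 0 in one line.
        w = np.maximum(w - step * (grad + lam), 0.0)
    return w

# Toy data: 8 hypothetical concept embeddings of dimension 32 (random stand-ins
# for text embeddings of a concept vocabulary).
rng = np.random.default_rng(0)
concepts = rng.normal(size=(8, 32))
concepts /= np.linalg.norm(concepts, axis=1, keepdims=True)

# A synthetic "visual feature" built from concepts 1 and 5 only.
true_w = np.zeros(8)
true_w[[1, 5]] = [0.9, 0.4]
feature = concepts.T @ true_w

w = decompose(feature, concepts, lam=0.01)
top_two = sorted(int(i) for i in np.argsort(w)[-2:])
print("two largest coefficients belong to concepts:", top_two)
```

Because each coefficient is tied to a named concept, suppressing a target concept amounts to driving its coefficient toward zero while leaving the others in place, which is the kind of fine-grained manipulation the summary describes.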

Method

ICED constructs a task-specific concept vocabulary via MLLM, aligns modalities, and decomposes visual features into sparse concept combinations. It then optimizes three loss functions for forgetting, intra-instance preservation, and global knowledge retention.
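
The three objectives named above can be sketched as simple functions of concept coefficients. The source does not give the exact loss forms, so everything here is an assumption made for illustration: the function name `concept_losses`, the use of summed target coefficients as the forgetting term, squared differences for the two preservation terms, and the weights `alpha` and `beta`.

```python
import numpy as np

def concept_losses(w_new, w_old, target_idx, sim_new, sim_old):
    """Three illustrative loss terms (hypothetical forms, not the paper's).

    w_new, w_old : (n, k) concept coefficients of forget-set images under the
                   updated and the original model.
    target_idx   : indices of the concepts to forget.
    sim_new, sim_old : image-text similarity matrices on retained data under
                   the updated and the original model.
    """
    target = np.zeros(w_old.shape[1], dtype=bool)
    target[target_idx] = True
    # Forgetting: remaining mass on target concepts (lower is better).
    forget = w_new[:, target].sum(axis=1).mean()
    # Intra-instance preservation: non-target coefficients should not move.
    intra = ((w_new[:, ~target] - w_old[:, ~target]) ** 2).sum(axis=1).mean()
    # Global retention: cross-modal similarities on retained data should not move.
    global_keep = ((sim_new - sim_old) ** 2).mean()
    return forget, intra, global_keep

# Toy check: zeroing only the target concept leaves all three terms at zero.
w_old = np.array([[0.8, 0.3, 0.0],
                  [0.6, 0.0, 0.2]])
w_new = w_old.copy()
w_new[:, 0] = 0.0                      # concept 0 is the hypothetical target
sim = np.eye(2)                        # stand-in similarity matrix

before = concept_losses(w_old, w_old, [0], sim, sim)
after = concept_losses(w_new, w_old, [0], sim, sim)

alpha, beta = 1.0, 1.0                 # hypothetical trade-off weights
total_after = after[0] + alpha * after[1] + beta * after[2]
print("forget term before vs after:", before[0], after[0])
```

In a real training loop these terms would be differentiated with respect to the model parameters; the sketch only shows how the three goals pull in different directions and can be combined into one weighted objective.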

Best for: Computer Vision Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.