MLUBench: A Benchmark for Lifelong Unlearning Evaluation in MLLMs

2024-01-30 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Advanced, extended

Summary

MLUBench is a new, large-scale benchmark for evaluating lifelong unlearning in Multimodal Large Language Models (MLLMs), a setting in which a model must remove specific content over a sequence of requests while preserving its general capabilities. Existing benchmarks are limited and fail to capture the cumulative degradation observed in MLLMs. MLUBench covers 127 real-world entities across 9 classes, with 5,105 images and 15,414 VQA pairs. Experiments on MLUBench show that current unlearning methods suffer severe, cumulative performance degradation, and they expose a challenge specific to MLLMs: preserving multimodal alignment. To address this, the authors propose LUMoE, a Mixture-of-Experts (MoE)-inspired method that combines switchable Low-Rank Adaptation (LoRA) adapters with a GLM-4V-Plus gate module and significantly mitigates the degradation. The source code and the MLUBench dataset are open-sourced.

Key takeaway

MLOps engineers deploying MLLMs in privacy-sensitive applications must account for the severe, cumulative degradation caused by sequential unlearning requests. Traditional methods risk corrupting core model capabilities and multimodal alignment. Consider modular approaches such as LUMoE, which uses LoRA adapters and dynamic routing to isolate each unlearning task and preserve general utility. Evaluate unlearning strategies across the full sequence of requests, using benchmarks such as MLUBench, to confirm long-term model stability.

Key insights

Lifelong unlearning in MLLMs poses a distinct challenge to multimodal alignment; isolated, modular solutions help prevent cumulative degradation.

Principles

Lifelong unlearning causes severe, cumulative performance degradation in MLLMs.
Preserving multimodal alignment is crucial for MLLM unlearning.
Isolating unlearning modifications protects base model stability.

Method

LUMoE employs switchable LoRA adapters as "experts" for specific unlearning tasks, with a GLM-4V-Plus gate module dynamically routing multimodal inputs to the appropriate adapter or the original MLLM.
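
The routing idea described above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the authors' implementation: the class name SwitchableRouter, the toy keyword gate, and the stand-in "models" (simple callables) are all assumptions made for the example. In LUMoE the gate is a GLM-4V-Plus module and the experts are LoRA adapters attached to an MLLM.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# Hypothetical stand-in: a "model" is any callable (image_id, question) -> answer.
Model = Callable[[str, str], str]
# Hypothetical stand-in for the gate: returns an entity key, or None for "no match".
Gate = Callable[[str, str], Optional[str]]


@dataclass
class SwitchableRouter:
    """Routes each input to one unlearning expert or to the untouched base model."""
    base_model: Model
    gate: Gate
    adapters: Dict[str, Model] = field(default_factory=dict)

    def register(self, entity: str, adapter: Model) -> None:
        # Each unlearning request adds one expert; the base model is never modified.
        self.adapters[entity] = adapter

    def answer(self, image_id: str, question: str) -> str:
        entity = self.gate(image_id, question)
        # Fall back to the base model when the gate finds no entity
        # or no adapter has been registered for that entity.
        model = self.adapters.get(entity, self.base_model) if entity else self.base_model
        return model(image_id, question)


def toy_gate(image_id: str, question: str) -> Optional[str]:
    # Assumption: a prefix match replaces the real multimodal gate module.
    for entity in ("alice", "bob"):
        if image_id.startswith(entity):
            return entity
    return None


def toy_base(image_id: str, question: str) -> str:
    return f"base answer for {image_id}"


def alice_adapter(image_id: str, question: str) -> str:
    return "I cannot answer that."


router = SwitchableRouter(base_model=toy_base, gate=toy_gate)
router.register("alice", alice_adapter)
```

Because each request only adds an entry to the adapter table, earlier experts and the base weights stay untouched, which is the isolation property the method relies on. Routing quality then depends entirely on the gate.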

In practice

Use a separate LoRA adapter for each unlearning task to isolate its changes.
Implement a gate module that routes each input to the matching adapter or to the original model.
Use MLUBench to evaluate MLLM unlearning across the full sequence of requests.
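
To make the evaluation step concrete, here is a minimal sketch of a lifelong evaluation loop: apply unlearning requests one at a time and, after each, re-measure accuracy on everything forgotten so far and on a held-out retain set. The function names, the exact-match metric, and the toy data are assumptions for illustration; they are not the MLUBench API, its metrics, or results from the paper.

```python
from typing import Callable, Dict, List, Sequence, Tuple

QA = Tuple[str, str, str]            # (image_id, question, reference_answer)
Model = Callable[[str, str], str]    # hypothetical stand-in for an MLLM


def accuracy(model: Model, qa_pairs: Sequence[QA]) -> float:
    """Exact-match accuracy (an assumed, simplified metric)."""
    if not qa_pairs:
        return 0.0
    hits = sum(model(img, q).strip().lower() == ref.strip().lower()
               for img, q, ref in qa_pairs)
    return hits / len(qa_pairs)


def lifelong_eval(model: Model,
                  unlearn_step: Callable[[Model, str], Model],
                  requests: Sequence[str],
                  forget_sets: Dict[str, List[QA]],
                  retain_set: Sequence[QA]) -> List[dict]:
    """Apply requests sequentially and record metrics after every step."""
    history, seen = [], []
    for entity in requests:
        model = unlearn_step(model, entity)
        seen.extend(forget_sets[entity])   # all entities forgotten so far
        history.append({
            "entity": entity,
            "forget_acc": accuracy(model, seen),        # lower is better
            "retain_acc": accuracy(model, retain_set),  # higher is better
        })
    return history


# Toy data and models (placeholders, not benchmark content).
KNOWLEDGE = {
    ("alice_01", "Who is this?"): "alice",
    ("bob_01", "Who is this?"): "bob",
    ("cat_01", "What animal is this?"): "cat",
}


def toy_model(image_id: str, question: str) -> str:
    return KNOWLEDGE.get((image_id, question), "unknown")


def toy_unlearn(model: Model, entity: str) -> Model:
    # Wraps the previous model so that requests stack up over time.
    def unlearned(image_id: str, question: str) -> str:
        if image_id.startswith(entity):
            return "I cannot answer that."
        return model(image_id, question)
    return unlearned


forget_sets = {
    "alice": [("alice_01", "Who is this?", "alice")],
    "bob": [("bob_01", "Who is this?", "bob")],
}
retain_set = [("cat_01", "What animal is this?", "cat")]
history = lifelong_eval(toy_model, toy_unlearn, ["alice", "bob"],
                        forget_sets, retain_set)
```

Tracking retain accuracy after every request, rather than only at the end, is what makes cumulative degradation visible: a method can look fine after one request and still drift badly by the tenth.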

Topics

Multimodal LLMs
Machine Unlearning
Lifelong Learning
MLLM Benchmarking
LoRA
Mixture-of-Experts
Data Privacy

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.