Seeing Without Exposing: Adaptive Privacy Control for Open-World, Context-Hungry MLLMs

Source: cs.CV updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy, Emerging Technologies & Innovation) · Depth: Expert, extended

Summary

Anchored Privacy Drifting (APD) is a training-free method for adaptive privacy control in Multimodal Large Language Models (MLLMs): it limits the exposure of sensitive information while preserving the visual context the model needs. APD operates in a shared multimodal latent space, steering image denoising trajectories so that privacy-sensitive elements drift toward semantically equivalent alternatives while contextual cues stay anchored to the source image. To evaluate this dual objective, the paper introduces AdaptShield, a benchmark covering 22 privacy categories that combines conventional privacy metrics with MLLM-based assessments of contextual utility. The reported experiments show average gains of 10.4% on textual categories and 8.5% under MLLM-based evaluation across four MLLM series: Qwen2.5, Qwen3, InternVL3, and InternVL3.5.

Key takeaway

For MLLM developers and AI scientists building systems that process sensitive visual information, fixed obfuscation methods often degrade model utility. Consider Anchored Privacy Drifting (APD) for adaptive privacy control instead. This training-free framework transforms sensitive content while preserving contextual cues, aiming for strong privacy protection without sacrificing output fidelity on downstream MLLM tasks. It is particularly relevant for personalized or open-world MLLM applications.

Key insights

Adaptive privacy control for MLLMs balances semantic drifting of sensitive content with contextual anchoring.

Method

APD steers image denoising in a multimodal latent space with two fields: a semantic drift field that transforms sensitive content and a source anchoring field that maintains contextual fidelity. An adaptive coefficient balances the two.
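
The interplay between drifting and anchoring can be illustrated with a toy sketch. This is not the paper's formulation: the summary gives no equations, so the update rule, the per-element sensitivity score used as the adaptive coefficient, and every name below (drift_field, anchor_field, apd_step, steer) are hypothetical. Latents are plain lists of floats rather than real multimodal latents, and the loop is a simple iterative update rather than an actual diffusion denoiser.

```python
from typing import List

Vector = List[float]


def drift_field(z: Vector, target: Vector) -> Vector:
    # Hypothetical drift field: pulls the latent toward a privacy-safe alternative.
    return [t - x for x, t in zip(z, target)]


def anchor_field(z: Vector, source: Vector) -> Vector:
    # Hypothetical anchoring field: pulls the latent back toward the source image.
    return [s - x for x, s in zip(z, source)]


def adaptive_coefficient(sensitivity: float) -> float:
    # Assumption: a per-element sensitivity score, clamped to [0, 1], decides
    # how much drifting (1.0) versus anchoring (0.0) is applied.
    return min(1.0, max(0.0, sensitivity))


def apd_step(z: Vector, source: Vector, target: Vector,
             sensitivity: Vector, step_size: float = 0.5) -> Vector:
    # One steering update: blend the two fields element by element.
    drift = drift_field(z, target)
    anchor = anchor_field(z, source)
    updated = []
    for x, d, a, s in zip(z, drift, anchor, sensitivity):
        lam = adaptive_coefficient(s)
        updated.append(x + step_size * (lam * d + (1.0 - lam) * a))
    return updated


def steer(init: Vector, source: Vector, target: Vector,
          sensitivity: Vector, steps: int = 20, step_size: float = 0.5) -> Vector:
    # Start from an initial latent (standing in for noise) and apply repeated updates.
    z = list(init)
    for _ in range(steps):
        z = apd_step(z, source, target, sensitivity, step_size)
    return z


if __name__ == "__main__":
    source = [1.0, 2.0, 3.0]        # latent of the original image
    target = [1.0, 9.0, 3.0]        # privacy-safe alternative for the sensitive element
    sensitivity = [0.0, 1.0, 0.0]   # only the middle element is treated as sensitive
    print(steer([0.0, 0.0, 0.0], source, target, sensitivity))
```

In this toy setting, elements with sensitivity 0 converge to the source values, elements with sensitivity 1 converge to the alternative, and intermediate scores settle in between. How APD actually computes its fields and coefficient is specified in the original paper, not here.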

Best for: Computer Vision Engineer, Research Scientist, AI Scientist, AI Security Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.