Seeing Without Exposing: Adaptive Privacy Control for Open-World, Context-Hungry MLLMs

2026-06-08 · Source: cs.CV updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy, Emerging Technologies & Innovation · Depth: Expert, extended

Summary

Anchored Privacy Drifting (APD) is a training-free method for adaptively controlling how much sensitive information an image exposes to a Multimodal Large Language Model (MLLM) while preserving the visual context the model needs. APD operates in a shared multimodal latent space: it steers the image denoising trajectory so that privacy-sensitive elements drift toward semantically equivalent alternatives while contextual cues stay anchored to the source image. To evaluate this dual objective, the paper introduces AdaptShield, a benchmark covering 22 privacy categories that combines conventional privacy metrics with MLLM-based assessments of contextual utility. The reported experiments show average gains of 10.4% on textual categories and 8.5% under MLLM-based evaluation across four MLLM series: Qwen2.5, Qwen3, InternVL3, and InternVL3.5.

Key takeaway

If you build MLLM systems that process sensitive visual information, be aware that traditional fixed obfuscation methods often degrade model utility. Consider integrating Anchored Privacy Drifting (APD) for adaptive privacy control. This training-free framework transforms sensitive content while preserving crucial contextual cues, aiming for both strong privacy protection and high-fidelity outputs in downstream MLLM tasks. The approach is particularly relevant to personalized or open-world MLLM applications.

Key insights

Adaptive privacy control for MLLMs balances semantic drifting of sensitive content with contextual anchoring.

Principles

Visual privacy in MLLMs is user-tailored and context-dependent, challenging one-size-fits-all anonymization.
Effective MLLM privacy protection requires balancing concealment with high-fidelity content preservation.
Privacy protection can be formulated as a controllable trajectory in a shared multimodal latent space.

Method

APD steers image denoising in a multimodal latent space using a semantic drift field for sensitive content transformation and a source anchoring field for contextual fidelity, balanced by an adaptive coefficient.
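
The balance between the two fields can be illustrated with a toy sketch. This is not the paper's implementation: the linear update rule, the binary `sensitive_mask`, the linear schedule in `adaptive_coefficient`, and all function names are assumptions made for illustration, and plain NumPy vectors stand in for the latents of a real diffusion model.

```python
import numpy as np


def adaptive_coefficient(step, num_steps):
    # Hypothetical schedule (assumption): weight drift early, anchoring late.
    return 1.0 - step / max(num_steps - 1, 1)


def steer(z, z_source, z_target, sensitive_mask, lam, step_size=0.2):
    # Semantic drift field: pulls the sensitive region toward the alternative.
    drift = sensitive_mask * (z_target - z)
    # Source anchoring field: pulls the remaining context back to the source.
    anchor = (1.0 - sensitive_mask) * (z_source - z)
    # The adaptive coefficient lam in [0, 1] balances the two fields.
    return z + step_size * (lam * drift + (1.0 - lam) * anchor)


def run_toy(num_steps=20, seed=0):
    rng = np.random.default_rng(seed)
    z_source = rng.normal(size=8)            # stand-in for the source latent
    sensitive_mask = np.zeros(8)
    sensitive_mask[:3] = 1.0                 # first 3 dims are "sensitive"
    z_target = z_source.copy()
    z_target[:3] += 5.0                      # stand-in for the alternative
    z_start = z_source + 0.5 * rng.normal(size=8)  # noisy starting point
    z = z_start.copy()
    for step in range(num_steps):
        lam = adaptive_coefficient(step, num_steps)
        z = steer(z, z_source, z_target, sensitive_mask, lam)
    return z_start, z, z_source, z_target, sensitive_mask
```

Under these assumptions, the sensitive dimensions end closer to `z_target` than to `z_source`, while the context dimensions end closer to `z_source` than where they started.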

In practice

Apply APD for controllable facial attribute edits like age, gender, or ethnicity.
Use APD to transform text that appears in images, such as credit card numbers, while maintaining its format.
Implement this training-free framework for scalable privacy protection across diverse categories.
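
The "same format, different value" goal in the credit card example can be shown with a plain-text analogy. APD itself operates on image latents, so this sketch only illustrates the target property, not the method: the function name `drift_digits` and the placeholder number are hypothetical.

```python
import random
import string


def drift_digits(text, seed=None):
    # Replace each ASCII digit with a random digit; keep every other
    # character (spaces, dashes, letters) so the layout is unchanged.
    rng = random.Random(seed)
    return "".join(
        str(rng.randrange(10)) if ch in string.digits else ch
        for ch in text
    )


original = "1234 5678 9012 3456"   # made-up placeholder, not a real card
replaced = drift_digits(original, seed=7)
```

The replacement keeps the length, grouping, and separators of the original string, so anything that relies on the format still recognizes the field as a card number even though the digits differ.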

Topics

Multimodal LLMs
Visual Privacy
Image Anonymization
Diffusion Models
Contextual Preservation
AdaptShield Benchmark

Code references

black-forest-labs/flux

Best for: Computer Vision Engineer, Research Scientist, AI Scientist, AI Security Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.