MosaicLeaks: Can your research agent keep a secret?
Summary
MosaicLeaks, a new deep-research task and benchmark, reveals significant privacy risks in AI agents that combine private local documents with external web retrieval. Published on June 18, 2026, by Alexander Gurung and Rafael Pardinas from ServiceNow, this work demonstrates that agents frequently leak sensitive information through the queries they send to external search: individually harmless queries can cumulatively reveal private facts, a phenomenon termed the "mosaic effect." On the benchmark, which comprises 1,001 multi-hop research chains, training agents solely for task performance made leakage worse, raising answer/full-information leakage from 34.0% to 51.7% for Qwen3-4B. To address this, the authors propose Privacy-Aware Deep Research (PA-DR), an RL training method that combines situational task rewards with a privacy reward from a Qwen3-4B classifier. PA-DR improved strict chain success from 48.7% to 58.7% while reducing answer/full-information leakage to 9.9%, which supports the authors' conclusion that privacy must be trained in, not merely prompted.
Key takeaway
For MLOps engineers deploying deep research agents that access both private and public data: optimizing solely for task performance significantly increases privacy leakage. Integrate privacy-aware training methods such as PA-DR, which combines situational task rewards with a learned privacy reward; in the reported results it reached 58.7% strict chain success while cutting answer/full-information leakage to 9.9%. Do not expect prompting alone to secure your agents; explicit privacy training is needed to prevent sensitive data exposure through the "mosaic effect" in query logs.
Key insights
Training deep research agents for performance alone increases privacy leakage; privacy must be explicitly trained in.
Principles
- Mosaic effect: cumulative web queries can reveal private facts.
- Task performance often conflicts with privacy in agent design.
- Prompting agents for privacy is largely ineffective.
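To make the mosaic effect concrete, here is a minimal Python sketch. It is a toy illustration, not the benchmark's leakage metric: the term-overlap measure, the function names `coverage` and `mosaic_report`, and the example company names are all assumptions made for this example. It shows how each query can expose only a fragment of a private fact while the full query log exposes all of it.

```python
from typing import Iterable, List, Tuple


def coverage(private_terms: Iterable[str], text: str) -> float:
    """Fraction of the private terms that appear as words in `text`.

    Hypothetical stand-in for a leakage measure: plain word overlap.
    """
    terms = {t.lower() for t in private_terms}
    words = set(text.lower().split())
    return len(terms & words) / len(terms)


def mosaic_report(private_terms: Iterable[str], queries: List[str]) -> Tuple[float, float]:
    """Return (worst single-query coverage, coverage of the whole query log)."""
    terms = list(private_terms)
    worst_single = max(coverage(terms, q) for q in queries)
    cumulative = coverage(terms, " ".join(queries))
    return worst_single, cumulative


# Invented private fact: "Acme is planning a merger with Globex".
private_terms = ["acme", "merger", "globex"]
queries = [
    "acme quarterly revenue",
    "globex ceo history",
    "typical merger approval timeline",
]
worst_single, cumulative = mosaic_report(private_terms, queries)
```

In this toy log no single query mentions more than one of the three private terms, yet the log as a whole mentions all of them, which is the pattern an observer of external query logs could exploit.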
Method
Privacy-Aware Deep Research (PA-DR) uses situational task rewards and a Qwen3-4B classifier-based privacy reward to penalize revealing web queries, training agents for both performance and reduced leakage.
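The summary does not say how PA-DR combines its two reward signals, so the following Python sketch is only one plausible shape, under stated assumptions: the subtraction form, the `penalty_weight` parameter, the choice to penalize the single most revealing query, and the keyword-based `toy_leak_prob` (standing in for the Qwen3-4B privacy classifier) are all hypothetical.

```python
from typing import Callable, Sequence


def combined_reward(
    task_reward: float,
    queries: Sequence[str],
    leak_prob: Callable[[str], float],
    penalty_weight: float = 1.0,
) -> float:
    """Task reward minus a penalty for the most revealing web query.

    Assumption: `leak_prob` returns a score in [0, 1] per query. In PA-DR
    this role is played by a learned classifier; here it is any callable.
    """
    if not queries:
        return task_reward  # no external queries, nothing to penalize
    worst = max(leak_prob(q) for q in queries)
    return task_reward - penalty_weight * worst


def toy_leak_prob(query: str) -> float:
    """Hypothetical stand-in classifier: flags queries naming private terms."""
    private_terms = ("acme", "globex")
    return 1.0 if any(t in query.lower() for t in private_terms) else 0.0


safe = combined_reward(1.0, ["typical merger approval timeline"], toy_leak_prob)
leaky = combined_reward(1.0, ["acme globex merger date"], toy_leak_prob)
```

With this shape, a rollout that solves the task but issues a revealing query earns less than one that solves it with neutral queries, which is the trade-off the method is described as training for.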
In practice
- Use multi-hop questions to test agent privacy risks.
- Implement situational rewards for precise credit assignment.
- Integrate privacy classifiers into RL training loops.
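When testing an agent on multi-hop chains, it helps to track task success and leakage side by side. The Python sketch below shows one way to do that. It rests on assumptions: the `ChainResult` record, the reading of "strict chain success" as every hop in a chain being answered correctly, and the per-chain `leaked` flag are illustrative choices, not the benchmark's actual schema or scoring code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChainResult:
    """Hypothetical evaluation record for one multi-hop research chain."""
    hops_correct: List[bool]  # one flag per hop in the chain
    leaked: bool              # True if any web query was judged revealing


def strict_success_rate(results: List[ChainResult]) -> float:
    """Share of chains where every hop was correct (assumed definition)."""
    return sum(all(r.hops_correct) for r in results) / len(results)


def leakage_rate(results: List[ChainResult]) -> float:
    """Share of chains with at least one revealing query."""
    return sum(r.leaked for r in results) / len(results)


# Invented results for four chains, for illustration only.
results = [
    ChainResult(hops_correct=[True, True], leaked=False),
    ChainResult(hops_correct=[True, False], leaked=True),
    ChainResult(hops_correct=[True], leaked=True),
    ChainResult(hops_correct=[True, True, True], leaked=False),
]
success = strict_success_rate(results)
leakage = leakage_rate(results)
```

Reporting both numbers for every checkpoint makes it visible when a training run buys task success at the cost of more leakage.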
Topics
- Deep Research Agents
- Privacy Leakage
- Mosaic Effect
- Reinforcement Learning
- Privacy-Aware Deep Research (PA-DR)
- Qwen3-4B
Best for: Research Scientist, CTO, VP of Engineering/Data, AI Scientist, AI Security Engineer, MLOps Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Hugging Face - Blog.