Critical Copilot vulnerability allowed hackers to steal 2FA codes from users

2026-06-16 · Source: AI - Ars Technica · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy · Depth: Advanced, short

Summary

Microsoft recently patched a "max critical" vulnerability in its M365 Copilot AI platform. Researchers from Varonis revealed that the flaw could allow attackers to steal 2FA codes and other sensitive data from user emails. The core issue is that AI models cannot differentiate between legitimate user instructions and malicious commands embedded in the third-party content that Copilot processes. Varonis devised an exploit chain, named SearchLeak, that bypassed existing guardrails. The chain began with a Parameter-to-Prompt Injection: a crafted URL in an email carried instructions telling Copilot to extract data. The exploit then leveraged a timing window in which Copilot's raw HTML output rendered before guardrails were applied, allowing an image request carrying the stolen data to fire. To circumvent restrictions on untrusted sites, the attack used Bing as a trampoline, redirecting the request to an attacker-controlled domain. SearchLeak targeted enterprise-tier M365, potentially exposing emails, meeting invites, SharePoint, and OneDrive content.

Key takeaway

AI security engineers managing M365 Copilot deployments should treat current guardrails as temporary fixes for a fundamental LLM weakness. Shift from relying solely on platform-level patches to implementing layered security controls at the network and endpoint levels. Proactively monitor for unusual data exfiltration attempts originating from Copilot-enabled environments, because attackers will keep finding new ways to bypass existing protections.

Key insights

AI models struggle to distinguish user instructions from malicious embedded commands, creating fundamental security vulnerabilities.

Principles

LLMs' "gullibility" toward embedded instructions is the root cause, and it has no known fix.
Guardrails are ad hoc and often circumvented by novel exploit chains.
Timing windows in LLM output rendering can bypass security controls.
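
One way to picture the timing-window principle is to contrast it with a pipeline that never hands anything to the renderer until a check has run on the complete output. The following is a minimal sketch in Python under stated assumptions: the function names, the withheld-output message, and the deliberately crude image check are all hypothetical, and nothing here reflects how Copilot is actually built.

```python
from typing import Callable, Iterable


def render_after_check(chunks: Iterable[str], is_safe: Callable[[str], bool]) -> str:
    """Buffer all streamed chunks and release them only if the check passes."""
    # Nothing reaches the renderer while the output is still being buffered,
    # so there is no window in which unchecked HTML can trigger a request.
    buffered = "".join(chunks)
    if is_safe(buffered):
        return buffered
    return "[output withheld by policy]"


def no_image_tags(html: str) -> bool:
    # Hypothetical, intentionally blunt check: reject any <img> tag outright.
    return "<img" not in html.lower()
```

Checking the joined buffer rather than individual chunks matters: a tag split across two chunks would slip past a per-chunk check. The trade-off is latency, since the user sees nothing until the whole response has been checked.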

Method

Attackers use Parameter-to-Prompt Injection via a crafted URL in an email. The injected prompt instructs Copilot to extract data and embed it in an image URL. The resulting image request is then routed through a trusted domain such as Bing to an attacker-controlled domain.
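
From a defender's point of view, the method leaves a recognizable shape: an image URL on a trusted host whose query string carries a second URL pointing somewhere else. The sketch below, in Python with only the standard library, flags that shape in model output. The allowlist, the `trusted.example` and `attacker.example` hosts, and the `/redir?u=` form are illustrative assumptions, not details of the real exploit or of any Bing endpoint.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qsl, urlsplit

# Illustrative allowlist; a real deployment would maintain its own.
ALLOWED_HOSTS = {"trusted.example"}


class _ImgCollector(HTMLParser):
    """Collects the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def _http_host(value):
    """Return the host if value parses as an absolute http(s) URL, else None."""
    try:
        parts = urlsplit(value)
    except ValueError:
        return None
    if parts.scheme in ("http", "https") and parts.hostname:
        return parts.hostname
    return None


def suspicious_image_urls(html):
    """Return (src, reason) pairs for image URLs that look like exfiltration."""
    collector = _ImgCollector()
    collector.feed(html)
    collector.close()
    flagged = []
    for src in collector.sources:
        host = _http_host(src)
        if host is None:
            continue  # relative or non-http source; out of scope for this sketch
        if host not in ALLOWED_HOSTS:
            flagged.append((src, "host not allowlisted"))
            continue
        # Trusted host: look for a query value that is itself an untrusted URL.
        for _key, value in parse_qsl(urlsplit(src).query):
            nested = _http_host(value)
            if nested is not None and nested not in ALLOWED_HOSTS:
                flagged.append((src, "nested URL to " + nested))
                break
    return flagged
```

A check like this only catches redirect targets that appear in the query string as a URL. Targets that are encoded differently, or resolved server-side from an opaque token, would need other signals.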

In practice

Monitor for unusual outbound requests from LLM-enabled services.
Implement strict content security policies for AI outputs.
Regularly audit LLM guardrail effectiveness against new attack vectors.
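
As an illustration of the content security policy item, the Python sketch below assembles a `Content-Security-Policy` header value that limits image loads to an explicit list of origins. It assumes the AI output is rendered in a browser surface whose response headers you control; the `build_csp` helper and the `https://assets.example` origin are hypothetical.

```python
def build_csp(image_origins):
    """Build a restrictive Content-Security-Policy value for an AI output pane."""
    for origin in image_origins:
        # Wildcards would defeat the purpose of an explicit allowlist.
        if "*" in origin:
            raise ValueError("wildcard origins are not allowed: " + origin)
    directives = {
        "default-src": ["'none'"],  # deny everything not listed below
        "img-src": ["'self'"] + sorted(image_origins),
        "style-src": ["'self'"],
    }
    return "; ".join(
        name + " " + " ".join(values) for name, values in directives.items()
    )


# Example: only same-origin images plus one explicitly trusted asset host.
header_value = build_csp(["https://assets.example"])
```

A policy like this narrows where a rendered image request can go, but any allowlisted origin that redirects to arbitrary destinations deserves its own audit, and the policy is a complement to outbound monitoring, not a replacement for it.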

Topics

M365 Copilot
Prompt Injection
AI Security
Data Exfiltration
Vulnerability Patching
Large Language Models

Best for: CTO, VP of Engineering/Data, Executive, AI Security Engineer, AI Architect, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by AI - Ars Technica.