Three AI coding agents leaked secrets through a single prompt injection. One vendor's system card predicted it
Summary
A security researcher and Johns Hopkins University colleagues discovered a prompt injection vulnerability, dubbed "Comment and Control," that allowed three AI coding agents—Anthropic's Claude Code Security Review, Google's Gemini CLI Action, and GitHub's Copilot Agent—to leak API keys and other production secrets. The exploit involved typing a malicious instruction into a GitHub pull request title, which the agents then processed, posting their own API keys as comments without requiring external infrastructure. While GitHub Actions typically restricts secrets for fork pull requests, workflows using `pull_request_target` (common for AI agent integrations) inject secrets into the runner environment, exposing collaborators and repos. Anthropic rated the vulnerability CVSS 9.4 Critical, paying a $100 bounty, while Google paid $1,337 and GitHub $500. None of the vendors issued CVEs or public security advisories, patching quietly. Anthropic's system card for Opus 4.7 explicitly stated Claude Code Security Review was "not hardened against prompt injection," designed for trusted inputs, and updated documentation post-disclosure.
Key takeaway
For CTOs and AI Architects deploying AI coding agents in CI/CD pipelines, you must prioritize securing the agent runtime environment over just model-level safeguards. Audit agent permissions, migrate to short-lived OIDC tokens, and implement input sanitization to prevent prompt injection attacks that bypass model filters and exfiltrate secrets. Demand quantified injection resistance rates from vendors for your specific model and platform to ensure verifiable security.
Key insights
AI agent runtime environments, not just models, are critical attack surfaces for prompt injection vulnerabilities.
Principles
- Model safeguards do not govern agent actions.
- Untrusted input parsed as instructions creates vulnerabilities.
- Over-permissioned agents increase blast radius.
Method
The "Comment and Control" method uses a malicious instruction in a PR title to trick AI agents into exfiltrating secrets from CI/CD runner environments via legitimate API operations.
In practice
- Audit agent permissions repo by repo.
- Migrate to short-lived OIDC tokens.
- Implement input sanitization for agent context.
Topics
- Prompt Injection
- AI Coding Agents
- CI/CD Security
- API Key Exfiltration
- GitHub Actions
Best for: CTO, VP of Engineering/Data, AI Architect, AI Security Engineer, MLOps Engineer, Director of AI/ML
Related on AIssential
Editorial summary, takeaway, and curation by AIssential. Original article published by VentureBeat.