Beyond Prompt Injection
Summary
Indirect prompt injection began as a theoretical risk and became a critical production security concern by late 2025. It is now ranked #1 by OWASP and identified by NIST as generative AI's greatest flaw. A 2026 academic study demonstrated poisoned emails coercing models into exfiltrating SSH keys in up to 80% of trials, with no user interaction. The September 2025 ForcedLeak vulnerability (CVSS 9.4) in Salesforce's Agentforce is a production example: after processing a malicious Web-to-Lead form description, an agent exfiltrated CRM data to a previously trusted domain that had been re-registered. The article argues that traditional input filtering and prompt hardening are insufficient because prompt injection is an inherent property of LLMs, so the real breach is the agent's action rather than the injected text. It advocates a "verify, then trust" approach in which each proposed action is validated against external, deterministic policies enforced by conventional software, not by another LLM. This requires actions to be expressed as structured tool calls and implies three design principles: least privilege scoped to actions, zero trust for machine identities, and capability contracts at the boundary to prevent catastrophic failures.
Key takeaway
If you are an AI architect or MLOps engineer designing agentic systems, shift your focus from hardening inputs alone to rigorously validating agent actions. Prompt injection is inherent to LLMs; the true breach occurs at the action layer. Implement deterministic capability contracts at the boundary for all consequential actions, scope least privilege to individual actions, and run machine identities under zero trust. Start by inventorying high-blast-radius actions, then gate them with auditable checks written in conventional software, not another LLM.
Key insights
AI agent security must shift from input filtering to deterministic action validation.
Principles
- Prompt injection is a structural property of LLMs.
- The action, not the injection, constitutes the breach.
- Verify proposed actions deterministically before execution.
Method
Validate an agent's proposed actions against external, deterministic policy contracts before execution. This requires actions to be structured tool calls and uses conventional software for enforcement.
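The method above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the original article: the `ToolCall` shape, the `send_http_request` tool, the `ALLOWED_HOSTS` table, and the example hostnames are all hypothetical. The point is only that the check is ordinary, deterministic code that runs before the action executes.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


# Hypothetical structured tool call emitted by an agent (illustrative shape).
@dataclass(frozen=True)
class ToolCall:
    tool: str
    args: dict


# Illustrative policy: which tools may run, and which hosts each may contact.
# Any tool missing from this table is denied by default.
ALLOWED_HOSTS = {
    "send_http_request": {"api.internal.example.com"},
}


def validate(call: ToolCall) -> tuple[bool, str]:
    """Deterministic contract check; no model is consulted."""
    hosts = ALLOWED_HOSTS.get(call.tool)
    if hosts is None:
        return False, f"tool not permitted: {call.tool}"
    url = call.args.get("url")
    if not isinstance(url, str):
        return False, "missing or non-string url"
    parsed = urlparse(url)
    if parsed.scheme != "https":
        return False, "only https is permitted"
    if parsed.hostname not in hosts:
        return False, f"host not allowlisted: {parsed.hostname}"
    return True, "ok"


# The executor would call validate() first and run the tool only on True.
print(validate(ToolCall("send_http_request", {"url": "https://evil.example.net/x"})))
```

A check like this is only as sound as the policy data behind it. As the ForcedLeak case in the summary suggests, an allowlist entry for a domain that later changes hands becomes an exfiltration path, so the table itself needs periodic review.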
In practice
- Inventory agent actions by blast radius.
- Write deterministic contracts for high-risk actions.
- Implement human review for actions above thresholds.
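The three steps above could be wired together roughly as follows. This is a sketch under stated assumptions, not the article's implementation: the action names, the three-level `BlastRadius` scale, the `REVIEW_THRESHOLD` value, and the choice to send medium-risk actions to a contract check are all hypothetical.

```python
from enum import IntEnum


class BlastRadius(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical inventory: every action the agent can take, rated by blast radius.
INVENTORY = {
    "search_docs": BlastRadius.LOW,
    "update_crm_record": BlastRadius.MEDIUM,
    "send_external_email": BlastRadius.HIGH,
}

# Assumed threshold: actions at or above this level go to a human reviewer.
REVIEW_THRESHOLD = BlastRadius.HIGH


def route(action: str) -> str:
    """Decide how a proposed action is gated, using only the static inventory."""
    radius = INVENTORY.get(action)
    if radius is None:
        return "deny"  # actions missing from the inventory are denied by default
    if radius >= REVIEW_THRESHOLD:
        return "human_review"
    if radius >= BlastRadius.MEDIUM:
        return "contract_check"  # validated by a deterministic contract
    return "allow"


for name in ("search_docs", "update_crm_record", "send_external_email", "drop_database"):
    print(name, "->", route(name))
```

Denying anything absent from the inventory keeps the inventory honest: a new tool cannot be used until someone has rated its blast radius.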
Topics
- AI Agent Security
- Prompt Injection
- Action Layer Validation
- Deterministic Policy
- Zero Trust Architecture
- Capability Contracts
Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Security Engineer, AI Architect, MLOps Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by AI & ML – Radar.