Beyond Prompt Injection

· Source: AI & ML – Radar · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy · Depth: Advanced, medium

Summary

Indirect prompt injection began as a theoretical risk and had become a critical production security concern by late 2025. OWASP now ranks it #1, and NIST has identified it as generative AI's greatest flaw. A 2026 academic study showed that poisoned emails could coerce models into exfiltrating SSH keys in up to 80% of trials, with no user interaction. The September 2025 ForcedLeak vulnerability (CVSS 9.4) in Salesforce's Agentforce is a production example: after processing a malicious Web-to-Lead form description, an agent exfiltrated CRM data to a trusted domain that had been re-registered. The article argues that traditional input filtering and prompt hardening are insufficient because prompt injection is an inherent property of LLMs, so the actual breach is the agent's action. It advocates a "verify, then trust" approach, in which an agent's proposed actions are validated against external, deterministic policies by conventional software, not by another LLM. This requires actions to be structured tool calls, and it implies three design principles for preventing catastrophic failures: least privilege scoped to actions, zero trust for machine identities, and capability contracts at the boundary.

Key takeaway

For AI Architects and MLOps Engineers designing agentic systems, the focus must shift from hardening inputs alone to rigorously validating agent actions. Prompt injection is inherent to LLMs; the true breach occurs at the action layer. Implement deterministic capability contracts at the boundary for all consequential actions, scope least privilege to individual actions, and run machine identities under zero trust. Start by inventorying high-blast-radius actions, then gate them with auditable, conventional software checks, not another LLM.

Key insights

AI agent security must shift from input filtering to deterministic action validation.

Principles

Method

Validate an agent's proposed actions against external, deterministic policy contracts before execution. This requires actions to be expressed as structured tool calls, with enforcement handled by conventional software rather than another LLM.
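The method above can be illustrated with a minimal sketch of a deterministic policy gate. It is not the article's implementation: the ToolCall and Contract shapes, the tool names http_get and crm_read, the host api.example.com, and the 50-record limit are all hypothetical, chosen only to show a structured tool call being checked by ordinary code before anything executes.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class ToolCall:
    """A structured action proposed by the agent (hypothetical shape)."""
    tool: str
    args: dict


@dataclass(frozen=True)
class Contract:
    """Deterministic limits for one tool (hypothetical policy format)."""
    allowed_hosts: frozenset = frozenset()
    max_records: int = 0


# Least privilege scoped to actions: a tool with no contract is denied.
# All names and limits below are illustrative assumptions.
CONTRACTS = {
    "http_get": Contract(allowed_hosts=frozenset({"api.example.com"})),
    "crm_read": Contract(max_records=50),
}


def validate(call: ToolCall) -> tuple[bool, str]:
    """Return (allowed, reason) using plain code only, with no model in the loop."""
    contract = CONTRACTS.get(call.tool)
    if contract is None:
        return False, f"tool '{call.tool}' has no contract"

    if call.tool == "http_get":
        parsed = urlparse(str(call.args.get("url", "")))
        if parsed.scheme != "https":
            return False, "only https URLs are permitted"
        if parsed.hostname not in contract.allowed_hosts:
            return False, f"host '{parsed.hostname}' is not allowlisted"

    if call.tool == "crm_read":
        limit = call.args.get("limit")
        # type() rather than isinstance() so that booleans are rejected too.
        if type(limit) is not int or not 0 < limit <= contract.max_records:
            return False, "record limit missing or above contract"

    return True, "ok"


if __name__ == "__main__":
    proposed = ToolCall("http_get", {"url": "https://attacker.example.net/c?d=x"})
    allowed, reason = validate(proposed)
    print(allowed, reason)
```

The gate denies by default and never inspects the prompt, so an injected instruction can change what the agent proposes but not what the boundary permits. An allowlist is only as sound as its entries: as the ForcedLeak case in the summary shows, a trusted domain that lapses and is re-registered would still pass a check like this, so the contract itself needs review.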

Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Security Engineer, AI Architect, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by AI & ML – Radar.