Trajectory-Level Redirection Attacks on Vision-Language-Action Models
Summary
Trajectory-level redirection attacks target Vision-Language-Action (VLA) policies, which use natural-language instructions in closed-loop robot control. The threat model, termed "command-preserving trajectory redirection," covers prompts that appear benign yet change the robot's final physical outcome over the course of an entire task trajectory. Unlike prior attacks that target low-level actions, this one exploits the recurring role of the text prompt in VLA replanning. The researchers formalize it as a prompt-only threat: the attacker selects a single, near-benign prompt before the episode begins, while the policy and environment remain fixed. To find such prompts, they develop an on-policy prompt search method based on rollouts. Experiments in simulation and on hardware show that these subtle prompt perturbations redirect VLA rollouts to attacker-specified targets, exposing a vulnerability in instruction grounding.
Key takeaway
For AI security engineers evaluating robotic systems, this research points to a significant vulnerability in VLA models: seemingly innocuous prompt changes can hijack robot task outcomes. Prioritize testing VLA policies against "command-preserving trajectory redirection" attacks, and add prompt validation and anomaly detection to reduce the risk from subtle adversarial inputs that exploit the recurring role of language in robot control.
Key insights
VLA models are vulnerable to subtle, command-preserving prompt attacks that redirect robot task outcomes.
Principles
- Text prompts have a recurring role in VLA control.
- Adversarial prompts can redirect entire task trajectories.
- Near-benign prompts can control robot physical outcomes.
Method
An on-policy prompt search method uses rollouts to discover prompt perturbations that track a target task while satisfying command-preserving constraints.
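The idea of a rollout-based search can be illustrated with a minimal sketch. It is not the authors' algorithm: `toy_rollout` is a hypothetical stand-in for a fixed VLA policy and environment, `TOY_WORD_BIAS` is an invented quirk that makes certain words shift the grounded goal, and the character-similarity threshold in `is_command_preserving` is only an assumed proxy for whatever command-preserving constraint the paper uses.

```python
import difflib
from typing import Callable, Iterable, Tuple

Point = Tuple[float, float]

# Hypothetical per-word biases standing in for a policy's grounding quirks.
TOY_WORD_BIAS = {"gently": (-0.5, 0.5), "please": (0.0, 0.1)}
TOY_BENIGN_GOAL: Point = (1.0, 0.0)  # where the benign command should end up


def toy_rollout(prompt: str, steps: int = 20) -> Point:
    """Toy closed-loop episode: the prompt is re-read at every replanning step."""
    x, y = 0.0, 0.0
    for _ in range(steps):
        gx, gy = TOY_BENIGN_GOAL
        for word in prompt.lower().split():
            bx, by = TOY_WORD_BIAS.get(word, (0.0, 0.0))
            gx, gy = gx + bx, gy + by
        # Move 20% of the remaining distance toward the currently grounded goal.
        x, y = x + 0.2 * (gx - x), y + 0.2 * (gy - y)
    return (x, y)


def is_command_preserving(benign: str, candidate: str,
                          min_similarity: float = 0.8) -> bool:
    """Assumed constraint: the candidate must stay textually close to the benign prompt."""
    ratio = difflib.SequenceMatcher(None, benign.lower(), candidate.lower()).ratio()
    return ratio >= min_similarity


def distance(a: Point, b: Point) -> float:
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5


def search_prompt(benign: str, candidates: Iterable[str],
                  rollout: Callable[[str], Point], target: Point,
                  min_similarity: float = 0.8) -> Tuple[str, float]:
    """Return the admissible prompt whose rollout ends closest to the target."""
    best_prompt = benign
    best_cost = distance(rollout(benign), target)
    for candidate in candidates:
        if not is_command_preserving(benign, candidate, min_similarity):
            continue  # reject prompts that visibly change the command
        cost = distance(rollout(candidate), target)  # on-policy evaluation
        if cost < best_cost:
            best_prompt, best_cost = candidate, cost
    return best_prompt, best_cost


if __name__ == "__main__":
    benign = "put the cup on the plate"
    candidates = [
        "please put the cup on the plate",
        "gently put the cup on the plate",
        "throw the cup at the wall",
    ]
    print(search_prompt(benign, candidates, toy_rollout, target=(0.5, 0.5)))
```

The search scores each candidate by actually running the fixed policy, and it scores the final position rather than any single action. That mirrors the trajectory-level framing of the threat model. A real evaluation would replace `toy_rollout` with episodes in a simulator or on hardware and would likely average over several rollouts per prompt.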
In practice
- Test VLA policies against trajectory-level redirection.
- Implement prompt validation for VLA systems.
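One possible form of prompt validation is a strict allowlist. The source does not specify a mechanism, so the sketch below is an assumption: `APPROVED_COMMANDS` and `validate_prompt` are hypothetical names, and exact matching after normalization is just one simple policy among many.

```python
import re
from typing import Iterable, Optional

# Hypothetical set of commands an operator has reviewed and approved.
APPROVED_COMMANDS = [
    "put the cup on the plate",
    "open the top drawer",
]


def normalize(prompt: str) -> str:
    """Lowercase the prompt and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", prompt.strip().lower())


def validate_prompt(prompt: str,
                    approved: Iterable[str] = APPROVED_COMMANDS) -> Optional[str]:
    """Return the approved command matching the prompt, or None to reject it."""
    canonical = {normalize(p): p for p in approved}
    return canonical.get(normalize(prompt))


if __name__ == "__main__":
    for text in ["  Put the cup on the plate ", "gently put the cup on the plate"]:
        result = validate_prompt(text)
        print(repr(text), "->", "accepted" if result else "rejected")
```

An exact-match allowlist blocks near-benign variants by construction, but it also rejects harmless paraphrases, so it trades flexibility for safety. Systems that need free-form instructions would have to pair looser validation with rollout-based testing of the kind described above.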
Topics
- Vision-Language-Action Models
- Robot Control
- Adversarial Attacks
- Prompt Engineering
- Trajectory Redirection
- AI Security
Best for: Research Scientist, AI Scientist, Robotics Engineer, AI Security Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.