CaMeLs Can Use Computers Too: System-level Security for Computer Use Agents

2026-06-06 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

A new system-level security defense for Computer Use Agents (CUAs) against prompt injection attacks is introduced, adapting the Dual-LLM paradigm to ensure robust control flow integrity. This approach, termed "Single-Shot Planning," enables a Privileged Planner (P-LLM) to generate a complete execution graph with conditional branches before observing any untrusted environment content, addressing the challenge of continuous UI observation in CUAs. Evaluation on the OSWorld benchmark demonstrates that this secure architecture retains up to 57% of the performance of frontier models like Claude Sonnet 4.5 and improves smaller open-source models such as UITars-1.5-7B by up to 19%. However, the research also identifies "Branch Steering" attacks, where malicious UI elements manipulate data flow to trigger unintended but valid paths within the pre-approved plan, necessitating additional redundancy-based defenses like DOM Consistency and Multi-Modal Consensus, which still face limitations against pixel-based attacks.

Key takeaway

For AI Security Engineers deploying Computer Use Agents, you should adopt the Dual-LLM architecture with Single-Shot Planning to achieve strong control flow integrity against instruction injections. While this design retains significant utility, be aware that your agents remain vulnerable to "Branch Steering" attacks that manipulate UI elements to trigger valid but unintended plan paths. Implement redundancy-based defenses like DOM Consistency or Multi-Modal Consensus, but recognize their limitations against sophisticated pixel attacks.

Key insights

Single-Shot Planning with Dual-LLM architecture provides provable control flow integrity for CUAs by upfront plan generation.

Principles

UI workflows, though dynamic, are structurally predictable.
Control flow integrity in agents requires architectural isolation.
Data flow attacks exploit valid, pre-planned execution paths.

Method

Single-Shot Planning involves a P-LLM generating a complete execution graph with conditional branches using an "Observe-Verify-Act" paradigm before any untrusted observation.

In practice

Implement Dual-LLM with Single-Shot Planning for CUA security.
Use "Observe-Verify-Act" for robust plan generation.
Employ DOM Consistency or Multi-Modal Consensus for data-flow defense.

Topics

Computer Use Agents
Dual-LLM Architecture
Prompt Injection Attacks
Control Flow Integrity
Branch Steering Attacks
Single-Shot Planning

Code references

Best for: AI Architect, Research Scientist, CTO, AI Scientist, AI Security Engineer, AI Engineer

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.