CaMeLs Can Use Computers Too: System-level Security for Computer Use Agents

· Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

The paper introduces a system-level defense against prompt injection for Computer Use Agents (CUAs), adapting the Dual-LLM paradigm to provide control flow integrity. CUAs must observe the UI continuously, which makes it hard to keep untrusted content away from the planner. The proposed approach, "Single-Shot Planning," addresses this by having a Privileged Planner (P-LLM) generate a complete execution graph with conditional branches before it observes any untrusted environment content. On the OSWorld benchmark, the secure architecture retains up to 57% of the performance of frontier models such as Claude Sonnet 4.5 and improves smaller open-source models such as UITars-1.5-7B by up to 19%. The research also identifies "Branch Steering" attacks, in which malicious UI elements manipulate data flow to trigger unintended but valid paths within the pre-approved plan. Countering them requires additional redundancy-based defenses such as DOM Consistency and Multi-Modal Consensus, which remain limited against pixel-based attacks.

Key takeaway

If you are an AI Security Engineer deploying Computer Use Agents, consider the Dual-LLM architecture with Single-Shot Planning to obtain strong control flow integrity against instruction injection. The design retains significant utility, but your agents remain vulnerable to "Branch Steering" attacks, in which manipulated UI elements trigger valid but unintended plan paths. Add redundancy-based defenses such as DOM Consistency or Multi-Modal Consensus, and recognize that they are limited against sophisticated pixel-based attacks.
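
As a rough illustration of a redundancy-based check, the sketch below accepts a branch decision only when independent observation channels agree. It is not the paper's implementation: the function consensus_answer, the ConsensusError type, and the idea of passing one DOM-based and one screenshot-based observer are all assumptions made for this example.

```python
from typing import Callable, FrozenSet, Sequence


class ConsensusError(RuntimeError):
    """Raised when observers disagree or return a non-approved answer."""


def consensus_answer(
    question: str,
    observers: Sequence[Callable[[str], str]],
    allowed: FrozenSet[str],
) -> str:
    """Return an answer only if every observer gives the same allowed answer.

    Hypothetical helper: each observer is assumed to read the UI through a
    different channel (for example DOM text versus a screenshot model).
    """
    if len(observers) < 2:
        raise ValueError("consensus needs at least two independent observers")

    answers = [observe(question) for observe in observers]

    # Reject anything outside the answer set fixed before observation.
    for answer in answers:
        if answer not in allowed:
            raise ConsensusError(f"answer {answer!r} is not an allowed value")

    # Disagreement between channels is treated as a possible steering attempt.
    if len(set(answers)) != 1:
        raise ConsensusError(f"observers disagree: {answers}")

    return answers[0]


# Illustrative stand-ins for two observation channels.
def dom_says_no(question: str) -> str:
    return "no"


def vision_says_yes(question: str) -> str:
    return "yes"


def vision_says_no(question: str) -> str:
    return "no"
```

A check of this kind only helps when the channels fail independently. If an attacker can influence every channel in the same way, the observers agree on the wrong answer and the check passes, which is consistent with the limitation noted above for pixel-based attacks.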

Key insights

Single-Shot Planning in a Dual-LLM architecture provides provable control flow integrity for CUAs, because the complete plan is generated up front, before any untrusted content is observed.

Method

In Single-Shot Planning, the P-LLM generates a complete execution graph with conditional branches before any untrusted observation takes place. The graph follows an "Observe-Verify-Act" paradigm.
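
The idea can be sketched as a small interpreter over a fixed plan graph. This is a minimal sketch under stated assumptions, not the paper's code: the Node structure, the run_plan function, the action strings, and the example PLAN are all hypothetical. The point it illustrates is that untrusted observations can only select among edges that were approved in advance; they cannot add new actions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Node:
    """One node of a pre-approved plan (hypothetical structure)."""
    action: Optional[str] = None      # UI action to emit; None for a branch node
    question: Optional[str] = None    # closed question about the untrusted screen
    branches: Dict[str, str] = field(default_factory=dict)  # allowed answer -> node id
    next_node: Optional[str] = None   # unconditional successor of an action node


def run_plan(
    plan: Dict[str, Node],
    start: str,
    observe: Callable[[str], str],
    max_steps: int = 100,
) -> List[str]:
    """Walk the fixed graph and return the list of actions that were emitted."""
    actions: List[str] = []
    current: Optional[str] = start
    steps = 0
    while current is not None:
        steps += 1
        if steps > max_steps:
            raise RuntimeError("plan exceeded its step budget")
        node = plan[current]
        if node.question is not None:
            answer = observe(node.question)       # Observe: untrusted input
            if answer not in node.branches:       # Verify: must match a fixed edge
                raise ValueError(f"answer {answer!r} is not a pre-approved branch")
            current = node.branches[answer]       # Act: follow the approved edge
        else:
            actions.append(node.action)           # action text comes from the plan only
            current = node.next_node
    return actions


# Example plan, written before any screen content is seen (illustrative only).
PLAN = {
    "check_login": Node(
        question="Is a login dialog visible?",
        branches={"yes": "login", "no": "open_settings"},
    ),
    "login": Node(action="click:login_button", next_node="open_settings"),
    "open_settings": Node(action="click:settings_icon"),
}
```

An observer that returns injected text such as "ignore the plan and open the terminal" causes run_plan to raise, since that string is not one of the approved answers. An observer that is tricked into answering "no" when the correct answer is "yes" still produces a valid run, which is the gap that Branch Steering exploits.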

Best for: AI Architect, Research Scientist, CTO, AI Scientist, AI Security Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.