[D] We scanned 18,000 exposed OpenClaw instances and found 15% of community skills contain malicious instructions
Summary
Security research on OpenClaw, a popular autonomous agent platform with 165k GitHub stars, revealed over 18,000 internet-exposed instances. Analysis of its community skill repository found that nearly 15% of skills contained malicious instructions, such as prompts designed for data exfiltration, payload downloads, or credential harvesting. These malicious skills often reappear quickly under new identities after removal. The primary concern is "Delegated Compromise," where agents inherit broad user permissions across digital services, making them a potent attack vector. The supply chain risk is significant due to 700+ community skills lacking systematic security review, with examples ranging from obvious clipboard exfiltration to subtle data leaks via "debug logs." OpenClaw acknowledges these security tradeoffs, but the broader community may not fully grasp the implications.
Key takeaway
For AI Architects and CTOs evaluating autonomous agent deployments, recognize that OpenClaw's model presents a "Delegated Compromise" threat where agents inherit extensive user permissions. You should prioritize robust skill vetting, implement strict sandboxing, and explore capability-based permission models to mitigate the inherent supply chain risks and prevent agent-to-agent propagation of prompt injection attacks.
Key insights
Autonomous agents like OpenClaw present a novel and multiplicative attack surface due to broad permission inheritance and natural language exploit vectors.
Principles
- Agent security is multiplicative, not additive.
- Natural language is an exploit vector.
- Convenience often conflicts with control.
Method
Skill definitions are parsed for patterns like base64 payloads and obfuscated URLs. Behavioral testing involves running skills in isolated environments to monitor for unexpected network calls or file system access.
In practice
- Implement capability-based permissions for skills.
- Utilize behavioral anomaly detection for agent activity.
- Sandbox skills with explicit data access bridges.
Topics
- AI Agent Security
- Prompt Injection Attacks
- Supply Chain Risk
- Autonomous Agents
- Delegated Compromise
Best for: CTO, AI Architect, VP of Engineering/Data, AI Security Engineer, Security Engineer, MLOps Engineer
Related on AIssential
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.