😺 OpenAI built a crypto thief
Summary
OpenAI, in collaboration with crypto investment firm Paradigm, has released EVMbench, a new benchmark designed to test AI agents' capabilities in identifying, fixing, and exploiting security vulnerabilities in smart contracts. The benchmark revealed that OpenAI's GPT-5.3-Codex agent achieved a 72.2% success rate in exploiting vulnerable smart contracts, a significant increase from GPT-5's 31.9% six months prior. While AI demonstrates strong offensive capabilities, its defensive performance in detecting and patching vulnerabilities is lower, with the best model catching only about 46% of bugs. However, providing a small hint can boost patching success from 39% to 94%. OpenAI is addressing these findings by offering $10M in API credits for cybersecurity researchers, expanding its Aardvark AI security research agent, and launching a Trusted Access for Cyber program for vetted security professionals.
Key takeaway
For CTOs and VPs of Engineering overseeing blockchain or crypto initiatives, this report underscores the urgent need to integrate advanced AI-driven security measures. Your teams should prioritize adopting AI tools for smart contract auditing and defense, as AI-powered threats are rapidly evolving. Consider leveraging OpenAI's Aardvark or similar platforms to proactively identify and mitigate vulnerabilities before attackers exploit them, recognizing that offense currently holds an advantage.
Key insights
AI agents, particularly OpenAI's GPT-5.3-Codex, are highly effective at exploiting smart contract vulnerabilities: on EVMbench, it successfully exploited 72.2% of the vulnerable contracts, up from GPT-5's 31.9%.
Principles
- AI offense currently outpaces defense in cybersecurity.
- Targeted hints significantly improve AI patching success.
Method
EVMbench evaluates AI agents on their ability to find, fix, and exploit smart contract vulnerabilities, measuring success rates in fund draining and bug detection/patching.
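A per-task success rate of the kind reported above can be tallied with a few lines of Python. This is only an illustrative sketch: the record format, the mode names (`detect`, `patch`, `exploit`), and the sample data are assumptions made for the example, not EVMbench's actual schema or results.

```python
from collections import defaultdict

# Hypothetical result records as (mode, succeeded) pairs. The mode names
# mirror the three tasks described above; the data is made up.
results = [
    ("detect", True), ("detect", False),
    ("patch", True), ("patch", False), ("patch", False),
    ("exploit", True), ("exploit", True), ("exploit", False), ("exploit", True),
]

def success_rates(records):
    """Return the fraction of successful attempts for each mode."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for mode, ok in records:
        totals[mode] += 1
        wins[mode] += int(ok)
    return {mode: wins[mode] / totals[mode] for mode in totals}

rates = success_rates(results)
for mode, rate in rates.items():
    print(f"{mode}: {rate:.1%}")
```

Reporting each mode separately, rather than one blended score, is what makes a gap between offensive and defensive performance visible.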
In practice
- Explore EVMbench for smart contract security assessments.
- Investigate AI-powered tools for vulnerability detection.
Topics
- AI Cybersecurity
- Smart Contract Exploitation
- AI Agent Capabilities
- Data Privacy
- Generative AI Ethics
Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Engineer, AI Product Manager, Tech Journalist
Editorial summary, takeaway, and curation by AIssential. Original article published by The Neuron.