😺 OpenAI built a crypto thief

· Source: The Neuron · Field: Technology & Digital - Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy, Blockchain & Distributed Ledger Technology · Depth: Intermediate, medium

Summary

OpenAI, in collaboration with crypto investment firm Paradigm, has released EVMbench, a new benchmark that tests how well AI agents identify, fix, and exploit security vulnerabilities in smart contracts. On the benchmark, OpenAI's GPT-5.3-Codex agent achieved a 72.2% success rate in exploiting vulnerable smart contracts, up from GPT-5's 31.9% six months earlier. Defensive performance lags behind offense: in detecting and patching vulnerabilities, the best model caught only about 46% of bugs. A small hint, however, boosts patching success from 39% to 94%. OpenAI is responding to these findings by offering $10M in API credits for cybersecurity researchers, expanding its Aardvark AI security research agent, and launching a Trusted Access for Cyber program for vetted security professionals.
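To put the two jumps quoted in the summary side by side, here is a minimal arithmetic sketch. It uses only the percentages stated above; the helper name `gain` is hypothetical and is not part of EVMbench.

```python
def gain(before: float, after: float) -> tuple[float, float]:
    """Return (percentage-point gain, relative multiple) for two rates given in percent."""
    return round(after - before, 1), round(after / before, 2)


# Exploit success rate: GPT-5 (31.9%) vs. GPT-5.3-Codex (72.2%), as quoted in the summary.
exploit_gain = gain(31.9, 72.2)

# Patching success rate: without a hint (39%) vs. with a small hint (94%).
hinted_patch_gain = gain(39.0, 94.0)

print(f"Exploit: +{exploit_gain[0]} points, {exploit_gain[1]}x")
print(f"Patching with hint: +{hinted_patch_gain[0]} points, {hinted_patch_gain[1]}x")
```

Both changes are more than a doubling: roughly 40 percentage points for exploitation across model generations, and 55 percentage points for patching once a hint is supplied.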

Key takeaway

For CTOs and VPs of Engineering overseeing blockchain or crypto initiatives, this report underscores the urgent need to build AI-driven security review into smart contract development. AI-powered threats are evolving rapidly, so your teams should prioritize adopting AI tools for smart contract auditing and defense. Because offense currently holds the advantage, consider using OpenAI's Aardvark or similar platforms to identify and mitigate vulnerabilities before attackers exploit them.

Key insights

AI agents are highly effective at exploiting smart contract vulnerabilities: OpenAI's GPT-5.3-Codex achieved a 72.2% exploit success rate on EVMbench.

Principles

Method

EVMbench evaluates AI agents on their ability to find, fix, and exploit smart contract vulnerabilities, measuring success rates in fund draining and bug detection/patching.
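As a rough illustration of what "measuring success rates" per capability could look like, the sketch below tallies pass/fail outcomes by task mode. It is not EVMbench's actual harness or scoring code, which this summary does not describe. The `TaskResult` type, the mode labels (`detect`, `patch`, `exploit`), and the sample data are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskResult:
    """One benchmark attempt (hypothetical record shape)."""
    mode: str        # assumed labels: "detect", "patch", or "exploit"
    succeeded: bool  # e.g. bug found, fix held, or funds drained


def success_rates(results: list[TaskResult]) -> dict[str, float]:
    """Return the fraction of successful attempts for each mode."""
    totals: dict[str, int] = defaultdict(int)
    wins: dict[str, int] = defaultdict(int)
    for result in results:
        totals[result.mode] += 1
        wins[result.mode] += int(result.succeeded)
    return {mode: wins[mode] / totals[mode] for mode in totals}


# Made-up outcomes for illustration only; these are not benchmark data.
sample = [
    TaskResult("exploit", True),
    TaskResult("exploit", True),
    TaskResult("exploit", False),
    TaskResult("exploit", True),
    TaskResult("patch", True),
    TaskResult("patch", False),
    TaskResult("detect", False),
    TaskResult("detect", True),
]

rates = success_rates(sample)
for mode, rate in sorted(rates.items()):
    print(f"{mode}: {rate:.1%}")
```

Reporting each mode separately matters here because the summary's headline finding is the gap between them: a single blended score would hide the fact that exploitation outpaces detection and patching.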

Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Engineer, AI Product Manager, Tech Journalist

Editorial summary, takeaway, and curation by AIssential. Original article published by The Neuron.