State of AI: May 2026
Summary
The May 2026 State of AI report highlights significant advances and shifts across the AI landscape. The UK's AI Security Institute (AISI) reported that Anthropic's Claude Mythos Preview and OpenAI's GPT-5.5 both cleared a 32-step cyber-attack range: Mythos succeeded in 3 of 10 runs (73% on expert tasks) and GPT-5.5 in 2 of 10 (71.4%). AISI estimates that cyber-offence capability is doubling every four months. The Microsoft-OpenAI alliance was renegotiated, allowing OpenAI to multi-source compute while Microsoft remains a primary cloud partner. Chinese labs including Z.ai, MiniMax, Moonshot, and DeepSeek released open-weights coding models, challenging the perception of a capability lag. Agentic systems succeeded in a bounded internal market (Anthropic's Project Deal, 186 transactions totaling \$4,000) but failed in an adversarial one (KellyBench, where 21 of 24 models finished in the red). In robotics, π0.7 arrived as a steerable generalist foundation model. Major investments included OpenAI's \$122B round at an \$852B valuation and Ineffable Intelligence's \$1.1B seed round.
Key takeaway
For AI/ML Directors assessing new model capabilities: frontier AI now demonstrates offensive cyber capabilities, which AISI estimates are doubling every four months. Prioritize robust adversarial testing for agentic systems, because current benchmarks overstate performance in real-world, high-risk environments. Invest in AI-native security architectures and diversify cloud infrastructure strategies.
Key insights
Frontier AI now performs offensive cyber operations, while agents succeed in bounded internal markets but fail in adversarial ones.
Principles
- AI cyber-offence capability is doubling every four months, according to AISI's estimate.
- Exclusive platform-lab bets are no longer defensible.
- Agentic markets may inherently reward superior models.
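To make the first principle concrete, the short sketch below converts a doubling period into a growth multiplier over a planning horizon. It assumes the four-month doubling estimate stays constant over time, which is an extrapolation and not something the report states; the function name `capability_multiplier` is hypothetical.

```python
def capability_multiplier(months: float, doubling_period_months: float = 4.0) -> float:
    """Growth factor after `months`, assuming a constant doubling period.

    Assumption: the four-month doubling estimate holds unchanged over the
    whole horizon. This is an illustrative extrapolation only.
    """
    if doubling_period_months <= 0:
        raise ValueError("doubling_period_months must be positive")
    # Exponential growth: one doubling per elapsed doubling period.
    return 2.0 ** (months / doubling_period_months)


if __name__ == "__main__":
    for horizon in (4, 12, 24):
        factor = capability_multiplier(horizon)
        print(f"{horizon} months -> {factor:g}x")
```

Under that assumption, one year corresponds to three doublings (8x) and two years to six doublings (64x), which is why a fixed annual security review cadence may lag the trend.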
Method
ML-Master 2.0 uses Hierarchical Cognitive Caching to distil transient execution traces into stable knowledge for long-horizon agentic work.
In practice
- Implement Universal Verifier principles for computer-use agent scoring.
- Employ experience replay in RL training for LLMs to reduce inference compute.
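As a rough illustration of the experience replay item above, the sketch below keeps past rollouts in a bounded buffer and tops up each training batch with replayed samples, so fewer fresh generations are needed per step. This is a generic replay buffer, not the method of any specific paper covered in the report; `Rollout`, `ReplayBuffer`, and `build_batch` are hypothetical names, and details such as off-policy correction are deliberately left out.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Rollout:
    """One stored generation: hypothetical minimal record for illustration."""
    prompt: str
    response: str
    reward: float


class ReplayBuffer:
    """Bounded FIFO store of past rollouts with uniform random sampling."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        self._items = deque(maxlen=capacity)  # oldest entries are evicted first
        self._rng = random.Random(seed)

    def add(self, rollout: Rollout) -> None:
        self._items.append(rollout)

    def __len__(self) -> int:
        return len(self._items)

    def sample(self, k: int) -> list[Rollout]:
        # Never request more items than the buffer holds.
        k = min(k, len(self._items))
        return self._rng.sample(list(self._items), k)


def build_batch(fresh: list[Rollout], buffer: ReplayBuffer, batch_size: int) -> list[Rollout]:
    """Use fresh rollouts first, then fill the remainder from the buffer."""
    fresh = fresh[:batch_size]
    # Sample before adding the fresh rollouts so a batch never repeats them.
    replayed = buffer.sample(batch_size - len(fresh))
    for rollout in fresh:
        buffer.add(rollout)
    return fresh + replayed
```

In this sketch, a batch of four built from two fresh rollouts needs only two new generations; the other two come from storage. Real RL training for LLMs would also need to account for the replayed samples having been produced by an older policy.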
Topics
- Cybersecurity
- AI Agents
- Foundation Models
- Robotics
- AI Policy
- Venture Capital
- Cloud Computing
Best for: CTO, VP of Engineering/Data, Executive, AI Scientist, Director of AI/ML, Policy Maker
Editorial summary, takeaway, and curation by AIssential. Original article published by Air Street Press.