State of AI: May 2026

2025-10-09 · Source: Air Street Press · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy, Robotics & Autonomous Systems · Depth: Expert, long

Summary

The May 2026 State of AI report highlights significant advancements and shifts across the AI landscape. The UK's AI Security Institute (AISI) revealed that Anthropic's Claude Mythos Preview and OpenAI's GPT-5.5 cleared a 32-step cyber-attack range, with Mythos succeeding in 3 of 10 runs (73% expert tasks) and GPT-5.5 in 2 of 10 (71.4%). AISI estimates cyber-offence capability is doubling every four months. The Microsoft-OpenAI alliance was renegotiated, allowing OpenAI to multi-source compute while Microsoft remains a primary cloud partner. Chinese labs like Z.ai, MiniMax, Moonshot, and DeepSeek released open-weights coding models, challenging the perception of a capability lag. Agentic systems showed success in bounded internal markets (Anthropic's Project Deal, 186 transactions totaling $4,000) but failed in adversarial ones (KellyBench, 21 of 24 models in the red). Robotics saw the arrival of π0.7, a steerable generalist foundation model. Major investments included OpenAI's $122B round at an $852B valuation and Ineffable Intelligence's $1.1B seed round.

Key takeaway

For AI/ML Directors assessing new model capabilities: frontier AI now demonstrates offensive cyber capability, which AISI estimates is doubling every four months. Prioritize robust, adversarial testing for agentic systems, as current benchmarks overstate performance in real-world, high-risk environments. Invest in AI-native security architectures and diversify cloud infrastructure strategies.

Key insights

Frontier AI now performs offensive cyber operations, while agent performance varies significantly by market type.

Principles

By AISI's estimate, AI cyber-offence capability doubles every four months.
Exclusive platform-lab bets are no longer defensible.
Agentic markets may inherently reward superior models.
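
To make the four-month doubling estimate concrete, the short sketch below compounds it over longer horizons. It assumes the doubling period stays constant, which is an extrapolation for illustration and not something the report guarantees; the function name growth_factor is hypothetical.

```python
def growth_factor(months: float, doubling_period_months: float = 4.0) -> float:
    """Multiplicative growth after `months`, assuming a fixed doubling period."""
    return 2.0 ** (months / doubling_period_months)


# Assumption: the four-month doubling period holds unchanged over the horizon.
for months in (4, 12, 24):
    print(f"{months:>2} months: {growth_factor(months):.0f}x")
# prints:
#  4 months: 2x
# 12 months: 8x
# 24 months: 64x
```

Under that assumption, a capability measured today would be about 8 times higher in a year and 64 times higher in two, which is why a static annual security review may lag the threat.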

Method

ML-Master 2.0 uses Hierarchical Cognitive Caching to distil transient execution traces into stable knowledge for long-horizon agentic work.

In practice

Implement Universal Verifier principles for computer-use agent scoring.
Employ experience replay in RL training for LLMs to reduce inference compute.
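
As a rough illustration of the experience replay point above, the sketch below stores past rollouts and reuses them to top up training batches, so fewer fresh generations are needed per step. It is a generic replay buffer and not the specific method covered in the report; Rollout, ReplayBuffer, and build_batch are hypothetical names, and a real trainer would also need to account for replayed samples being off-policy.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Rollout:
    """One sampled completion with its scalar reward (hypothetical schema)."""
    prompt: str
    completion: str
    reward: float


class ReplayBuffer:
    """Fixed-capacity store of past rollouts; the oldest entries are evicted first."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        self._items: deque = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, rollouts: list) -> None:
        self._items.extend(rollouts)

    def sample(self, k: int) -> list:
        # Never request more items than the buffer currently holds.
        k = min(k, len(self._items))
        return self._rng.sample(list(self._items), k)

    def __len__(self) -> int:
        return len(self._items)


def build_batch(fresh: list, buffer: ReplayBuffer, batch_size: int) -> list:
    """Use fresh rollouts first, top up from the buffer, then store the fresh ones."""
    replayed = buffer.sample(max(0, batch_size - len(fresh)))
    buffer.add(fresh)
    return fresh + replayed
```

In this sketch, each training step only has to generate as many new rollouts as the compute budget allows; the remainder of the batch comes from stored experience.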

Topics

Cybersecurity
AI Agents
Foundation Models
Robotics
AI Policy
Venture Capital
Cloud Computing

Best for: CTO, VP of Engineering/Data, Executive, AI Scientist, Director of AI/ML, Policy Maker

Editorial summary, takeaway, and curation by AIssential. Original article published by Air Street Press.