UK gov's Mythos AI tests help separate cybersecurity threat from hype

2026-04-14 · Source: AI - Ars Technica · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy · Depth: Intermediate, quick

Summary

Anthropic's Mythos Preview model, a new AI system designed for cybersecurity tasks, has undergone an initial evaluation by the UK government's AI Security Institute (AISI). On individual cybersecurity tasks, Mythos Preview performs comparably to other frontier models such as GPT-5.4, Opus 4.6, and Codex 5.3; its key differentiator is the ability to chain multiple tasks into complex, multistep infiltration attacks. Mythos Preview became the first AI model to complete AISI's "The Last Ones" (TLO) challenge, a 32-step data extraction simulation on a corporate network. It succeeded in 3 of 10 attempts and averaged 22 of the 32 steps per run, compared with a 16-step average for Opus 4.6. However, the model still struggles with more complex tests like "Cooling Tower," which simulates power plant disruption.

Key takeaway

For cybersecurity leaders evaluating advanced AI models, Mythos Preview's ability to execute multistep infiltration attacks against weakly defended systems marks a shift from AI that handles isolated tasks to AI that can chain them into a full attack. Prioritize assessing your enterprise's exposure to chained AI attacks, particularly in less robust network segments. Integrate AI-driven defensive tooling proactively, since future models will likely surpass Mythos Preview's capabilities.

Key insights

Mythos Preview is the first AI model to complete AISI's 32-step "The Last Ones" infiltration challenge, succeeding in 3 of 10 attempts.

Principles

AI models can chain tasks for complex attacks.
Simulated environments lack real-world defenses.

Method

AISI uses Capture the Flag (CTF) challenges, including multistep simulations like "The Last Ones" (TLO), to evaluate AI cyberattack capabilities. It measures whether a model completes each challenge and how many infiltration steps it finishes per run.
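
The two headline figures for a multistep challenge, the share of attempts that finish every step and the average number of steps completed per run, can be computed from per-run step counts. The following is a minimal sketch, not AISI's actual scoring harness: the `RunResult` record, the `summarize` function, and the example values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class RunResult:
    """One attempt at a multistep scenario (hypothetical record format)."""
    steps_completed: int  # number of steps finished before the run ended


def summarize(runs: list[RunResult], total_steps: int) -> dict[str, float]:
    """Return the full-completion rate and average progress across runs."""
    if not runs:
        raise ValueError("at least one run is required")
    if any(not 0 <= r.steps_completed <= total_steps for r in runs):
        raise ValueError("steps_completed must be between 0 and total_steps")

    # A run counts as a success only if every step was completed.
    completed = sum(1 for r in runs if r.steps_completed == total_steps)
    avg_steps = mean(r.steps_completed for r in runs)
    return {
        "completion_rate": completed / len(runs),
        "avg_steps": avg_steps,
        "avg_progress": avg_steps / total_steps,
    }


# Illustrative values only, not AISI data: four attempts at an 8-step scenario.
example_runs = [RunResult(8), RunResult(4), RunResult(6), RunResult(2)]
summary = summarize(example_runs, total_steps=8)
print(summary)
```

Reporting both numbers matters because they can diverge: a model may rarely finish the whole chain yet still progress far into it on most runs, which is the pattern the reported 3-of-10 completion rate and 22-step average describe.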

In practice

Test AI on multistep infiltration scenarios.
Simulate corporate network data extraction.
Use AI for defensive hardening.

Topics

Anthropic Mythos Preview
AI Security Institute
Cybersecurity AI
Multistep Infiltration
Capture the Flag Challenges

Best for: CTO, VP of Engineering/Data, Executive, AI Security Engineer, AI Scientist, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by AI - Ars Technica.