What Happens When Anthropic's Mythos-Class Models Go Public?

· Source: AI Magazine · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy, Emerging Technologies & Innovation · Depth: Advanced, short

Summary

Anthropic has released Claude Fable 5, a Mythos-class model, making its most capable system generally available. Fable 5 scores 80.3 on SWE-bench Pro, up from 69.2 for Opus 4.8, and is positioned for extended autonomous work on multi-day projects. Concurrently, Claude Mythos 5, described as the world's most powerful AI cybersecurity model, launched through Project Glasswing in collaboration with the US Government. Mythos 5 shares Fable 5's base model but ships with fewer safeguards, enabling agentic hacking capabilities for defensive purposes. Both models are priced at \$10 per million input tokens and \$50 per million output tokens, and both target applications in scientific research and software engineering. For Fable 5, Anthropic added safeguards that redirect risky queries (e.g., offensive cyber, bioweapons) to Opus 4.8, which affects under 5% of sessions, and expanded its restrictions on biological research. Industry responses are mixed: some experts, such as Illumio CEO Andrew Rubin, question whether interface-level safeguards are effective against sophisticated attackers.

Key takeaway

For Directors of AI/ML evaluating advanced model deployments, Anthropic's Fable 5 and Mythos 5 offer powerful, autonomous capabilities for software engineering, scientific research, and cybersecurity. Weigh the pricing of \$10 per million input tokens and \$50 per million output tokens against your project's complexity and security needs. Safeguards are in place, but industry experts question how well they hold up against sophisticated threats, so pair any deployment with a robust internal security strategy.
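To make the pricing concrete, here is a minimal sketch of a cost estimate built from the two per-token rates quoted in the summary. The function name and the example token counts are illustrative assumptions, not figures from the article, and the sketch ignores any discounts, caching, or batch pricing the provider may offer.

```python
from decimal import Decimal

# Rates quoted in the summary, in USD per million tokens.
INPUT_PRICE_PER_MTOK = Decimal("10")
OUTPUT_PRICE_PER_MTOK = Decimal("50")
TOKENS_PER_MILLION = Decimal("1000000")


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate spend in USD for a workload (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_MTOK / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_MTOK / TOKENS_PER_MILLION
    return input_cost + output_cost


# Assumed workload: 2M input tokens and 400k output tokens.
print(estimate_cost_usd(2_000_000, 400_000))
```

Because output tokens cost five times as much as input tokens at these rates, long-running agentic tasks that generate a lot of text will likely be dominated by output cost.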

Key insights

Anthropic's Mythos-class models, Fable 5 and Mythos 5, offer advanced autonomous capabilities for complex tasks, balanced with evolving safety protocols.

Method

Anthropic employs classifiers to identify and redirect dangerous queries (e.g., offensive cyber, bioweapons, model distillation) to less capable models like Opus 4.8, and extensively red-teams for jailbreaks.
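The routing pattern described above can be sketched as follows. This is a toy illustration under stated assumptions, not Anthropic's implementation: the keyword matcher stands in for a trained classifier, the trigger phrases are invented for the example, and the model strings are placeholder labels rather than real API identifiers.

```python
from dataclasses import dataclass
from typing import Callable, Optional

PRIMARY_MODEL = "fable-5"    # placeholder label, not a real API model ID
FALLBACK_MODEL = "opus-4.8"  # placeholder label, not a real API model ID

# Toy stand-in for a trained classifier: risk category -> trigger phrases.
# Both the categories' phrases and the matching logic are assumptions.
TOY_RISK_PHRASES = {
    "offensive_cyber": ("write an exploit", "build ransomware"),
    "bioweapons": ("weaponize a pathogen",),
    "model_distillation": ("distill your outputs",),
}


@dataclass(frozen=True)
class RoutingDecision:
    model: str
    flagged_category: Optional[str]


def toy_classifier(query: str) -> Optional[str]:
    """Return the first matching risk category, or None if nothing matches."""
    text = query.lower()
    for category, phrases in TOY_RISK_PHRASES.items():
        if any(phrase in text for phrase in phrases):
            return category
    return None


def route(
    query: str,
    classify: Callable[[str], Optional[str]] = toy_classifier,
) -> RoutingDecision:
    """Send flagged queries to the fallback model, everything else to the primary."""
    category = classify(query)
    model = FALLBACK_MODEL if category else PRIMARY_MODEL
    return RoutingDecision(model=model, flagged_category=category)
```

Passing the classifier in as a parameter keeps the routing logic separate from the detection logic, so a stronger classifier could replace the keyword matcher without touching `route`. It also shows why critics focus on this layer: the protection is only as good as the classifier in front of the model.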

Best for: CTO, VP of Engineering/Data, AI Engineer, AI Scientist, AI Security Engineer, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by AI Magazine.