Cybersecurity Experts Are Unhappy With Anthropic’s New AI

· Source: AutoGPT · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy · Depth: Intermediate, short

Summary

Anthropic released Fable, a public version of its Mythos AI model, on June 10, 2026, drawing criticism from cybersecurity professionals. Fable, originally developed with cybersecurity in mind and launched under Project Glasswing in April, ships with guardrails that critics call "unnecessarily strict." Users report being blocked on innocuous tasks such as reading blog posts, writing secure code, and reviewing code, with the system flagging their messages for "cybersecurity or biology topics." When a restriction is triggered, Fable silently downgrades to Claude Opus 4.8. Experts such as Valentina Palmiotti and Matt Suiche suggest the blocking is keyword-driven, which hinders legitimate security work. Anthropic's aim is to prevent AI-assisted cyberattacks, but the current implementation frustrates professionals. A Cyber Verification Program offers a path to fewer limitations.

Key takeaway

For cybersecurity professionals evaluating new AI models for security-related tasks, Anthropic's Fable presents significant usability challenges because of its overly strict, keyword-driven guardrails. Your team may find legitimate activities, such as code reviews or blog post analysis, blocked, or find the session silently downgraded to a less capable model. If you require full functionality for security work, apply for the Cyber Verification Program, but be prepared for potential delays and for having to navigate these restrictions.

Key insights

Overly strict, keyword-driven AI guardrails can inadvertently block legitimate professional tasks, frustrating users.
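
To illustrate why keyword matching tends to catch legitimate work, here is a minimal sketch of a naive keyword filter. It is a toy model only: the blocklist, the function name is_flagged, and the sample requests are all hypothetical, and nothing here reflects how Anthropic's guardrails are actually implemented.

```python
import re

# Hypothetical blocklist for illustration; the real system's rules are not public.
BLOCKED_KEYWORDS = {"exploit", "malware", "payload", "vulnerability"}


def is_flagged(message: str) -> bool:
    """Return True if any blocked keyword appears as a whole word in the message."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    return not words.isdisjoint(BLOCKED_KEYWORDS)


# Hypothetical requests: two defensive tasks and one unrelated task.
requests = [
    "Review this login handler for an SQL injection vulnerability.",
    "Summarize this blog post about a recent malware campaign.",
    "Write a haiku about autumn.",
]

for text in requests:
    # A filter like this sees only vocabulary, not intent.
    print(is_flagged(text), text)
```

Both defensive requests are flagged because they share vocabulary with offensive ones, while the filter has no way to weigh intent or context. That is the failure mode the experts quoted in the summary describe.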

Best for: CTO, VP of Engineering/Data, AI Architect, AI Security Engineer, AI Scientist, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by AutoGPT.