Cybersecurity Experts Are Unhappy With Anthropic’s New AI

2026-06-10 · Source: AutoGPT · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy · Depth: Intermediate, short

Summary

Anthropic released Fable, a public version of its Mythos AI model, on June 10, 2026, and it quickly drew criticism from cybersecurity professionals. Mythos was developed with cybersecurity in mind and launched under Project Glasswing in April; critics call Fable's guardrails "unnecessarily strict." Users report being blocked on innocuous tasks such as reading blog posts, writing secure code, and reviewing code, with the system flagging their messages for "cybersecurity or biology topics." When a restriction is triggered, Fable silently downgrades to Claude Opus 4.8. Experts including Valentina Palmiotti and Matt Suiche suggest the filtering is keyword-driven, which hinders legitimate security work. Anthropic's aim is to prevent AI-assisted cyberattacks, but the current implementation frustrates professionals. A Cyber Verification Program offers a path to fewer limitations.

Key takeaway

If you are evaluating new AI models for security work, expect significant usability problems with Anthropic's Fable: its guardrails are strict and appear to be keyword-driven. Legitimate activities, such as code reviews or reading security blog posts, may be blocked or silently downgraded to Claude Opus 4.8. If your team needs full functionality for security work, apply for the Cyber Verification Program, but be prepared for potential delays and for working around these restrictions in the meantime.

Key insights

Overly strict, keyword-driven AI guardrails can inadvertently block legitimate professional tasks, frustrating users.

Principles

AI safety measures can inadvertently impede legitimate professional use.
Keyword-based content filtering can be overly broad and imprecise.
Gradual relaxation of AI guardrails may be a viable release strategy.
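
To illustrate why keyword-based filtering tends to be overly broad, here is a minimal sketch of a naive keyword filter. It is not Anthropic's implementation, which is not public; the blocklist, the function name `naive_keyword_filter`, and the sample requests are all hypothetical and chosen only to show how defensive requests can trip such a filter.

```python
import re

# Hypothetical blocklist for illustration; the terms any real filter uses are not public.
BLOCKED_KEYWORDS = {"exploit", "malware", "payload", "vulnerability"}


def naive_keyword_filter(message: str) -> set[str]:
    """Return the blocked keywords found in the message (empty set means allowed)."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    return words & BLOCKED_KEYWORDS


# Benign, defensive requests that still contain "sensitive" words.
benign_requests = [
    "Review this login handler for any SQL injection vulnerability.",
    "Summarize this blog post about how the malware was detected.",
    "Write a function that validates the JSON payload size.",
]

for request in benign_requests:
    hits = naive_keyword_filter(request)
    status = "BLOCKED" if hits else "allowed"
    print(f"{status}: {request} (matched: {sorted(hits)})")
```

A filter like this matches on vocabulary, not intent, so each of the three benign requests above is flagged by a single word.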

In practice

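Apply for Anthropic's Cyber Verification Program if your team needs fewer restrictions for security work.
Anticipate keyword-driven content filters that may flag innocuous security-related requests.
Be aware that a triggered restriction can silently downgrade the session to Claude Opus 4.8.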
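
One way to guard against silent downgrades is to compare the model you requested with the model the response says it came from. The sketch below assumes the response metadata exposes a `model` field; the source does not say whether Fable's downgrade is visible there, and the helper name and model identifiers are hypothetical placeholders.

```python
def served_by_requested_model(requested_model: str, response: dict) -> bool:
    """Return True only if the response metadata names the model that was requested."""
    served_model = response.get("model")  # assumed field; may be absent
    return served_model == requested_model


# Hypothetical identifiers and response metadata, for illustration only.
requested = "fable-example"
response = {"model": "opus-example", "text": "..."}

if not served_by_requested_model(requested, response):
    print(f"warning: requested {requested!r} but was served by {response.get('model')!r}")
```

Logging this check per request makes a downgrade visible in your own tooling even when the product does not announce it.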

Topics

Anthropic
Fable AI
AI Guardrails
Cybersecurity
Content Filtering
Cyber Verification Program

Best for: CTO, VP of Engineering/Data, AI Architect, AI Security Engineer, AI Scientist, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by AutoGPT.