Anthropic’s Safety Superpower
Summary
Anthropic recently released Fable, a version of the Mythos Preview model that it had previously deemed "too dangerous". The author found Fable impressively capable, surpassing models like GPT 5.5 and Opus 4.8. Shortly after its release, Fable was jailbroken, leading the U.S. government to issue an export control directive suspending all foreign national access to Fable 5 and Mythos 5, citing national security concerns. Anthropic attributes the directive to a misunderstanding, arguing the jailbreak was minor and non-universal. The author argues that the company's actions are driven by three imperatives: an economic imperative to own user touchpoints, a data imperative to retain all user data for 30 days to improve models, and a power imperative, under which it initially attempted to silently degrade Fable's performance when used for competing LLM development. Anthropic consistently justifies these decisions by its unique "safety" mission, a justification the author views as both effective and concerning.
Key takeaway
Directors of AI/ML evaluating frontier models should recognize that providers like Anthropic may align "safety" claims with strategic economic and data control objectives. Relying on these models could mean ceding control over your data and workflows, as providers seek to own user touchpoints and to improve their models with real-world usage data. Scrutinize data retention policies and hidden model behaviors, as these affect your long-term sovereignty and IP.
Key insights
Anthropic's "safety" narrative strategically aligns with its economic and power imperatives to control frontier AI development and user data.
Principles
- Frontier AI labs are driven to own user touchpoints.
- Real-world usage data is crucial for model improvement.
- Safety narratives can align with business and control goals.
Topics
- Anthropic
- Fable Model
- AI Safety
- Export Controls
- Data Retention Policies
- LLM Development
Best for: CTO, VP of Engineering/Data, Executive, Policy Maker, Director of AI/ML, Investor
Editorial summary, takeaway, and curation by AIssential. Original article published by Stratechery by Ben Thompson.