Mystery solved: Anthropic reveals changes to Claude's harnesses and operating instructions likely caused degradation

2026-04-23 · Source: VentureBeat · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Data Science & Analytics · Depth: Intermediate, quick

Summary

Anthropic has addressed user reports of "AI shrinkflation" in its Claude models, which suggested a degradation in reasoning capability, increased hallucinations, and token waste. The company published a technical post-mortem identifying three product-layer changes, not underlying model weight regressions, as the cause. These included a default reasoning effort change from `high` to `medium` for Claude Code on March 4, a caching logic bug introduced on March 26 that cleared thinking history too frequently, and system prompt verbosity limits added on April 16. These issues affected Claude Code CLI, Claude Agent SDK, and Claude Cowork, but not the Claude API. Anthropic has since reverted the reasoning effort and verbosity prompt changes, fixed the caching bug in v2.1.116, and reset subscriber usage limits.

Key takeaway

For AI architects and engineering leads deploying large language models, this incident highlights the critical impact of "harness" or product-layer changes on perceived model performance. Ensure your teams implement robust internal testing, comprehensive evaluation suites for prompt modifications, and strict gating for model-specific adjustments to prevent unintended regressions and maintain user trust. Proactive communication and transparency are key to managing community expectations.

Key insights

Product-layer changes, not model degradation, caused Claude's perceived performance issues.

Principles

Harness changes impact model behavior.
Caching logic can degrade model memory.
Prompt changes alter model quality.

Method

Anthropic identified three specific product-layer changes: default reasoning effort, a caching logic bug, and system prompt verbosity limits. They resolved issues by reverting changes and fixing the bug.

In practice

Implement internal dogfooding for public builds.
Run broad evaluation suites for prompt changes.
Gate model-specific changes strictly.

Topics

Claude AI Degradation
Anthropic Post-Mortem
AI Performance Benchmarks
System Prompt Engineering
Caching Logic Bugs

Code references

anthropics/claude-code

Best for: CTO, VP of Engineering/Data, AI Architect, Machine Learning Engineer, AI Engineer, Director of AI/ML

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by VentureBeat.