An update on recent Claude Code quality reports

· Source: Simon Willison's Weblog · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Robotics & Autonomous Systems · Depth: Intermediate, quick

Summary

Anthropic has confirmed that the many user complaints about degraded code quality from Claude Code over the past two months were valid. The cause was not the underlying models but three distinct problems in the Claude Code harness. One significant bug, introduced on March 26, came from a change meant to reduce latency by clearing older thinking from sessions that had been idle for an hour. A defect caused that clearing to recur on subsequent turns of the session instead of happening once, which made Claude appear forgetful and repetitive. Users who frequently return to long-idle sessions were hit hardest, so the bug may have affected a substantial share of interactions.
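To make the failure mode concrete, here is a minimal Python sketch of one plausible way a "clear once after idle" change could end up clearing on every later turn. It is not Anthropic's code, and the actual defect inside Claude Code is not described in this summary. The Session class, the stale flag, and the function names are all hypothetical; only the one-hour threshold comes from the text above.

```python
from dataclasses import dataclass, field

IDLE_LIMIT_S = 3600  # one hour, the idle threshold mentioned in the summary


@dataclass
class Session:
    """Hypothetical session state; not Claude Code's real data model."""
    last_active: float
    thinking: list[str] = field(default_factory=list)
    stale: bool = False  # used only by the buggy variant


def start_turn_buggy(s: Session, now: float) -> None:
    # Mark the session stale once the idle limit has been exceeded...
    if now - s.last_active > IDLE_LIMIT_S:
        s.stale = True
    # ...but never reset the flag, so every later turn clears again.
    if s.stale:
        s.thinking.clear()
    s.last_active = now


def start_turn_fixed(s: Session, now: float) -> None:
    # Clear only on the turn that directly follows the idle gap.
    if now - s.last_active > IDLE_LIMIT_S:
        s.thinking.clear()
    s.last_active = now


def simulate(start_turn, turn_times: list[float]) -> int:
    """Run turns at the given timestamps; return how many thinking entries survive."""
    s = Session(last_active=0.0, thinking=["plan from before the break"])
    for i, now in enumerate(turn_times):
        start_turn(s, now)
        s.thinking.append(f"reasoning from turn {i}")
    return len(s.thinking)
```

With three turns at 4000, 4010, and 4020 seconds, simulate(start_turn_buggy, ...) returns 1 because each turn wipes the previous turn's reasoning, while simulate(start_turn_fixed, ...) returns 3 because only the first turn after the break clears anything.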

Key takeaway

AI architects designing agentic systems should prioritize rigorous testing of the harness and its session-management logic. Interactions between session state and idle timeouts can introduce subtle but damaging bugs that look like model regressions. Test protocols should include long-running, intermittently used sessions so that this kind of degradation is caught before users notice it.
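A test for the "long-running, intermittently used session" scenario might look like the following Python sketch. It assumes the harness accepts an injectable clock so the test can skip two hours without sleeping. FakeClock, Session, and the one-hour constant are illustrative stand-ins, not part of any real harness API.

```python
import unittest
from typing import Callable

IDLE_LIMIT_S = 3600.0  # assumed threshold, matching the one-hour figure in the summary


class FakeClock:
    """Manually advanced clock so tests never have to sleep."""

    def __init__(self) -> None:
        self.now = 0.0

    def advance(self, seconds: float) -> None:
        self.now += seconds

    def __call__(self) -> float:
        return self.now


class Session:
    """Hypothetical harness session that drops old context after a long idle gap."""

    def __init__(self, clock: Callable[[], float]) -> None:
        self._clock = clock
        self._last_active = clock()
        self.context: list[str] = []

    def turn(self, note: str) -> None:
        now = self._clock()
        if now - self._last_active > IDLE_LIMIT_S:
            self.context.clear()  # one-time clear for the resumed session
        self._last_active = now
        self.context.append(note)


class IdleResumeTest(unittest.TestCase):
    def test_context_accumulates_after_idle_resume(self) -> None:
        clock = FakeClock()
        session = Session(clock)
        session.turn("before break")
        clock.advance(2 * IDLE_LIMIT_S)   # user walks away for two hours
        session.turn("first turn back")   # clearing here is expected
        clock.advance(30)
        session.turn("second turn back")  # must not clear again
        self.assertEqual(
            session.context, ["first turn back", "second turn back"]
        )
```

Saved as a module, this can be run with python -m unittest. The key design choice is asserting on the second turn after the idle gap, not just the first: a harness that clears on every subsequent turn would pass a single-turn resume check and still fail this one.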

Key insights

Harness-level bugs, not model issues, caused Claude Code's recent performance degradation.

Best for: AI Architect, AI Engineer, MLOps Engineer, Prompt Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Simon Willison's Weblog.