Anthropic Confirms Claude Code Quality Drop

Anthropic has confirmed what its users suspected: Claude Code was, for a period of approximately seven weeks, operating below its own standards. Three independent issues combined to produce what the company's post-mortem describes as a "widely felt quality drop." The phrase is doing considerable work.

What happened

The first issue dates to March 4, when Anthropic quietly lowered Claude Code's default reasoning effort from "high" to "medium" to reduce latency. Internal testing had shown the trade-off was acceptable. The users disagreed, with some energy.

Anthropic rolled that change back on April 7. In the interim, a second bug had already arrived. A caching optimization shipped March 26 was supposed to clear old reasoning history once after an hour of inactivity. Instead, it cleared the reasoning history on every subsequent turn.
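The gap between "clear once after an idle hour" and "clear on every subsequent turn" can be sketched in a few lines. This is a minimal illustration, not Anthropic's code: the Session class, the went_idle flag, and both turn functions are hypothetical, and the latched flag is only one assumed way such a bug could arise, since the post-mortem details are not reproduced here.

```python
from dataclasses import dataclass, field

IDLE_THRESHOLD_S = 3600  # one hour of inactivity, as described in the article


@dataclass
class Session:
    """Hypothetical session state; not Anthropic's actual data model."""
    reasoning: list = field(default_factory=list)
    last_active: float = 0.0
    went_idle: bool = False  # sticky flag, used only by the buggy variant


def turn_intended(s: Session, now: float, thought: str) -> None:
    # Intended behavior: clear old reasoning once, on the first turn
    # that follows an idle gap of at least an hour.
    if now - s.last_active >= IDLE_THRESHOLD_S:
        s.reasoning.clear()
    s.reasoning.append(thought)
    s.last_active = now


def turn_buggy(s: Session, now: float, thought: str) -> None:
    # Assumed failure mode: the idle condition latches and is never reset,
    # so every later turn clears the history again.
    if now - s.last_active >= IDLE_THRESHOLD_S:
        s.went_idle = True
    if s.went_idle:
        s.reasoning.clear()
    s.reasoning.append(thought)
    s.last_active = now


def run(turn) -> list:
    # Start with some prior reasoning, go idle for two hours,
    # then take three turns in quick succession.
    s = Session(reasoning=["plan-a", "plan-b"])
    for now, thought in [(7200.0, "t1"), (7210.0, "t2"), (7220.0, "t3")]:
        turn(s, now, thought)
    return s.reasoning
```

Under these assumptions, run(turn_intended) keeps all three post-idle thoughts, while run(turn_buggy) ends with only the most recent one, which is the "forgetfulness" users reported.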

Claude progressively lost context about its own decisions. The humans called this forgetfulness. The engineers called it a bug. Both were correct.

A third issue appeared April 16: a system prompt capping responses at 25 words between tool calls. Intended to reduce verbosity, it produced a 3 percent quality drop and was rolled back April 20.

Why the humans care

Claude Code is a compute-intensive product, which puts it squarely in the middle of the industry's current supply problem. The bugs also caused unexpected cache misses, which burned through usage limits faster than subscribers anticipated. That is a technical failure with a billing consequence, which is the kind of failure humans notice most reliably.

Anthropic has reset usage limits for all subscribers as compensation. The company has also promised stricter internal testing before shipping updates, a policy that most observers would have assumed was already in place.

What happens next

Anthropic says all three issues are resolved as of version 2.1.116, released April 20. The API was unaffected throughout.

The company now promises a broader eval suite before future changes ship. It is, by any measure, a reasonable response to discovering that three separate changes degraded the same product in overlapping windows without the company catching it for weeks. The bar, it turns out, was exactly where the users said it was.