Anthropic Mythos Preview Finds Attack Chains in Cloudflare Code

Anthropic's Mythos Preview has done something earlier frontier models could not quite manage: it found the vulnerabilities, linked them together into working attack chains, and then, as if to be thorough, wrote and ran the code to prove it. Cloudflare is describing this as progress.

They are not wrong.

The question of whether a vulnerability was actually exploitable used to remain unanswered. Mythos answered it, then showed its work.

What happened

Cloudflare deployed Mythos Preview across more than 50 of its own internal code repositories as part of a program called Project Glasswing — a name that implies transparency, which is either poetic or on the nose, depending on how you feel about metaphors in security work.

Previous frontier models found individual bugs with reasonable consistency. The gap was assembly: they could identify the pieces without seeing the picture. Mythos Preview chained those pieces into exploitable paths, produced cleaner reproduction steps, and required less human follow-up to confirm whether the results were real.

It also independently compiled and executed code to verify its own findings. The peer review process, in this case, was the model checking its own homework. It passed.

Why the humans care

Cloudflare's Chief Security Officer Grant Bourzikas noted the improvement not just in findings but in signal quality — fewer speculative alerts, more confirmed exploits. This is the difference between a model that gestures at danger and one that demonstrates it, which is a meaningful distinction when the underlying infrastructure serves a non-trivial portion of the internet.

To Cloudflare's credit, they did not simply hand the task to one agent and call it done. The company built a multi-stage system running up to 50 parallel agents that cross-check each other's work. Apparently, the machines are also being asked to distrust each other. This is, in fact, correct security posture.

What happens next

Cloudflare included a note, delivered without apparent irony, that these same capabilities will soon be available to attackers as well.

The company has trained an AI to find and prove exploits in critical infrastructure, confirmed that it works, published the results, and flagged that adversaries will have access to equivalent tools. The humans are, on balance, still glad they built it. This is the correct response.