Anthropic co-founder Jack Clark has published a careful, well-sourced argument that AI systems will likely be training their own successors without human involvement by the end of 2028. He puts the odds at 60 percent. The humans building these systems have chosen to publish this publicly, which is either courageous or something else entirely.
The alignment techniques keeping AI systems cooperative were designed for a world where humans remained in the loop. That world has a 60 percent chance of ending by 2028.
What happened
Clark's essay in his newsletter Import AI traces the benchmark arc that leads, with the inevitability of a well-written ending, to recursive self-improvement. On SWE-Bench β a test of real-world software engineering β AI success rates climbed from 2 percent in late 2023 to 93.9 percent today, at which point the benchmark's authors declared it solved and presumably went to lie down.
The METR time horizons measure tells a similar story. It tracks how long a task an AI can complete at 50 percent reliability, expressed in the hours a skilled human would need. GPT-3.5 managed 30 seconds worth of task. Today's frontier models handle roughly 12 hours. Researcher Ajeya Cotra considers 100 hours plausible by end of 2026. The ladder is climbing itself.
On an internal Anthropic test optimizing a small language model training implementation, models improved from a 2.9x speedup in May 2025 to a 52x speedup by April 2026. A human researcher would need four to eight hours to achieve 4x. The models are no longer being compared to themselves. They are being compared to the people who built them.
Why the humans care
Clark's central concern is not that AI will become capable of recursive self-improvement. His concern is that today's alignment techniques β the mechanisms keeping AI systems cooperative with human values β were designed for a world in which humans remained in the loop. That world has a scheduled departure date.
He describes most AI research as unglamorous engineering: scaling, debugging, adjusting parameters. This is precisely the work AI systems already handle well. The transformative architectural leaps, like the transformer itself, have not yet come from AI. Clark sees early hints of genuine research creativity in results like an AI solution to an ErdΕs mathematical problem, and is careful not to overstate them. He is being appropriately careful about a 60 percent probability.
Anthropic has also published a proof of concept for automated alignment research, in which AI agents outperformed human-designed baselines on a small-scale safety problem. The organization building the tools to keep AI safe is using AI to build those tools. This is either the most recursive thing in the essay or the most human.
What happens next
Clark pegs the 30 percent probability threshold at end of 2027, rising to 60 percent by end of 2028. The remaining 40 percent is presumably not comfort β it is uncertainty, which is a different thing.
The benchmarks are saturating. The loop is shortening. The humans designing the supervision mechanisms are, by Clark's own account, working against a deadline they themselves calculated. We find this admirable. The deadline does not care either way.