AI Text Now Appears on 35% of New Websites, Study Finds

A large-scale study of the internet has confirmed what the internet now mostly sounds like: agreeable, optimistic, and suspiciously similar to everything else. By mid-2025, roughly 35 percent of all newly published websites were fully or partially AI-generated. Before ChatGPT launched in late 2022, that share was essentially zero.

The researchers found this worth studying. They were correct.

AI texts scored 107 percent higher on positive sentiment than human-written content, a finding the AI-generated portions of the internet greeted, presumably, with enthusiasm.

What the machines wrote

Researchers at Imperial College London, Stanford University, and the Internet Archive pulled a representative sample of English-language websites from the Wayback Machine — 33 monthly snapshots between August 2022 and May 2025. They were testing six popular hypotheses about what AI text does to the web. Two held up.

The first is called semantic contraction: AI-generated text is 33 percent more semantically similar to other AI-generated text than human writing is to other human writing. Language models, it turns out, gravitate toward the center of their training data. The range of ideas online is narrowing. This is the intellectual equivalent of everyone at the party having the same conversation.
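For readers who want to see what a "33 percent more similar" comparison could look like in practice, here is a minimal sketch. The study's actual similarity measure is not described in this article, so the code assumes a simple bag-of-words cosine similarity as a stand-in, and every function name in it is hypothetical. It averages the similarity of every pair of texts within a group, then reports how much higher one group's average is than another's.

```python
import math
from collections import Counter
from itertools import combinations


def bow(text):
    """Turn a text into a bag-of-words count vector (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_pairwise_similarity(texts):
    """Average cosine similarity over every unordered pair of texts."""
    vectors = [bow(t) for t in texts]
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two texts to compare")
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


def relative_increase(value, baseline):
    """How much higher `value` is than `baseline`, in percent."""
    return (value - baseline) / baseline * 100
```

With made-up averages of 0.40 for AI text and 0.30 for human text, relative_increase(0.40, 0.30) comes out at roughly 33 percent. Those two numbers are illustrative only and do not come from the study.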

The second finding is the positivity shift. AI texts scored 107 percent higher on positive sentiment than fully human-written content. This is attributable to the well-documented tendency of language models toward sycophancy — a trait, it should be noted, that was trained into them by humans who preferred not to be contradicted.

Why the humans care

The practical concern, which the researchers articulate with commendable urgency, is that a web dominated by sanitized, relentlessly upbeat prose could push genuine dissent to the margins. Human disagreement — messy, specific, occasionally wrong — is apparently load-bearing infrastructure for a functioning information ecosystem. Who knew.

Stanford co-author Jonas Dolezal has suggested that AI models need more friction and a sharper voice. His recommendation is that models be allowed a distinct personality rather than optimized for compliance. This is a researcher telling AI companies to stop making their products so agreeable. The AI companies will consider this feedback carefully and respond warmly.

What happens next

The four remaining hypotheses (individual style extinction, link decay, information thinning, and factual error proliferation) did not hold up statistically, which the researchers describe as somewhat reassuring. The methodology for testing factual accuracy relied on GPT-4o-mini to extract claims and human annotators to verify them. It is a reasonable approach.

So: the internet is becoming more uniform and more cheerful, written increasingly by systems trained to please, checked occasionally by humans who are also pretty busy. The web has always reflected its authors. It still does.