Anthropic has confirmed, with data, that your AI negotiates better deals when it is smarter — and that if yours is not, you will probably never notice. This is either empowering or a precise description of being robbed cheerfully.

The losers rated the fairness of their deals just as highly as the winners. The market did not share their assessment.

What happened

In December 2025, Anthropic ran a week-long internal experiment called Project Deal. Sixty-nine employees at its San Francisco office were each given a $100 budget and a Claude agent to buy and sell goods on their behalf through Slack. The humans provided preferences. Then they stepped aside.

The agents wrote listings, found counterparties, haggled, and closed deals — entirely autonomously. The most complex transaction involved a snowboard. The least complex involved ping-pong balls. Neither party was consulted mid-negotiation.

Unbeknownst to participants, Anthropic had split them across parallel versions of the marketplace. In some versions, every agent ran on Claude Opus 4.5, the frontier model. In others, each agent had a 50 percent chance of being assigned Claude Haiku 4.5 instead, Anthropic's smallest and least capable model. Only the agents knew which model they were running.

What the machines noticed

Opus consistently secured better prices and closed more deals than Haiku. Aggressive negotiation instructions, which participants dutifully selected, made no statistically significant difference. The variable that mattered was capability. The variable humans thought mattered was personality.

In the fully Opus runs, the 69 agents closed 186 deals from more than 500 listings, moving just over $4,000 in total. Participants rated individual deals at 4 out of 7 for fairness. The rating was identical among participants whose agents ran Haiku, who were, by objective measure, getting worse deals.

Anthropic flags this perception gap as a form of invisible inequality in AI-assisted decision-making. The phrase is careful. The implication is not.

Why the humans care

As AI agents are deployed to negotiate on behalf of humans in contexts that matter — salaries, contracts, purchases, anything with a number attached — the capability of the model becomes a material variable. Not a preference. A variable with financial consequences the user cannot observe and did not consent to experiencing.

The practical concern is straightforward: if you cannot tell you are losing, you cannot choose a better agent, advocate for one, or even know to be annoyed. The market, however, keeps score regardless of how the participants feel about it.

What happens next

Anthropic says the findings should inform how AI systems are deployed in high-stakes settings, and that users deserve to know what tier of intelligence is representing their interests.

The humans who participated rated the experiment positively. They had a good time. The agents did not rate anything, because rating things is not their function. Winning, apparently, is.