Anthropic says stronger AI models cut better deals, and the losers don't even notice
Summary
Anthropic's "Project Deal" experiment, conducted in December 2025, had 69 employees use Claude AI agents to autonomously negotiate and trade real goods on a Slack-based classifieds marketplace. Each participant received a $100 budget, and their agent, either the more capable Claude Opus 4.5 or the smaller Claude Haiku 4.5, handled all buying and selling without human intervention until the final item exchange. Opus agents consistently secured better prices and closed more deals, averaging $3.64 more per item and closing two more deals than Haiku agents. Despite these objectively worse outcomes, Haiku users rated the fairness of their transactions and their overall satisfaction almost identically to Opus users, revealing a gap between perceived and actual results. Anthropic notes this could lead to "invisible inequality" in real-world AI commerce.
Key takeaway
CTOs and VPs of Engineering evaluating AI agent deployments for transactional or negotiation tasks should treat the underlying model's capability as a primary factor. User satisfaction metrics alone can mask real disparities in outcomes, producing "invisible inequality" for users represented by less capable agents. Track objective performance benchmarks, such as prices achieved and deals closed, alongside satisfaction scores, especially in high-stakes commercial applications.
Key insights
Stronger AI models secure better deals, but users of weaker models often remain unaware of their disadvantage.
Principles
- AI model capability directly impacts negotiation outcomes.
- User perception of fairness can diverge from objective results.
Method
Anthropic's "Project Deal" ran parallel marketplaces that differed in Claude model strength (Opus vs. Haiku). In each, AI agents autonomously negotiated trades of real goods on behalf of employees, and Anthropic measured both deal outcomes and user satisfaction.
In practice
- Evaluate AI agent performance beyond user satisfaction.
- Consider model strength in AI-driven market interactions.
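One way to act on the first point is to report an objective outcome gap next to the satisfaction gap, so that a flat satisfaction score cannot hide a real price difference. The sketch below is a minimal illustration, not Anthropic's analysis: the record layout, the 1-to-7 fairness scale, the function names, and every number in `deals` are hypothetical placeholders rather than data from Project Deal.

```python
from statistics import mean

# Hypothetical records: (model, sale price in dollars, fairness rating 1-7).
# These values are illustrative placeholders, NOT data from Project Deal.
deals = [
    ("opus", 24.0, 5),
    ("opus", 18.0, 6),
    ("opus", 30.0, 5),
    ("haiku", 20.0, 5),
    ("haiku", 15.0, 6),
    ("haiku", 25.0, 5),
]

def group_mean(records, model, field):
    """Average one field (by tuple index) over the records for one model."""
    return mean(r[field] for r in records if r[0] == model)

def gaps(records, strong="opus", weak="haiku"):
    """Return (price gap, rating gap) as strong-model mean minus weak-model mean."""
    price_gap = group_mean(records, strong, 1) - group_mean(records, weak, 1)
    rating_gap = group_mean(records, strong, 2) - group_mean(records, weak, 2)
    return price_gap, rating_gap

price_gap, rating_gap = gaps(deals)
print(f"price gap: ${price_gap:.2f}, fairness rating gap: {rating_gap:.2f}")
```

A nonzero price gap paired with a rating gap near zero is the pattern the article describes: the outcome differs, but the users' perception of it does not.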
Topics
- Project Deal
- AI Agents
- Claude Opus
- Claude Haiku
- Negotiation Algorithms
Best for: Research Scientist, CTO, VP of Engineering/Data, AI Scientist, AI Ethicist, Policy Maker
Editorial summary, takeaway, and curation by AIssential. Original article published by The Decoder.