What you'll pay for AI agents will be wildly variable and unpredictable
Summary
A new study by the University of Michigan and collaborators at Stanford University, Google DeepMind, Microsoft, and MIT finds that AI agents incur far higher and far less predictable token costs than simple prompt-based chats. The study, titled "How Do AI Agents Spend Your Money? Analyzing and Predicting Token Consumption in Agentic Coding Tasks," found that agents can consume up to 3,500 times more tokens for a task. Token consumption also varies widely between models, and even for the same model on identical tasks, with some runs using twice as many tokens as others. Agents consistently underestimate their own token needs, particularly for input tokens, which dominate costs because the context is fed back to the model repeatedly and read from cache. This unpredictability, together with the lack of correlation between token usage and performance, makes cost estimation difficult and complicates enterprise adoption.
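To make the point about input tokens concrete, here is a minimal sketch of the arithmetic, assuming an agent loop that re-sends its full history on every turn. The function name and all numbers are hypothetical illustrations, not figures or code from the study.

```python
def cumulative_tokens(
    turns: int,
    system_tokens: int,
    tokens_added_per_turn: int,
    output_per_turn: int,
) -> tuple[int, int]:
    """Return (total_input, total_output) for a loop that re-feeds the whole context each turn.

    Assumption: every turn appends the tool results (tokens_added_per_turn) and the
    model's reply (output_per_turn) to the context, and nothing is ever trimmed.
    """
    total_input = 0
    total_output = 0
    context = system_tokens
    for _ in range(turns):
        total_input += context            # the entire context is read again this turn
        total_output += output_per_turn
        context += tokens_added_per_turn + output_per_turn
    return total_input, total_output


if __name__ == "__main__":
    # Illustrative numbers only: a single chat turn versus a 30-turn agent run.
    print(cumulative_tokens(1, 2000, 1500, 300))
    print(cumulative_tokens(30, 2000, 1500, 300))
```

Under these assumed numbers, the 30-turn run reads 843,000 input tokens against 9,000 output tokens, because input grows roughly with the square of the number of turns while output grows linearly.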
Key takeaway
CTOs and VPs of Engineering evaluating AI agent deployments should recognize that current vendor pricing models do not reflect the true operational cost of agents, which is highly variable and often excessive. Demand greater price transparency and performance guarantees from AI providers. Without them, you risk significant budget overruns, incomplete tasks, and unstable implementations.
Key insights
AI agents incur far higher and less predictable token costs than chat, driven primarily by input tokens and cache reads, and the extra spend does not guarantee better performance.
Principles
- Agentic tasks are uniquely expensive.
- Scaling token usage does not guarantee higher performance.
- Models systematically underestimate their own token needs.
Method
The researchers built agents with the OpenHands framework and measured their token consumption on SWE-Bench, a coding benchmark whose tasks are derived from GitHub issues.
In practice
- Set hard limits on agentic computer use.
- Control prompt size and context window length.
- Minimize tool calls by agents to reduce input tokens.
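The three recommendations above can be sketched as a small guard around an agent loop. This is a minimal illustration under stated assumptions, not the study's method or any vendor's API: `TokenBudget`, `BudgetExceeded`, `trim_history`, and `estimate_tokens` are hypothetical names, and the four-characters-per-token estimate is a rough stand-in for a real tokenizer.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when an agent run crosses a hard limit."""


@dataclass
class TokenBudget:
    max_total_tokens: int   # hard cap on input + output tokens for one task
    max_tool_calls: int     # hard cap on tool invocations for one task
    used_tokens: int = 0
    tool_calls: int = 0

    def charge(self, input_tokens: int, output_tokens: int) -> None:
        """Record one model call and stop the run if the token cap is crossed."""
        self.used_tokens += input_tokens + output_tokens
        if self.used_tokens > self.max_total_tokens:
            raise BudgetExceeded(f"token cap exceeded: {self.used_tokens}")

    def record_tool_call(self) -> None:
        """Record one tool call and stop the run if the tool-call cap is crossed."""
        self.tool_calls += 1
        if self.tool_calls > self.max_tool_calls:
            raise BudgetExceeded(f"tool-call cap exceeded: {self.tool_calls}")


def estimate_tokens(text: str) -> int:
    # Assumption: roughly four characters per token; swap in a real tokenizer.
    return max(1, len(text) // 4)


def trim_history(messages: list[str], max_tokens: int) -> list[str]:
    """Keep only the most recent messages that fit within max_tokens."""
    kept: list[str] = []
    total = 0
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if total + cost > max_tokens:
            break
        kept.append(message)
        total += cost
    return list(reversed(kept))
```

In use, the loop would call `trim_history` before each model request, `charge` after each response with the token counts the provider reports, and `record_tool_call` before each tool invocation, so that a runaway task fails fast instead of silently accumulating cost.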
Topics
- AI Agent Costs
- Token Consumption
- Cost Variability
- Token Estimation
- AI Pricing Transparency
Best for: CTO, VP of Engineering/Data, AI Architect, AI Engineer, Director of AI/ML, Consultant
Editorial summary, takeaway, and curation by AIssential. Original article published by ZDNET.