What you'll pay for AI agents will be wildly variable and unpredictable

· Source: News and Advice on the World's Latest Innovations | ZDNET · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Emerging Technologies & Innovation · Depth: Intermediate, medium

Summary

A new study by the University of Michigan and collaborators, including Stanford University, Google DeepMind, Microsoft, and MIT, finds that AI agents incur far higher and less predictable token costs than simple prompt-based chats. The study, titled "How Do AI Agents Spend Your Money? Analyzing and Predicting Token Consumption in Agentic Coding Tasks," found that agents can consume up to 3,500 times more tokens for a task. Token consumption also varies widely between models, and even for the same model on identical tasks, with some runs using twice as many tokens as others. Agents consistently underestimate their own token needs, particularly for input tokens, which dominate costs because the context is fed back to the model repeatedly and billed again as cache reads. This unpredictability, together with the lack of correlation between token usage and performance, makes cost estimation difficult and complicates enterprise adoption.

Key takeaway

CTOs and VPs of Engineering evaluating AI agent deployments should recognize that current vendor pricing models do not reflect the true operational costs, which are highly variable and often excessive. Demand greater price transparency and performance guarantees from AI providers to limit budget overruns and ensure task completion; otherwise, implementations risk being unstable and costly.

Key insights

AI agents incur vastly higher and unpredictable token costs, primarily driven by input tokens and cache reads, without guaranteeing improved performance.
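To see why input tokens can dominate, consider a minimal sketch in which every agent turn re-sends the full history to the model. The function name, the turn count, and the token sizes below are illustrative assumptions, not figures from the study, and real agents may trim or summarize context.

```python
def cumulative_tokens(turns, system_tokens, step_tokens):
    """Return (total_input, total_output) tokens for one agent run.

    Assumption: each turn re-sends the whole history, i.e. the system
    prompt plus every earlier step (model output and tool result are
    lumped together into step_tokens for simplicity).
    """
    total_input = 0
    total_output = 0
    context = system_tokens  # tokens fed to the model on the first turn
    for _ in range(turns):
        total_input += context       # the full context is read again
        total_output += step_tokens  # the model emits one more step
        context += step_tokens       # that step joins the history
    return total_input, total_output


# Hypothetical sizes: 1,000-token system prompt, 500 tokens per step.
single_chat = cumulative_tokens(1, 1000, 500)
agent_run = cumulative_tokens(30, 1000, 500)
print(single_chat)  # (1000, 500)
print(agent_run)    # (247500, 15000)
```

Under these assumptions, output grows linearly with the number of turns while input grows roughly quadratically, so a 30-turn run reads about 16 times more tokens than it writes. Where a provider caches the repeated prefix, much of that input would be billed as cache reads rather than fresh input, but it still counts toward the total.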

Method

The study used the OpenHands framework to build agents, testing them on the SWE-Bench coding benchmark, which involves tasks derived from GitHub issues, to analyze token consumption.
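A team repeating this kind of measurement on its own workloads could summarize run-to-run spread with a few lines of code. The sketch below is not the study's analysis; the `spread` helper and the token counts are hypothetical and would be replaced by the usage figures reported for each run of the same task.

```python
from statistics import mean


def spread(token_counts):
    """Summarize total token usage across repeated runs of one task."""
    lo, hi = min(token_counts), max(token_counts)
    return {
        "min": lo,
        "max": hi,
        "mean": mean(token_counts),
        "max_over_min": hi / lo,  # 2.0 means the costliest run used twice the tokens
    }


# Hypothetical totals from four runs of the same model on the same task.
runs = [1_200_000, 1_550_000, 2_400_000, 1_300_000]
report = spread(runs)
print(report["max_over_min"])  # 2.0
```

Budgeting from the maximum or a high percentile rather than the mean is one way to absorb this kind of variation, at the price of over-provisioning on typical runs.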

Best for: CTO, VP of Engineering/Data, AI Architect, AI Engineer, Director of AI/ML, Consultant

Editorial summary, takeaway, and curation by AIssential. Original article published by News and Advice on the World's Latest Innovations | ZDNET.