The Pulse: token spend breaks budgets – what next?
Summary
Token spend for AI tools has surged by approximately 10x in the last six months across companies of all sizes, prompting leadership concerns about sustainability. A survey of developers at 15 businesses reveals two main strategies for managing this increase: "let it rip and start measuring" or "curb spending." Large companies are observing costs "off the charts," with some developers spending up to $500 daily on tools like Claude Code, leading to doubled employee costs and bottlenecks in human code reviews. Mid-sized firms are implementing model routing to cheaper defaults, considering pooled spend models, or factoring AI costs into overall engineering expenses. Smaller companies are exploring options like increasing budgets while measuring ROI, optimizing token consumption, integrating more AI providers, or pivoting to local models. Discounts from vendors like Cursor are available for spending above a few million dollars, but Anthropic currently offers no discounts even for $5M+ annual spend.
Key takeaway
For CTOs and VPs of Engineering grappling with soaring AI token costs, your strategy should balance immediate productivity gains with long-term financial sustainability. Consider implementing a "let it rip and measure" approach initially to capture momentum and quantify ROI, while simultaneously exploring model routing to cheaper defaults for less demanding tasks. Be prepared to negotiate custom discounts with vendors once your spend reaches significant thresholds, and continuously monitor usage to prevent "stupid overspend" without stifling innovation.
Key insights
Rapidly escalating AI token costs are forcing companies to re-evaluate usage strategies and measure productivity impacts.
Principles
- Prioritize measuring AI impact over immediate cost-cutting.
- Default to cheaper models for routine tasks.
- High AI spend can significantly increase engineering output.
Method
Companies are either allowing high AI spend while implementing robust measurement of impact, or actively curbing costs through model selection, usage limits, and exploring local models for long-term control.
In practice
- Implement model routing to cheaper defaults.
- Negotiate custom discounts with AI vendors.
- Track AI adoption and its impact on productivity.
Topics
- AI Token Spend
- Large Language Model Costs
- Developer Productivity
- AI Cost Management
- Claude Sonnet
Best for: CTO, Entrepreneur, VP of Engineering/Data, Software Engineer, AI Engineer, Director of AI/ML
Related on AIssential
Counsel's verdict on this
AIssential's Counsel cites this article in its editorial verdict on the decision it informs:
Editorial summary, takeaway, and curation by AIssential. Original article published by The Pragmatic Engineer.