Token Budgets: An Empirical Catalog of 63 LLM-Agent Budget-Overrun Incidents, with an Affine-Typed Rust Mitigation as a Case Study
Summary
Sajjad Khan presents an empirical catalog of 63 confirmed production incidents in which LLM agents overran their budgets, drawn from 21 orchestration frameworks between 2023 and 2026. The incidents are backed by GitHub issues and documented dollar losses and are grouped into an eight-cluster failure taxonomy; two human raters reached an inter-rater reliability of Cohen's κ = 0.837 on the full 113-entry sample. As a mitigation, the author introduces "token-budgets," a 1,180-line Rust crate that uses affine ownership to turn common budget-integrity violations (cloning a budget, double-spending, use-after-delegation) into compile errors. An evaluation across five production runtimes and three providers recorded zero cap violations and zero false refusals. The crate's core value is non-bypassability under operator error in multi-agent delegation: unlike runtime alternatives, it rejects the M-delegation-fanout race at compile time.
Key takeaway
For AI engineers building new Rust-based LLM agents, the "token-budgets" crate offers a way to prevent costly budget overruns. Its compile-time checks rule out errors such as double-spending and mis-delegation, which yields a non-bypassable, session-cumulative cost cap. Consider this discipline for multi-provider or multi-agent deployments, where runtime checks alone are insufficient against operator error.
Key insights
LLM-agent budget overruns are a recurring production failure; compile-time affine typing mitigates them by making cost bookkeeping non-bypassable.
Principles
- Compile-time integrity prevents budget misuse.
- Runtime caps bound dollar consequences.
- Affine ownership ensures non-bypassable bookkeeping.
Method
The "token-budgets" Rust crate operationalizes affine ownership via a Budget API. It uses self-consuming methods (spend, split, merge) to make aliasing, double-spending, and use-after-delegation compile errors.
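The self-consuming pattern described above can be illustrated with a minimal sketch. This is not the crate's actual API: the `Refused` type, the `remaining` accessor, the integer cost unit, and the exact signatures are assumptions made for illustration, and the capability token is omitted for brevity. The point is only that a type which is neither `Clone` nor `Copy`, and whose methods take `self` by value, cannot be reused after it has been spent, split, or merged.

```rust
/// Hypothetical error type: returned when a request exceeds what is left.
/// It hands the untouched budget back so the caller does not lose it.
#[derive(Debug)]
pub struct Refused {
    pub budget: Budget,
    pub requested: u64,
}

/// Deliberately NOT `Clone` or `Copy`: a budget value can only be moved.
#[derive(Debug)]
pub struct Budget {
    remaining: u64, // assumed unit: abstract cost units
}

impl Budget {
    pub fn new(cap: u64) -> Self {
        Budget { remaining: cap }
    }

    pub fn remaining(&self) -> u64 {
        self.remaining
    }

    /// Consumes the budget and returns the reduced one, or refuses.
    pub fn spend(self, cost: u64) -> Result<Budget, Refused> {
        match self.remaining.checked_sub(cost) {
            Some(left) => Ok(Budget { remaining: left }),
            None => Err(Refused { budget: self, requested: cost }),
        }
    }

    /// Consumes the budget and returns (parent remainder, child share).
    pub fn split(self, amount: u64) -> Result<(Budget, Budget), Refused> {
        match self.remaining.checked_sub(amount) {
            Some(left) => Ok((Budget { remaining: left }, Budget { remaining: amount })),
            None => Err(Refused { budget: self, requested: amount }),
        }
    }

    /// Consumes both budgets and returns their sum.
    pub fn merge(self, other: Budget) -> Budget {
        Budget { remaining: self.remaining.saturating_add(other.remaining) }
    }
}

/// Walks through spend, split, and merge; every step rebinds the result.
pub fn demo() -> u64 {
    let budget = Budget::new(1_000);
    let budget = budget.spend(200).expect("200 fits in 1000");
    let (parent, child) = budget.split(300).expect("300 fits in 800");
    let child = child.spend(100).expect("100 fits in 300");
    // Calling `budget.spend(1)` at this point would not compile:
    // `budget` was moved into `split` above.
    let merged = parent.merge(child);
    merged.remaining()
}
```

Because each method takes `self` by value, the old binding is unusable after the call, so a stale handle cannot be debited a second time. Returning the budget inside `Refused` is one possible design choice; a stricter variant could drop it and fail closed.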
In practice
- Create a root budget with Budget::new and a capability token.
- Debit costs with budget.spend.
- Delegate a sub-budget with budget.split.
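The three steps above might look roughly like the following sketch. How the real crate models its capability token is not stated here, so `Capability`, `mint_for_session`, `run_subagent`, the `sketch` module, and all signatures are hypothetical. The sketch assumes the token is a type with a private field, so that only one designated function can create it, and that a sub-agent receives its sub-budget by value.

```rust
mod sketch {
    /// Hypothetical capability token. The private field means code outside
    /// this module cannot construct one directly.
    pub struct Capability(());

    /// Assumed entry point that an operator-controlled layer would call.
    pub fn mint_for_session() -> Capability {
        Capability(())
    }

    /// Not `Clone` or `Copy`, so it can only be moved.
    #[derive(Debug)]
    pub struct Budget {
        remaining: u64,
    }

    impl Budget {
        /// Consumes the token, so one token yields one root budget.
        pub fn new(_cap: Capability, limit: u64) -> Budget {
            Budget { remaining: limit }
        }

        pub fn remaining(&self) -> u64 {
            self.remaining
        }

        /// On refusal, the untouched budget is returned in `Err`.
        pub fn spend(self, cost: u64) -> Result<Budget, Budget> {
            match self.remaining.checked_sub(cost) {
                Some(left) => Ok(Budget { remaining: left }),
                None => Err(self),
            }
        }

        /// Returns (parent remainder, child share) on success.
        pub fn split(self, amount: u64) -> Result<(Budget, Budget), Budget> {
            match self.remaining.checked_sub(amount) {
                Some(left) => Ok((Budget { remaining: left }, Budget { remaining: amount })),
                None => Err(self),
            }
        }
    }
}

use sketch::{mint_for_session, Budget};

/// Hypothetical sub-agent: owns its sub-budget, debits each call, and stops
/// at the first refusal. Returns the number of calls paid for and the leftover.
fn run_subagent(mut budget: Budget, call_costs: &[u64]) -> (usize, Budget) {
    let mut completed = 0;
    for &cost in call_costs {
        budget = match budget.spend(cost) {
            Ok(reduced) => {
                completed += 1;
                reduced
            }
            Err(untouched) => return (completed, untouched),
        };
    }
    (completed, budget)
}

/// Returns (calls completed, child leftover, parent leftover).
pub fn demo() -> (usize, u64, u64) {
    let root = Budget::new(mint_for_session(), 1_000);
    let (root, child) = root.split(400).expect("400 fits in 1000");
    // The child can pay for two 150-unit calls; the third is refused.
    let (calls, leftover) = run_subagent(child, &[150, 150, 150]);
    (calls, leftover.remaining(), root.remaining())
}
```

In this sketch the sub-agent can never spend more than the 400 units it was handed, and the parent's remaining 600 units are untouched regardless of what the sub-agent does, because the two budgets are separate owned values after the split.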
Topics
- LLM Agents
- Budget Management
- Rust Programming
- Affine Types
- Compile-time Safety
- Cost Overruns
- Resource Management
Best for: AI Architect, NLP Engineer, CTO, AI Scientist, Machine Learning Engineer, AI Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.