Token Budgets: An Empirical Catalog of 63 LLM-Agent Budget-Overrun Incidents, with an Affine-Typed Rust Mitigation as a Case Study

Source: cs.SE updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Software Development & Engineering, Robotics & Autonomous Systems) · Depth: Expert, extended

Summary

Sajjad Khan's research presents an empirical catalog of 63 confirmed production incidents of LLM-agent budget overruns, drawn from 21 orchestration frameworks between 2023 and 2026. The incidents are backed by GitHub issues and documented dollar losses and are grouped into an eight-cluster failure taxonomy; inter-rater reliability between two human raters is Cohen's κ=0.837 on the full 113-entry sample. As a mitigation, the author introduces "token-budgets," a 1,180-line Rust crate that uses affine ownership to turn common budget-integrity violations (cloning a budget, double-spending, use-after-delegation) into compile errors. An evaluation across five production runtimes and three providers reported zero cap violations and zero false refusals. The crate's core value is non-bypassability under operator error in multi-agent delegation: it rejects the M-delegation-fanout race at compile time, which runtime alternatives do not.

Key takeaway

For AI engineers building new Rust-based LLM agents, the "token-budgets" crate offers a way to prevent costly budget overruns. Because budget integrity is enforced at compile time, errors such as double-spending or mis-delegation do not compile, which yields a non-bypassable session-cumulative cost cap. The discipline is most relevant to multi-provider or multi-agent deployments, where runtime checks alone do not protect against operator error.

Key insights

LLM-agent budget overruns are a recurring production failure; compile-time affine typing mitigates them by making cost integrity non-bypassable.

Method

The "token-budgets" Rust crate operationalizes affine ownership via a Budget API. It uses self-consuming methods (spend, split, merge) to make aliasing, double-spending, and use-after-delegation compile errors.
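The self-consuming pattern described above can be illustrated with a minimal sketch. This is not the crate's actual API: only the method names spend, split, and merge come from the summary, while the `Budget::new` constructor, the `remaining` accessor, the `u64` token unit, the `Result<_, Budget>` error convention, and the `delegate_example` helper are assumptions made for illustration.

```rust
/// Hypothetical affine budget. It deliberately does NOT derive `Clone` or
/// `Copy`, so a value can be used at most once after it is moved.
#[derive(Debug)]
pub struct Budget {
    remaining: u64, // tokens left (assumed unit)
}

impl Budget {
    pub fn new(tokens: u64) -> Self {
        Budget { remaining: tokens }
    }

    pub fn remaining(&self) -> u64 {
        self.remaining
    }

    /// Consumes the budget and returns the reduced one.
    /// On overrun, `Err` hands the untouched budget back to the caller.
    pub fn spend(self, cost: u64) -> Result<Budget, Budget> {
        match self.remaining.checked_sub(cost) {
            Some(left) => Ok(Budget { remaining: left }),
            None => Err(self),
        }
    }

    /// Consumes the parent and returns `(parent_rest, child)`.
    /// The two parts always sum to the original amount.
    pub fn split(self, child: u64) -> Result<(Budget, Budget), Budget> {
        match self.remaining.checked_sub(child) {
            Some(rest) => Ok((Budget { remaining: rest }, Budget { remaining: child })),
            None => Err(self),
        }
    }

    /// Consumes both budgets and returns their combined remainder.
    pub fn merge(self, other: Budget) -> Budget {
        Budget { remaining: self.remaining.saturating_add(other.remaining) }
    }
}

/// Hypothetical fan-out: a root budget delegates to two sub-agents and
/// collects whatever they did not spend.
pub fn delegate_example() -> u64 {
    let root = Budget::new(1_000);
    let (root, child_a) = root.split(300).expect("300 fits in 1000");
    let (root, child_b) = root.split(300).expect("300 fits in 700");

    // Sub-agent A spends within its cap.
    let child_a = child_a.spend(120).expect("120 fits in 300");
    // Sub-agent B attempts an overrun; the request is refused and its
    // budget comes back unchanged.
    let child_b = match child_b.spend(500) {
        Ok(b) => b,
        Err(b) => b,
    };

    root.merge(child_a).merge(child_b).remaining()
}

// Reusing a moved budget does not compile, for example:
//
//     let root = Budget::new(10);
//     let a = root.split(5);
//     let b = root.split(5); // error[E0382]: use of moved value: `root`
```

Because every method takes `self` by value and the type is not `Clone`, the parent handle that existed before a `split` is gone once the call returns, so a stale copy cannot be spent after delegation. In this sketch the only tokens missing at the end are the 120 that sub-agent A actually spent.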

Best for: AI Architect, NLP Engineer, CTO, AI Scientist, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.