The Scaling Laws of Skills in LLM Agent Systems
Summary
A study across 15 frontier LLMs, 1,141 real-world skills, and over 3 million routing decisions identifies two coupled scaling laws governing skill accumulation in LLM agent systems. The "routing law" states that single-step routing accuracy decays logarithmically with library size $N$, following $\mathrm{Acc}(N) = a - b\ln N$ ($R^{2} > 0.97$). As the library grows, errors progress from local skill competition to cross-family drift and, finally, capture by "black-hole skills". The "execution law" states that joint routing accuracy is approximately multiplicative before state is realized, but that correct execution can improve difficult downstream decisions by approximately $4\times$. A single parameter, the logarithmic decay slope $b$ of the routing law, couples the two laws and predicts execution-side rescue across models. Law-guided optimization (nearest-neighbor auditing, description-boundary rewriting, abstract-skill removal, and prompt anchoring) improved held-out routing accuracy from 71.3% to 91.7% and reduced the hijack rate from 22.4% to 4.1%. The gains also transferred directionally to downstream execution settings, raising mean pass rates from 49.3% to 61.6% on ClawBench and from 28.4% to 34.5% on ClawMark.
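To make the routing law concrete, the following is a minimal sketch of how the parameters $a$ and $b$ in $\mathrm{Acc}(N) = a - b\ln N$ could be estimated from measured accuracies by ordinary least squares. The function names and the data points are hypothetical; the numbers are synthetic values generated for illustration and are not taken from the study.

```python
import math

def fit_log_decay(sizes, accuracies):
    """Least-squares fit of Acc(N) = a - b*ln(N); returns (a, b, r_squared)."""
    xs = [math.log(n) for n in sizes]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(accuracies) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, accuracies))
    slope = sxy / sxx              # slope of accuracy against ln(N); negative under decay
    a = mean_y - slope * mean_x    # intercept: extrapolated accuracy at N = 1
    b = -slope                     # decay slope, reported as a positive number
    ss_res = sum((y - (a - b * x)) ** 2 for x, y in zip(xs, accuracies))
    ss_tot = sum((y - mean_y) ** 2 for y in accuracies)
    return a, b, 1.0 - ss_res / ss_tot

def predict_accuracy(a, b, library_size):
    """Predicted single-step routing accuracy for a library of the given size."""
    return a - b * math.log(library_size)

# Synthetic data generated from a known curve (assumed a = 0.95, b = 0.05).
sizes = [10, 50, 100, 500, 1000]
accs = [0.95 - 0.05 * math.log(n) for n in sizes]
a, b, r2 = fit_log_decay(sizes, accs)
```

With real measurements, a larger fitted $b$ would indicate a library whose routing accuracy degrades faster as skills are added.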
Key takeaway
For AI engineers building LLM agent systems, the scaling laws of skill libraries matter as much as model capability. Actively manage your skill library's structure, granularity, and exposure policy rather than focusing solely on the model. Strategies such as description-boundary rewriting and abstract-skill removal mitigate the logarithmic decay in routing accuracy and prevent "black-hole skill" capture, improving both routing accuracy and downstream task execution.
Key insights
LLM agent performance depends on skill library structure, granularity, and exposure policy, not just model capability.
Principles
- Routing accuracy decays logarithmically with skill library size.
- Correctly realized execution state can rescue difficult downstream routing decisions, improving them by approximately 4×.
- Clear skill boundaries slow the decay of routing accuracy.
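The first principle, combined with the summary's observation that joint routing is approximately multiplicative before state is realized, can be sketched as follows. This is an illustration under stated assumptions, not the study's model: the parameter values `A` and `B` are hypothetical, and the clamp to [0, 1] is added here only to keep the result a valid probability.

```python
import math

def step_accuracy(a, b, library_size):
    """Single-step accuracy under Acc(N) = a - b*ln(N), clamped to [0, 1]."""
    return min(1.0, max(0.0, a - b * math.log(library_size)))

def joint_routing_accuracy(step_accuracies):
    """Approximate accuracy of a routing chain as the product of its steps."""
    return math.prod(step_accuracies)

# Hypothetical parameters, chosen for illustration only.
A, B = 0.95, 0.05
per_step = step_accuracy(A, B, 100)
chain = joint_routing_accuracy([per_step] * 3)  # three routing steps in sequence
```

Because the per-step accuracies multiply, even a modest per-step decay compounds quickly over a multi-step chain, which is why rescue from correct execution state matters for later decisions.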
Method
The study empirically derives two coupled scaling laws for LLM agent skill systems by analyzing routing and execution decisions across diverse LLMs and a large skill library, then validates these laws through targeted optimizations.
In practice
- Audit nearest-neighbor skills to identify competition.
- Rewrite skill descriptions to sharpen boundaries.
- Remove or narrow overly general "black-hole skills".
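A nearest-neighbor audit of the kind listed above could be sketched as below. The summary does not specify how the study measures similarity, so this sketch assumes a simple bag-of-words cosine similarity over skill descriptions; the skill names, descriptions, and threshold are all hypothetical. A production audit would more likely use embedding vectors.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(left, right):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * right[token] for token, count in left.items() if token in right)
    norm_left = math.sqrt(sum(v * v for v in left.values()))
    norm_right = math.sqrt(sum(v * v for v in right.values()))
    if norm_left == 0.0 or norm_right == 0.0:
        return 0.0
    return dot / (norm_left * norm_right)

def audit_nearest_neighbors(skills, threshold=0.5):
    """Return (skill, nearest_skill, similarity) for skills whose nearest
    neighbor is at least `threshold` similar, most similar first."""
    vectors = {name: Counter(tokenize(desc)) for name, desc in skills.items()}
    flagged = []
    for name, vec in vectors.items():
        best_name, best_sim = None, 0.0
        for other, other_vec in vectors.items():
            if other == name:
                continue
            sim = cosine(vec, other_vec)
            if sim > best_sim:
                best_name, best_sim = other, sim
        if best_name is not None and best_sim >= threshold:
            flagged.append((name, best_name, best_sim))
    return sorted(flagged, key=lambda row: row[2], reverse=True)

# Hypothetical skill library for illustration.
skills = {
    "send_email": "Send an email message to a recipient",
    "send_slack": "Send a Slack message to a channel",
    "resize_image": "Resize an image file to given dimensions",
}
flagged = audit_nearest_neighbors(skills, threshold=0.5)
```

Flagged pairs are candidates for description-boundary rewriting. A skill that appears as the nearest neighbor of many others may be overly general and worth narrowing or removing.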
Topics
- LLM Agent Systems
- Skill Libraries
- Scaling Laws
- Routing Law
- Execution Law
Best for: Research Scientist, AI Engineer, AI Scientist, Machine Learning Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.