The Context Problem
Summary
The AI industry uses "context" as a billing unit for large language models (LLMs), but the term ambiguously covers capacity, process, and quality, which has caused significant confusion and economic inefficiency. Major AI providers such as OpenAI, Google, Anthropic, xAI, Meta, and Oracle price LLM APIs per token, and prices vary widely: GPT-5.4 Pro output costs $180 per million tokens, while Gemini 2.5 Flash costs $2.50 per million tokens. Research, including Chroma's 2025 study, indicates that LLM performance degrades as input length increases ("context rot"), even when the input is logically structured, so raw token capacity does not equate to semantic coherence. The result is a "credence good" market in which buyers cannot verify the quality of the context they pay for, which misaligns incentives and leaves organizations with substantial, often unmanaged, AI spend.
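To make the price gap concrete, the arithmetic can be sketched in a few lines of Python. The two per-million-token prices are the ones quoted in the summary; the monthly token volume is a hypothetical workload chosen only for illustration, and applying both prices to the same volume is an assumption made for comparison.

```python
# Prices quoted in the summary, in USD per million tokens.
PRICE_PER_MILLION_USD = {
    "GPT-5.4 Pro": 180.00,
    "Gemini 2.5 Flash": 2.50,
}


def token_cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost of `tokens` tokens at a flat per-million-token price."""
    return tokens / 1_000_000 * price_per_million


# Hypothetical workload (an assumption, not a figure from the article).
MONTHLY_TOKENS = 50_000_000

monthly_costs = {
    model: token_cost_usd(MONTHLY_TOKENS, price)
    for model, price in PRICE_PER_MILLION_USD.items()
}
price_ratio = (
    PRICE_PER_MILLION_USD["GPT-5.4 Pro"] / PRICE_PER_MILLION_USD["Gemini 2.5 Flash"]
)

for model, cost in monthly_costs.items():
    print(f"{model}: ${cost:,.2f} per month")
print(f"Price ratio: {price_ratio:.0f}x")
```

At this volume the same token count costs $9,000 at the higher price and $125 at the lower one, a 72x difference that says nothing about how coherent the purchased context is.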
Key takeaway
CTOs and AI engineers managing LLM deployments should recognize that raw token count and context window size do not directly correlate with semantic coherence or model performance. Prioritize investment in structured knowledge architectures, such as knowledge graphs and ontologies, that represent relationships explicitly. This approach reduces token consumption, mitigates "context rot," and improves reasoning quality, yielding more cost-effective and reliable AI systems regardless of how vendors price tokens.
Key insights
Treating "context" as a billing unit conflates capacity with coherence, leading to inflated costs and degraded model performance.
Principles
- Capacity does not guarantee coherence.
- More tokens can degrade performance.
- Structured knowledge improves signal-to-token ratio.
Method
Context engineering involves curating the smallest set of high-signal tokens to maximize desired outcomes, using techniques like compaction, structured note-taking, and sub-agent architectures to manage context windows.
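As a rough illustration of compaction, the sketch below replaces older conversation turns with a single summary once a token budget is exceeded. Everything here is an assumption made for the example: `Message`, `estimate_tokens`, `summarize`, and `compact` are hypothetical names, the word-count estimator is a crude stand-in for a real tokenizer, and the first-sentence summarizer is a placeholder for what would more likely be a model call.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str
    text: str


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def summarize(messages: list[Message]) -> Message:
    """Placeholder summarizer: keeps only the first sentence of each message."""
    firsts = [m.text.split(".")[0].strip() for m in messages]
    return Message("system", "Summary of earlier turns: " + "; ".join(firsts) + ".")


def compact(history: list[Message], budget: int, keep_recent: int = 2) -> list[Message]:
    """Replace all but the last `keep_recent` messages with one summary
    when the history exceeds `budget` estimated tokens.

    The result is not guaranteed to fit the budget; a fuller version
    would re-check and compact again if needed.
    """
    total = sum(estimate_tokens(m.text) for m in history)
    if total <= budget or len(history) <= keep_recent:
        return history
    older, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(older)] + recent


history = [
    Message("user", "We are migrating the billing service to Postgres. "
                    "The cutover is planned for March and the old MySQL "
                    "replica stays read-only until then."),
    Message("assistant", "Understood. I will assume Postgres 16 and keep the "
                         "MySQL replica as a fallback during the migration window."),
    Message("user", "Which tables should move first?"),
    Message("assistant", "Start with the invoices table."),
]

compacted = compact(history, budget=30)
for message in compacted:
    print(f"[{message.role}] {message.text}")
```

The two most recent turns are kept verbatim because they usually carry the active question, while older turns are reduced to their gist. How much to keep and how to summarize are design choices that depend on the application.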
In practice
- Implement neurosymbolic structures like knowledge graphs.
- Utilize context compression techniques.
- Manage conversation history actively.
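The first item in the list above can be sketched as a small in-memory triple store: facts are held as explicit subject-relation-object triples, and only the triples near the entity in question are serialized into the prompt. This is a minimal illustration under stated assumptions, not a production knowledge graph; the `KnowledgeGraph` class, its methods, and the example facts are all hypothetical.

```python
from collections import defaultdict

Triple = tuple[str, str, str]  # (subject, relation, object)


class KnowledgeGraph:
    """Tiny in-memory triple store indexed by subject."""

    def __init__(self) -> None:
        self._by_subject: dict[str, list[Triple]] = defaultdict(list)

    def add(self, subject: str, relation: str, obj: str) -> None:
        self._by_subject[subject].append((subject, relation, obj))

    def neighborhood(self, entity: str, hops: int = 1) -> list[Triple]:
        """Collect triples reachable from `entity` within `hops` outgoing edges."""
        seen = {entity}
        frontier = [entity]
        result: list[Triple] = []
        for _ in range(hops):
            next_frontier: list[str] = []
            for node in frontier:
                for triple in self._by_subject.get(node, []):
                    result.append(triple)
                    target = triple[2]
                    if target not in seen:
                        seen.add(target)
                        next_frontier.append(target)
            frontier = next_frontier
        return result


def to_context(triples: list[Triple]) -> str:
    """Serialize triples as one compact line each for use in a prompt."""
    return "\n".join(f"{s} {r} {o}" for s, r, o in triples)


kg = KnowledgeGraph()
kg.add("billing-service", "depends_on", "postgres")
kg.add("billing-service", "owned_by", "payments-team")
kg.add("postgres", "hosted_in", "eu-west-1")
kg.add("search-service", "depends_on", "elasticsearch")

context = to_context(kg.neighborhood("billing-service", hops=2))
print(context)
```

A question about `billing-service` pulls in three short lines and leaves the unrelated `search-service` fact out of the prompt entirely, which is the signal-to-token improvement the principles describe.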
Topics
- AI Token Economics
- LLM Context Windows
- Neurosymbolic AI
- Context Engineering
- Knowledge Graphs
Best for: CTO, VP of Engineering/Data, AI Engineer, AI Architect, AI Product Manager, Director of AI/ML
Editorial summary, takeaway, and curation by AIssential. Original article published by Intentional Arrangement.