The Context Problem

2025-07-29 · Source: Intentional Arrangement · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Knowledge Engineering & Semantic Technologies · Depth: Advanced, extended

Summary

The AI industry's use of "context" as a billing unit for large language models (LLMs) has led to significant confusion and economic inefficiencies, as the term ambiguously encompasses capacity, process, and quality. Major AI providers like OpenAI, Google, Anthropic, xAI, Meta, and Oracle price LLM APIs by tokens, with costs varying drastically; for example, GPT-5.4 Pro output costs $180 per million tokens, while Gemini 2.5 Flash costs $2.50 per million tokens. Research, including Chroma's 2025 study, indicates that LLM performance degrades ("context rot") as input length increases, even with logically structured data, suggesting that raw token capacity does not equate to semantic coherence. This market dynamic creates a "credence good" scenario where buyers cannot verify the quality of context, leading to misaligned incentives and substantial, often unmanaged, AI spend for organizations.
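
To make the price gap concrete, here is a minimal arithmetic sketch using the two output prices quoted above. The 50-million-token monthly workload is a hypothetical figure chosen for illustration, and the function name is an assumption rather than part of any vendor SDK.

```python
# Output prices in USD per million tokens, as quoted in the summary above.
PRICE_PER_MILLION_USD = {
    "GPT-5.4 Pro": 180.00,
    "Gemini 2.5 Flash": 2.50,
}


def output_cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Cost of generating `tokens` output tokens at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million_usd


# Hypothetical workload: 50 million output tokens per month.
monthly_output_tokens = 50_000_000

for model, price in PRICE_PER_MILLION_USD.items():
    cost = output_cost_usd(monthly_output_tokens, price)
    print(f"{model}: ${cost:,.2f} per month")

# Price ratio between the two quoted models.
ratio = PRICE_PER_MILLION_USD["GPT-5.4 Pro"] / PRICE_PER_MILLION_USD["Gemini 2.5 Flash"]
print(f"Price ratio: {ratio:.0f}x")
```

Under these assumptions the same token volume differs in cost by the ratio of the two prices, which is why an unmanaged token count translates directly into unmanaged spend.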

Key takeaway

CTOs and AI engineers managing LLM deployments should recognize that neither raw token count nor context window size correlates directly with semantic coherence or model performance. Prioritize structured knowledge architectures, such as knowledge graphs and ontologies, that represent relationships explicitly. This approach reduces token consumption, mitigates "context rot," and improves reasoning quality, yielding more cost-effective and reliable AI systems regardless of how vendors price tokens.

Key insights

Treating "context" as a billing unit conflates capacity with coherence, which inflates costs and degrades model performance.

Principles

Capacity does not guarantee coherence.
More tokens can degrade performance.
Structured knowledge improves signal-to-token ratio.

Method

Context engineering involves curating the smallest set of high-signal tokens to maximize desired outcomes, using techniques like compaction, structured note-taking, and sub-agent architectures to manage context windows.
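
Compaction, one of the techniques named above, can be sketched as follows. This is a simplified illustration, not the article's implementation: the `Message` type, the whitespace-based token estimate, and the truncating `summarize` stand-in are all assumptions. A real system would use the model's tokenizer and an LLM-generated summary.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str
    text: str


def estimate_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate tokens.
    return len(text.split())


def summarize(messages: list[Message]) -> Message:
    # Hypothetical stand-in for a model-written summary:
    # keep only the first eight words of each older message.
    parts = [" ".join(m.text.split()[:8]) for m in messages]
    return Message("system", "Summary of earlier turns: " + " | ".join(parts))


def compact(history: list[Message], budget: int, keep_recent: int = 2) -> list[Message]:
    """Replace older turns with one summary message when the history exceeds `budget`."""
    total = sum(estimate_tokens(m.text) for m in history)
    split = len(history) - keep_recent
    if total <= budget or split <= 0:
        return history  # already within budget, or nothing old enough to compact
    older, recent = history[:split], history[split:]
    return [summarize(older)] + recent


history = [
    Message("user", "Please review the quarterly billing report and list every anomaly you find in it"),
    Message("assistant", "I found three anomalies in the report, all of them related to duplicated invoice line items"),
    Message("user", "Which customer accounts are affected?"),
    Message("assistant", "Two accounts are affected."),
]

for message in compact(history, budget=20):
    print(f"{message.role}: {message.text}")
```

The most recent turns stay verbatim because they usually carry the highest signal for the next response, while older turns are reduced to a short digest.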

In practice

Implement neurosymbolic structures like knowledge graphs.
Utilize context compression techniques.
Manage conversation history actively.
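
As a rough illustration of the first item, the sketch below stores relationships as explicit subject-predicate-object triples and passes the model only the facts reachable from the entity in question. The entity names, the triple set, and the helper functions are hypothetical; a production system would more likely use a graph database or an RDF store.

```python
from collections import defaultdict

# Hypothetical knowledge graph expressed as (subject, predicate, object) triples.
TRIPLES = [
    ("InvoiceService", "depends_on", "PaymentGateway"),
    ("PaymentGateway", "owned_by", "PaymentsTeam"),
    ("InvoiceService", "writes_to", "LedgerDB"),
    ("SearchService", "owned_by", "DiscoveryTeam"),
]


def build_index(triples):
    """Group triples by subject so outgoing edges can be looked up directly."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((subject, predicate, obj))
    return index


def relevant_facts(index, start, max_hops=2):
    """Collect triples reachable from `start` within `max_hops` outgoing edges."""
    seen, frontier, facts = {start}, [start], []
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for subject, predicate, obj in index.get(node, []):
                facts.append((subject, predicate, obj))
                if obj not in seen:
                    seen.add(obj)
                    next_frontier.append(obj)
        frontier = next_frontier
    return facts


index = build_index(TRIPLES)
facts = relevant_facts(index, "InvoiceService")

# Render only the relevant facts as compact context lines for a prompt.
context = "\n".join(f"{s} {p} {o}" for s, p, o in facts)
print(context)
```

Facts about unrelated entities never enter the prompt, which is the sense in which explicit structure improves the signal-to-token ratio.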

Topics

AI Token Economics
LLM Context Windows
Neurosymbolic AI
Context Engineering
Knowledge Graphs

Best for: CTO, VP of Engineering/Data, AI Engineer, AI Architect, AI Product Manager, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by Intentional Arrangement.