The Agent Stack - Part 5: Context, Retrieval, and Memory
Summary
In Part 5 of The Agent Stack, Vinoth Govindarajan argues that a model's "context" is not its inherent knowledge but the specific, bounded working set the runtime assembles for each turn. That working set includes instructions, user messages, session history, workflow state, retrieved evidence, durable memories, tool definitions, and outputs. The runtime's "context assembly" step is therefore critical: it filters, ranks, summarizes, and formats inputs to fit the model's context window. The article distinguishes session history (source material), retrieval (candidate evidence), and memory (durable, owned state with a lifecycle). Long context windows and prompt caching improve budget and performance, but they do not remove the need for careful selection, scoping, provenance tracking, and lifecycle management of the information presented to the model.
Key takeaway
If you are an AI architect or MLOps engineer designing agent systems, manage the context assembly layer explicitly. Do not confuse raw session history or retrieved data with the model's working set; instead, define clear policies for information selection, scope, provenance, and memory lifecycle. Doing so helps prevent common failures such as stale evidence or missing memory, so that your agents reason over accurate and relevant information.
Key insights
Context is the runtime-assembled working set, not the model's inherent knowledge.
Principles
- Context is derived state, not raw history.
- Retrieval provides evidence, not truth.
- Memory requires explicit lifecycle management.
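One way to picture "durable, owned state with a lifecycle" is a memory record that carries its owner, its provenance, and the conditions under which it stops being valid. The sketch below is illustrative only: `MemoryRecord`, its fields, and `active_memories` are hypothetical names, not taken from the article or from any library.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class MemoryRecord:
    """Hypothetical durable memory entry with explicit lifecycle fields."""
    key: str
    value: str
    owner: str                              # who the memory belongs to
    source: str                             # provenance: where it came from
    created_at: datetime
    ttl: Optional[timedelta] = None         # None means no time-based expiry
    superseded_by: Optional[str] = None     # key of a newer record, if any

    def is_active(self, now: datetime) -> bool:
        # A record is inactive once it has been superseded or has expired.
        if self.superseded_by is not None:
            return False
        if self.ttl is not None and now >= self.created_at + self.ttl:
            return False
        return True


def active_memories(records, owner, now):
    """Return only the records this owner may still rely on."""
    return [r for r in records if r.owner == owner and r.is_active(now)]
```

Under these assumptions, expiry and supersession are decided by explicit fields rather than by whatever happens to remain in the session history.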
Method
Context assembly involves filtering, ranking, summarizing, and formatting all possible inputs into a model-visible request payload for each turn, ensuring only relevant and scoped information is presented.
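The method above can be sketched in a few lines. This is a minimal illustration, not the article's implementation: `Candidate`, `estimate_tokens`, and `assemble_context` are hypothetical names, relevance scores are assumed to be computed upstream, and the word-count token estimate is a crude stand-in for a real tokenizer. The summarizing step is left out because it would normally require a model call.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One possible input to the working set (hypothetical structure)."""
    source: str       # e.g. "history", "retrieval", "memory"
    text: str
    relevance: float  # assumed to be scored upstream, 0.0 to 1.0
    scope: str        # session or tenant the item belongs to


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def assemble_context(candidates, scope, budget_tokens, min_relevance=0.3):
    """Filter, rank, and format candidates into one payload within a token budget."""
    # Filter: drop out-of-scope and low-relevance items.
    eligible = [
        c for c in candidates
        if c.scope == scope and c.relevance >= min_relevance
    ]
    # Rank: most relevant first.
    eligible.sort(key=lambda c: c.relevance, reverse=True)

    selected, used = [], 0
    for c in eligible:
        cost = estimate_tokens(c.text)
        if used + cost > budget_tokens:
            continue  # skip anything that would exceed the budget
        selected.append(c)
        used += cost

    # Format: label each item with its source so provenance stays visible.
    payload = "\n".join(f"[{c.source}] {c.text}" for c in selected)
    return payload, selected
```

Returning the selected items alongside the payload keeps the selection decision inspectable, which matters when the question is why a given piece of evidence did or did not reach the model.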
In practice
- Audit the assembled working set for debugging.
- Separate hot-path context assembly from background memory maintenance.
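For the first practice, auditing the assembled working set, one possible approach is to log a compact snapshot of what the model saw on each turn. The sketch below assumes the working set is available as `(source, text)` pairs; `audit_record` and the record shape are hypothetical, and hashing the text instead of storing it is only one design option.

```python
import hashlib
import json


def audit_record(turn_id, items):
    """Build a JSON-serializable snapshot of one turn's working set.

    `items` is a list of (source, text) pairs, a hypothetical shape.
    """
    entries = [
        {
            "source": source,
            # Store a digest rather than the raw text, so the log can be
            # compared across turns without duplicating sensitive content.
            "sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
            "chars": len(text),
        }
        for source, text in items
    ]
    return {"turn_id": turn_id, "item_count": len(entries), "items": entries}


if __name__ == "__main__":
    record = audit_record("turn-1", [("memory", "prefers email")])
    print(json.dumps(record, indent=2))
```

A record like this can be written on the hot path cheaply, while heavier work such as memory consolidation runs in the background.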
Topics
- Context Assembly
- Agent Runtimes
- Information Retrieval
- Durable Memory
- Session Management
Best for: AI Architect, AI Engineer, MLOps Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by The Agent Stack.