The Agent Stack - Part 5: Context, Retrieval, and Memory

2026-02-17 · Source: The Agent Stack · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Robotics & Autonomous Systems · Depth: Advanced, long

Summary

In "The Agent Stack - Part 5: Context, Retrieval, and Memory", Vinoth Govindarajan argues that a model's "context" is not its inherent knowledge but the specific, bounded working set the runtime assembles for each turn. That working set includes instructions, user messages, session history, workflow state, retrieved evidence, durable memories, tool definitions, and outputs. The runtime's "context assembly" process is therefore critical: it filters, ranks, summarizes, and formats inputs to fit the model's context window. The article distinguishes among session history (source material), retrieval (candidate evidence), and memory (durable, owned state with a lifecycle). Long context windows and prompt caching optimize budget and performance, but they do not eliminate the need for careful selection, scope, provenance, and lifecycle management of the information presented to the model.

Key takeaway

If you are an AI architect or MLOps engineer designing agent systems, manage the context assembly layer explicitly. Do not confuse raw session history or retrieved data with the model's working set; instead, implement clear policies for information selection, scope, provenance, and memory lifecycle. Doing so helps prevent common failures such as stale evidence or missing memory, so that your agents reason over accurate and relevant information.

Key insights

Context is the runtime-assembled working set, not the model's inherent knowledge.

Principles

Context is derived state, not raw history.
Retrieval provides evidence, not truth.
Memory requires explicit lifecycle management.
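
As a rough illustration of the third principle, and not code from the article, the Python sketch below treats a memory as an owned record with provenance and an expiry. The names MemoryRecord, is_live, live_memories, and every field are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class MemoryRecord:
    """A durable memory with an owner, provenance, and a lifecycle (all hypothetical)."""
    owner: str                           # who this memory belongs to
    text: str                            # the remembered statement
    source: str                          # provenance: where the statement came from
    created_at: datetime
    ttl: timedelta                       # how long the memory is considered valid
    superseded_by: Optional[str] = None  # set when a newer memory replaces this one

    def is_live(self, now: datetime) -> bool:
        # A memory is usable only if it has not been replaced and has not expired.
        if self.superseded_by is not None:
            return False
        return now < self.created_at + self.ttl


def live_memories(records, owner, now):
    """Return only the memories that belong to `owner` and are still live at `now`."""
    return [r for r in records if r.owner == owner and r.is_live(now)]


t0 = datetime(2026, 1, 1, tzinfo=timezone.utc)
records = [
    MemoryRecord("user-a", "Prefers metric units.", "session-12", t0, timedelta(days=90)),
    MemoryRecord("user-a", "Is travelling this week.", "session-13", t0, timedelta(days=7)),
    MemoryRecord("user-b", "Prefers imperial units.", "session-14", t0, timedelta(days=90)),
]
now = t0 + timedelta(days=30)
usable = live_memories(records, "user-a", now)
```

In this example only the unit preference survives for user-a after 30 days: the travel note has expired, and the third record belongs to a different owner.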

Method

Context assembly involves filtering, ranking, summarizing, and formatting all possible inputs into a model-visible request payload for each turn, ensuring only relevant and scoped information is presented.
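
The method above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the article's implementation: Candidate, estimate_tokens, assemble_context, the scope labels, and the relevance scores are all hypothetical, the word-count tokenizer is a crude stand-in, and the summarizing step is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One possible input to the model; all fields are illustrative."""
    kind: str         # e.g. "instruction", "history", "evidence", "memory"
    text: str
    scope: str        # hypothetical scope label, such as a session id
    relevance: float  # hypothetical score from an upstream ranker


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: roughly one token per word.
    return len(text.split())


def assemble_context(candidates, scope, token_budget):
    """Filter by scope, rank by relevance, keep what fits the budget, then format."""
    in_scope = [c for c in candidates if c.scope == scope]
    ranked = sorted(in_scope, key=lambda c: c.relevance, reverse=True)

    selected, used = [], 0
    for c in ranked:
        cost = estimate_tokens(c.text)
        if used + cost > token_budget:
            continue  # skip items that would overflow the budget
        selected.append(c)
        used += cost

    # Format the chosen items into the model-visible payload.
    payload = "\n".join(f"[{c.kind}] {c.text}" for c in selected)
    return payload, selected


candidates = [
    Candidate("instruction", "Answer using cited evidence only.", "session-1", 1.0),
    Candidate("evidence", "Refund window is 30 days.", "session-1", 0.8),
    Candidate("evidence", "Old policy said 14 days.", "session-2", 0.9),
    Candidate("history", "User asked about refunds earlier today in a long message.", "session-1", 0.2),
]
payload, selected = assemble_context(candidates, scope="session-1", token_budget=10)
```

Here the out-of-scope evidence never reaches the payload despite its high score, and the long history item is dropped because it would exceed the budget.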

In practice

Audit the assembled working set for debugging.
Separate hot-path context assembly from background memory maintenance.
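
For the first of these practices, one possible shape for an audit entry is shown below. It is a sketch only: the audit_record function, its field names, and the choice to store a hash and length instead of the full text are assumptions for illustration, not something the article prescribes.

```python
import hashlib
import json


def audit_record(turn_id, items):
    """Build a JSON-serializable record of what the model was shown on one turn.

    `items` is a list of (kind, source, text) tuples describing the working set.
    """
    entries = [
        {
            "kind": kind,
            "source": source,  # provenance of the item
            "sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
            "chars": len(text),
        }
        for kind, source, text in items
    ]
    return {"turn_id": turn_id, "entries": entries}


record = audit_record(
    "turn-7",
    [
        ("instruction", "system-prompt-v3", "Answer using cited evidence only."),
        ("evidence", "kb/refunds", "Refund window is 30 days."),
    ],
)
line = json.dumps(record, sort_keys=True)  # one log line per turn
```

Logging a hash per item keeps the audit trail compact while still letting you confirm later exactly which version of a document or memory the model saw.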

Topics

Context Assembly
Agent Runtimes
Information Retrieval
Durable Memory
Session Management

Best for: AI Architect, AI Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by The Agent Stack.