Context Engineering for RAG : The Four Typed Inputs Behind Every RAG Answer
Summary
The article defines "context engineering" as the practice of assembling all necessary context for an LLM to solve a task, a term popularized by Tobi Lütke and Andrej Karpathy in June 2025. It reframes the existing four-brick single-document RAG architecture (document parsing, question parsing, retrieval, generation) through this lens. This architecture, implemented with runnable notebooks on GitHub at doc-intel/notebooks-vol1, ensures each brick emits typed context pieces like relational tables, ParsedQuestion objects, filtered line DataFrames, and audit logs. These pieces converge into a single LLM call, utilizing a fixed system prompt, a PromptContext aggregator (with DocContext for document-level synthesis), and a templated user content. This approach enhances auditability, optimizes cost by caching fixed prompts and compressing user content, and supports future extensions for corpus, conversation, and tool-call contexts.
Key takeaway
For AI Engineers building enterprise RAG systems, embracing "context engineering" means shifting focus from prompt wording to systematic context assembly. Implement typed inputs from parsing, question parsing, and retrieval bricks into a PromptContext aggregator. This structured approach improves auditability, reduces LLM call costs through caching and compression, and ensures architectural stability for future extensions like corpus or conversation history.
Key insights
Context engineering systematically assembles typed inputs for LLMs, enhancing RAG auditability and cost-efficiency.
Principles
- Context engineering is software architecture.
- Fixed system prompts are cacheable and auditable.
- Retrieval filters content, not just searches.
Method
The RAG pipeline involves document parsing, question parsing, retrieval, and generation. Each brick emits typed context (e.g., relational tables, ParsedQuestion, filtered line_df, DocContext JSON) that aggregates into a PromptContext for the LLM call.
In practice
- Persist brick outputs for audit trails.
- Use DocContext.as_prompt_json() for compact context.
- Implement PromptContext for extensible context layers.
Topics
- Context Engineering
- RAG Architecture
- LLM Context Window
- Document Intelligence
- Typed Schemas
- Retrieval Systems
Code references
Best for: AI Engineer, Machine Learning Engineer, AI Architect
Related on AIssential
Editorial summary, takeaway, and curation by AIssential. Original article published by Towards Data Science.