Context Engineering for RAG : The Four Typed Inputs Behind Every RAG Answer

2026-06-30 · Source: Towards Data Science · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Data Science & Analytics · Depth: Intermediate, long

Summary

The article defines "context engineering" as the practice of assembling all necessary context for an LLM to solve a task, a term popularized by Tobi Lütke and Andrej Karpathy in June 2025. It reframes the existing four-brick single-document RAG architecture (document parsing, question parsing, retrieval, generation) through this lens. This architecture, implemented with runnable notebooks on GitHub at doc-intel/notebooks-vol1, ensures each brick emits typed context pieces like relational tables, ParsedQuestion objects, filtered line DataFrames, and audit logs. These pieces converge into a single LLM call, utilizing a fixed system prompt, a PromptContext aggregator (with DocContext for document-level synthesis), and a templated user content. This approach enhances auditability, optimizes cost by caching fixed prompts and compressing user content, and supports future extensions for corpus, conversation, and tool-call contexts.

Key takeaway

For AI Engineers building enterprise RAG systems, embracing "context engineering" means shifting focus from prompt wording to systematic context assembly. Implement typed inputs from parsing, question parsing, and retrieval bricks into a PromptContext aggregator. This structured approach improves auditability, reduces LLM call costs through caching and compression, and ensures architectural stability for future extensions like corpus or conversation history.

Key insights

Context engineering systematically assembles typed inputs for LLMs, enhancing RAG auditability and cost-efficiency.

Principles

Context engineering is software architecture.
Fixed system prompts are cacheable and auditable.
Retrieval filters content, not just searches.

Method

The RAG pipeline involves document parsing, question parsing, retrieval, and generation. Each brick emits typed context (e.g., relational tables, ParsedQuestion, filtered line_df, DocContext JSON) that aggregates into a PromptContext for the LLM call.

In practice

Persist brick outputs for audit trails.
Use DocContext.as_prompt_json() for compact context.
Implement PromptContext for extensible context layers.

Topics

Context Engineering
RAG Architecture
LLM Context Window
Document Intelligence
Typed Schemas
Retrieval Systems

Code references

Best for: AI Engineer, Machine Learning Engineer, AI Architect

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by Towards Data Science.