Structured Context Engineering for File-Native Agentic Systems
Summary
A new paper by Damon McMillan, "Structured Context Engineering for File-Native Agentic Systems," systematically studies context engineering for structured data, using SQL generation as a proxy for programmatic agent operations. The research comprised 9,649 experiments across 11 models, 4 data formats (YAML, Markdown, JSON, and Token-Oriented Object Notation, or TOON), and SQL schemas ranging from 10 to 10,000 tables. The choice of model had the largest effect: frontier models such as Opus 4.5, GPT-5.2, and Gemini 2.5 Pro outperformed open-source models such as DeepSeek V3.2, Kimi K2, and Llama 4. The frontier models benefited from filesystem-based context retrieval, while the open-source models showed much less convincing results with it. The study also identified a "grep tax": because models were unfamiliar with the TOON format, they consumed 138% to 740% more tokens with TOON than with YAML at schema sizes from 500 to 10,000 tables, even though TOON files are smaller.
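The "grep tax" percentages are relative increases over the YAML baseline, so "138% more" means roughly 2.4 times the tokens, not 1.38 times. The short sketch below only illustrates that arithmetic; the function names are hypothetical and the 1,000-token baseline is an invented placeholder, not a figure from the paper.

```python
def overhead_multiplier(percent_more: float) -> float:
    """Convert 'N% more tokens than baseline' into a multiplier of the baseline."""
    return 1 + percent_more / 100


def percent_more(baseline_tokens: int, observed_tokens: int) -> float:
    """Relative increase of observed over baseline, expressed in percent."""
    return (observed_tokens - baseline_tokens) / baseline_tokens * 100


if __name__ == "__main__":
    # The two endpoints reported in the summary (500 and 10,000 tables).
    for pct in (138, 740):
        print(f"{pct}% more tokens = {overhead_multiplier(pct):.2f}x the YAML baseline")

    # Hypothetical baseline of 1,000 tokens, used only to show the inverse calculation.
    print(f"{percent_more(1000, 2380):.0f}% more")
```

Read this way, the reported range corresponds to about 2.4 to 8.4 times the YAML token count.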
Key takeaway
AI architects designing agentic systems that interact with large structured datasets should prioritize frontier models (e.g., Opus 4.5, GPT-5.2, Gemini 2.5 Pro), especially when relying on filesystem-based context retrieval. Be wary of less common data formats like TOON: model unfamiliarity can incur a significant "grep tax" in token consumption, which raises operational costs and reduces efficiency. Stick to widely recognized formats like YAML or JSON unless you can fine-tune your models.
Key insights
Frontier LLMs outperform open-source models on structured context tasks, and a model's familiarity with the data format affects how many tokens it consumes.
Principles
- Model capability dictates structured context performance.
- Familiarity with data format reduces token consumption.
Method
The study used SQL generation as a proxy for programmatic agent operations, running 9,649 experiments with 11 models across 4 formats (YAML, Markdown, JSON, and TOON) and schemas ranging from 10 to 10,000 tables.
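To make "filesystem-based context retrieval" concrete, here is a minimal sketch of the general idea: the schema lives in files, and the agent searches them grep-style to pull only the relevant tables into context before writing SQL. This is not the paper's harness. The one-file-per-table layout, the file contents, and the names `grep_schema`, `build_context`, and `demo` are all assumptions made for illustration.

```python
import re
import tempfile
from pathlib import Path


def grep_schema(schema_dir: Path, pattern: str) -> dict[str, list[str]]:
    """Return matching lines per schema file, similar to `grep -i pattern dir/*.yaml`."""
    regex = re.compile(pattern, re.IGNORECASE)
    hits: dict[str, list[str]] = {}
    for path in sorted(schema_dir.glob("*.yaml")):
        lines = [ln for ln in path.read_text().splitlines() if regex.search(ln)]
        if lines:
            hits[path.name] = lines
    return hits


def build_context(schema_dir: Path, pattern: str) -> str:
    """Concatenate only the schema files that matched, ready to place in a prompt."""
    matched = grep_schema(schema_dir, pattern)
    return "\n\n".join((schema_dir / name).read_text() for name in matched)


def demo() -> str:
    """Build a tiny hypothetical schema on disk and retrieve context for 'customer'."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "customers.yaml").write_text("table: customers\ncolumns:\n  - id\n  - email\n")
        (root / "orders.yaml").write_text("table: orders\ncolumns:\n  - id\n  - customer_id\n")
        (root / "products.yaml").write_text("table: products\ncolumns:\n  - id\n  - sku\n")
        return build_context(root, r"customer")


if __name__ == "__main__":
    print(demo())
```

In this toy setup only the `customers` and `orders` files reach the prompt. With thousands of tables, the number of search iterations and the density of grep output are what drive token cost, which is where an unfamiliar format can become expensive.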
In practice
- Prioritize frontier models for complex agentic tasks.
- Use common data formats like YAML or JSON.
- Avoid novel formats without model fine-tuning.
Topics
- Structured Context Engineering
- LLM Agentic Systems
- Large SQL Schemas
- Data Format Efficiency
- Frontier LLMs
Best for: AI Scientist, Research Scientist, AI Architect, AI Researcher, AI Engineer, Prompt Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Simon Willison's Weblog.