Retrieval Is Filtering, Not Search: A Mental Model for Enterprise RAG
Summary
This article, part of the "Enterprise Document Intelligence" series, introduces a mental model for Retrieval-Augmented Generation (RAG) systems: retrieval should be treated as filtering structured tables rather than as free-text search. Once a parsing brick has produced `line_df` (dense text content) and `toc_df` (sparse table of contents), retrieval becomes a filtering problem on these two DataFrames. The core concept separates the "anchor" (the small, precise unit where a signal is detected, such as a line or a title) from the "context" (the larger chunk passed to the LLM, such as a paragraph or an entire section, sized to be sufficient for answering). Using "Attention Is All You Need" (Vaswani et al. 2017) as its running example, the article shows how this approach improves on naive RAG baselines by enabling precise filtering, table joins, and targeted LLM calls, mirroring how human experts navigate documents.
Key takeaway
If you are an MLOps engineer building enterprise RAG systems, treat documents as structured tables, not flat text; effective retrieval hinges on it. Design your pipeline to explicitly separate the small, precise "anchor" where a match is found from the larger "context" provided to the LLM. This approach, which relies on the `line_df` and `toc_df` tables produced by parsing, allows more accurate filtering and context sizing, which the article argues improves answer quality across diverse question types compared with simple vector search. For complex tasks such as section boundary detection, prefer existing LLM capabilities over custom R&D.
Key insights
RAG retrieval means filtering structured document tables rather than running free-text search, with anchor detection kept separate from context expansion.
Principles
- Codify expert document workflows.
- Anchor small, expand context large.
- Use LLMs for complex boundary detection.
Method
Retrieval involves two phases: first, finding precise "anchors" (lines, titles) using keyword/embedding detectors on `line_df` and `toc_df`; second, expanding these anchors into larger "contexts" (paragraphs, sections, N lines) for LLM generation.
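The two phases can be sketched with pandas as follows. This is a minimal illustration, not the article's implementation: the column names `line_id` and `text`, the helper functions, and the toy rows are all assumptions made for the example (only `line_df` and `section_id` are named in the summary), and a plain keyword match stands in for whichever keyword or embedding detector is actually used.

```python
import pandas as pd

# Hypothetical schema and toy rows (illustrative lines, not quotations).
line_df = pd.DataFrame({
    "line_id": [0, 1, 2, 3, 4, 5],
    "section_id": [1, 1, 1, 2, 2, 2],
    "text": [
        "3.2 Attention",
        "An attention function maps a query and key-value pairs to an output.",
        "The output is a weighted sum of the values.",
        "3.3 Position-wise Feed-Forward Networks",
        "Each layer contains a fully connected feed-forward network.",
        "It is applied to each position separately and identically.",
    ],
})

def find_anchors(df, keyword):
    """Phase 1: return the line_ids of lines whose text contains the keyword."""
    mask = df["text"].str.contains(keyword, case=False, regex=False)
    return df.loc[mask, "line_id"].tolist()

def expand_window(df, anchor_id, n=1):
    """Phase 2, option A: the anchor line plus n lines on each side."""
    return df[df["line_id"].between(anchor_id - n, anchor_id + n)]

def expand_section(df, anchor_id):
    """Phase 2, option B: every line that shares the anchor's section_id."""
    section = df.loc[df["line_id"] == anchor_id, "section_id"].iloc[0]
    return df[df["section_id"] == section]

anchors = find_anchors(line_df, "weighted sum")
window_context = expand_window(line_df, anchors[0], n=1)
section_context = expand_section(line_df, anchors[0])
```

The anchor here is a single line, while the two expansion helpers return different context sizes for the same anchor. Note that the fixed window crosses into the next section, whereas the section-based expansion stops at the section boundary.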
In practice
- Filter `line_df` for the text, `toc_df` for the document map.
- Join `line_df` and `toc_df` via `section_id`.
- Use LLMs for section-end detection.
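The first two items above can be sketched as a filter on `toc_df` followed by a join to `line_df` on `section_id`. This is a hedged example under assumed schemas: the `title`, `line_id`, and `text` columns and the toy rows are hypothetical, and a simple substring match stands in for the real title detector. LLM-based section-end detection is not shown, since it depends on the model API in use.

```python
import pandas as pd

# Hypothetical table of contents: one row per section (the "map").
toc_df = pd.DataFrame({
    "section_id": [1, 2, 3],
    "title": ["Introduction", "Model Architecture", "Training"],
})

# Hypothetical line table: one row per line of text (the "territory").
line_df = pd.DataFrame({
    "line_id": [0, 1, 2, 3, 4],
    "section_id": [1, 2, 2, 3, 3],
    "text": [
        "Recurrent models process tokens sequentially.",
        "The model uses an encoder-decoder structure.",
        "Both stacks use self-attention layers.",
        "Training used a large translation dataset.",
        "Sentences were batched by approximate length.",
    ],
})

# Anchor: filter the sparse toc_df for a matching title.
title_hits = toc_df[
    toc_df["title"].str.contains("architecture", case=False, regex=False)
]

# Context: join back to line_df on section_id to pull the section's lines.
context_df = line_df.merge(title_hits, on="section_id", how="inner")

# Text that would be passed to the LLM for this anchor.
context_text = "\n".join(context_df["text"])
```

The match happens on a short title in the small table, and the join then retrieves the full section from the large one, which is the anchor-context separation expressed as a table operation.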
Topics
- Enterprise RAG
- Document Intelligence
- Information Retrieval
- DataFrames
- LLM Context Management
- Table of Contents
- Anchor-Context Separation
Best for: AI Engineer, Machine Learning Engineer, MLOps Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Towards Data Science.