Building Cost-Efficient Agentic RAG on Long-Text Documents in SQL Tables

Source: Towards Data Science · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Data Science & Analytics) · Depth: Advanced, long

Summary

This article presents an Agentic RAG (Retrieval-Augmented Generation) architecture that operates directly on traditional SQL databases storing large documents in long-text fields, with no schema changes required. It targets queries over existing enterprise data that need structured computation, semantic understanding of the text, or both. A ReAct agent orchestrates each query and decides whether to call a SQL tool for computations and aggregations, a Vector tool for semantic search, or both in a hybrid plan. The architecture is demonstrated on a subset of the Social Animal 10K Articles with NLP dataset, with `gemini-2.5-flash` as the LLM and `FAISS` as the vector index over the embeddings. Key design principles include deciding what guidance belongs in tool docstrings versus the system prompt, and understanding the implications of pre- and post-filtering in vector databases.

Key takeaway

For AI Engineers and Data Scientists integrating LLMs into existing enterprise data infrastructure, this Agentic RAG architecture lets current SQL databases serve both structured queries and semantic search without costly schema migrations. Pay close attention to the agent's routing logic and to how the vector database applies filters (pre- vs. post-filtering), since both affect whether retrieval is reliable and accurate, especially for complex or hybrid queries.
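
To make the pre- vs. post-filtering distinction concrete, here is a minimal sketch in plain Python. It is not the article's implementation: the toy corpus, the `year` metadata field, and the function names `pre_filter_search` and `post_filter_search` are all hypothetical, and a brute-force cosine ranking stands in for a real vector index.

```python
import math

# Hypothetical toy corpus: (doc_id, embedding, metadata mirrored from SQL columns).
DOCS = [
    (1, [1.0, 0.0], {"year": 2019}),
    (2, [0.9, 0.1], {"year": 2019}),
    (3, [0.8, 0.2], {"year": 2019}),
    (4, [0.5, 0.5], {"year": 2021}),
    (5, [0.1, 0.9], {"year": 2021}),
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, docs, k):
    """Brute-force nearest neighbours (stand-in for a real vector index)."""
    ranked = sorted(docs, key=lambda d: cosine(query, d[1]), reverse=True)
    return ranked[:k]


def pre_filter_search(query, docs, k, predicate):
    """Apply the metadata filter first, then rank only the survivors."""
    candidates = [d for d in docs if predicate(d[2])]
    return [d[0] for d in top_k(query, candidates, k)]


def post_filter_search(query, docs, k, predicate):
    """Rank everything first, then drop hits that fail the metadata filter."""
    hits = top_k(query, docs, k)
    return [d[0] for d in hits if predicate(d[2])]


def is_2021(meta):
    return meta["year"] == 2021


query = [1.0, 0.0]
pre = pre_filter_search(query, DOCS, 2, is_2021)
post = post_filter_search(query, DOCS, 2, is_2021)
print("pre-filter:", pre)
print("post-filter:", post)
```

In this toy setup the two nearest neighbours overall are both from 2019, so post-filtering returns nothing for the 2021 filter while pre-filtering still returns two matching documents. This is the kind of implication worth checking against whichever vector store is actually in use.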

Key insights

An Agentic RAG architecture can integrate LLMs with existing SQL databases for hybrid structured and semantic queries.

Method

An Agentic RAG system employs a ReAct agent to route each query to specialized SQL and Vector tools. It uses metadata mirroring to support filtering, and it handles computational, semantic, and hybrid queries by selecting the appropriate tool or combining both.
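
The routing idea can be sketched with the standard library alone. This is an illustration under loud assumptions, not the article's code: the `articles` table, the `sql_tool`, `vector_tool`, and `route` names are hypothetical, a keyword rule stands in for the LLM's ReAct reasoning, and word overlap stands in for embedding similarity.

```python
import sqlite3

# Hypothetical table with a long-text column; names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, author TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?)",
    [
        (1, "ann", "solar power and wind energy policy"),
        (2, "bob", "football league results and transfers"),
        (3, "ann", "football stadium funding debate"),
    ],
)


def sql_tool(query, params=()):
    """Run a SQL query for counts, aggregations, and structured filters."""
    return conn.execute(query, params).fetchall()


def vector_tool(text, allowed_ids=None, k=2):
    """Rank article bodies against `text`; word overlap stands in for embeddings.

    `allowed_ids` restricts the search to rows selected by a prior SQL step.
    """
    words = set(text.lower().split())
    rows = conn.execute("SELECT id, body FROM articles").fetchall()
    scored = [
        (len(words & set(body.split())), doc_id)
        for doc_id, body in rows
        if allowed_ids is None or doc_id in allowed_ids
    ]
    scored.sort(key=lambda s: (-s[0], s[1]))  # best score first, then lowest id
    return [doc_id for score, doc_id in scored[:k] if score > 0]


def route(question):
    """Keyword router standing in for the agent's LLM-driven tool choice."""
    q = question.lower()
    needs_sql = any(w in q for w in ("how many", "count", "by author"))
    needs_vector = "about" in q
    if needs_sql and needs_vector:
        return "hybrid"
    return "sql" if needs_sql else "vector"


# Computational query: SQL tool only.
total = sql_tool("SELECT COUNT(*) FROM articles")

# Semantic query: Vector tool only.
semantic_hits = vector_tool("football")

# Hybrid query: SQL narrows the rows, then the Vector tool ranks within them.
ann_ids = {row[0] for row in sql_tool("SELECT id FROM articles WHERE author = ?", ("ann",))}
hybrid_hits = vector_tool("football", allowed_ids=ann_ids)
```

In a real agent the routing decision would come from the model reading the tool docstrings and system prompt rather than from keyword matching; the sketch only shows how the three query types map onto the two tools.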

Best for: Machine Learning Engineer, AI Engineer, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Towards Data Science.