How to Build Agentic RAG with Hybrid Search
Summary
Retrieval-Augmented Generation (RAG) systems traditionally rely on vector similarity to find relevant document chunks for Large Language Models (LLMs). While effective for semantic matching, vector similarity struggles with specific keywords or IDs, where individual terms can be "drowned out." Hybrid search combines vector similarity with keyword search, such as BM25, to overcome this limitation and improve the retrieval of highly specific information. Implementing it involves running both retrieval methods and applying a weighting to their respective scores. The article further advocates for an "agentic" RAG system, in which an LLM acts as an agent: it decides which search queries to issue, fetches information iteratively, and adjusts the weighting between keyword and vector search, rather than relying on a fixed configuration.
Key takeaway
For AI engineers building RAG systems, hybrid search is crucial for handling queries that contain specific keywords or IDs, which vector similarity alone tends to miss. Consider also making your RAG system agentic, so that the LLM can adjust search parameters dynamically and refine results iteratively. This can improve retrieval performance beyond what a static configuration achieves.
Key insights
Hybrid search, which combines vector and keyword methods, improves RAG retrieval accuracy, especially for queries that contain exact keywords or IDs.
Principles
- Vector similarity excels at semantic matching.
- Keyword search is superior for explicit identifiers.
- Agentic LLMs can dynamically optimize RAG parameters.
Method
Implement vector retrieval and a keyword search algorithm (e.g., BM25), then apply a weighting between their similarity scores. For agentic RAG, make chunk retrieval a tool for the LLM.
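The weighting step described above can be sketched in a few lines of Python. This is a minimal illustration, not the article's implementation: the BM25 function is a plain Okapi BM25 written out by hand, the two-dimensional vectors stand in for real embeddings, and the names `hybrid_rank`, `alpha`, `DOCS`, and `DOC_VECS` are assumptions made for the example. Min-max normalization is one possible way to put the two score types on a common scale before weighting.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of every document for a whitespace-tokenized query."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n
    # Number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def min_max(values):
    """Rescale scores to [0, 1] so the two score types are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_rank(query, query_vec, docs, doc_vecs, alpha=0.5):
    """Return document indices, best first.

    alpha=1.0 uses only vector similarity; alpha=0.0 uses only BM25.
    """
    keyword = min_max(bm25_scores(query, docs))
    vector = min_max([cosine(query_vec, v) for v in doc_vecs])
    combined = [alpha * v + (1 - alpha) * k for v, k in zip(vector, keyword)]
    return sorted(range(len(docs)), key=lambda i: combined[i], reverse=True)


# Toy corpus; the vectors are made-up stand-ins for real embeddings.
DOCS = [
    "invoice INV-4821 was paid in March",
    "general guide to paying invoices",
    "holiday schedule for the office",
]
DOC_VECS = [[0.6, 0.4], [0.9, 0.1], [0.1, 0.9]]
QUERY, QUERY_VEC = "INV-4821", [1.0, 0.0]

ranking = hybrid_rank(QUERY, QUERY_VEC, DOCS, DOC_VECS, alpha=0.5)
```

In this toy setup, the pure vector ranking (`alpha=1.0`) prefers the generic invoice guide, while giving the keyword score equal weight moves the chunk containing the exact ID to the top, which is the failure mode hybrid search is meant to address.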
In practice
- Use BM25 for keyword search.
- Consider TurboPuffer for hybrid search.
- Allow LLM agents to rewrite search queries.
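Exposing retrieval as a tool, with the query and the keyword/vector weighting as arguments the model controls, might look like the sketch below. Everything here is hypothetical and for illustration only: `SEARCH_TOOL` uses a generic JSON-Schema-style layout rather than any particular vendor's format, `search_chunks` is a stand-in retriever, and `scripted_model` replaces a real LLM call with a fixed policy so the loop can run on its own.

```python
# Tool description the model would see (generic layout, not a vendor format).
SEARCH_TOOL = {
    "name": "search_chunks",
    "description": (
        "Retrieve document chunks. alpha=1.0 is pure vector search, "
        "alpha=0.0 is pure keyword search."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "alpha": {"type": "number", "minimum": 0.0, "maximum": 1.0},
        },
        "required": ["query"],
    },
}


def search_chunks(query: str, alpha: float = 0.5) -> list[str]:
    """Stand-in retriever; a real one would run hybrid search over an index."""
    chunks = [
        "Order ORD-991 shipped on 4 May.",
        "Shipping usually takes 3 to 5 days.",
    ]
    if alpha < 0.5:
        # Keyword-leaning call: return chunks containing an exact query token.
        return [c for c in chunks if any(tok in c for tok in query.split())]
    # Vector-leaning call: pretend the generic chunk is the nearest neighbour.
    return [chunks[1]]


TOOLS = {"search_chunks": search_chunks}


def scripted_model(question, history):
    """Hypothetical stand-in for an LLM: returns a tool call or a final answer."""
    if not history:
        # First attempt: search with the raw question, leaning on vectors.
        return {"tool": "search_chunks", "args": {"query": question, "alpha": 0.8}}
    last_results = history[-1]["results"]
    if len(history) < 2 and not any("ORD-991" in c for c in last_results):
        # The ID is missing from the results: rewrite the query to the bare
        # ID and shift the weighting toward keyword search.
        return {"tool": "search_chunks", "args": {"query": "ORD-991", "alpha": 0.1}}
    return {"answer": last_results[0] if last_results else "not found"}


def run_agent(question, model, max_steps=4):
    """Let the model call tools until it answers or the step budget runs out."""
    history = []
    for _ in range(max_steps):
        action = model(question, history)
        if "answer" in action:
            return action["answer"], history
        results = TOOLS[action["tool"]](**action["args"])
        history.append({"args": action["args"], "results": results})
    return "not found", history


answer, trace = run_agent("When did order ORD-991 ship?", scripted_model)
```

The step cap in `run_agent` is a design choice for the sketch: it bounds how many retrieval rounds the agent may spend before giving up.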
Topics
- Retrieval-Augmented Generation
- Hybrid Search
- Vector Similarity
- Keyword Search (BM25)
- Agentic LLMs
Best for: AI Engineer, Machine Learning Engineer, Data Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Towards Data Science.