RAG is dead, right?? — Kuba Rogut, Turbopuffer

· Source: AI Engineer · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Software Development & Engineering · Depth: Intermediate, long

Summary

Kuba Rogut of Turbopuffer pushes back on the "RAG is dead" narrative. His argument is that simple, vector-search-only RAG is giving way to hybrid, tool-rich retrieval, which is becoming essential for serious agentic search. RAG, he points out, covers many retrieval methods beyond vector search, including full-text search and regex. Turbopuffer defines agentic search as agents iteratively using a set of tools to find and reason over context. Cursor, an early Turbopuffer customer, illustrates the approach: it indexes codebases using Merkle trees and semantic search, and reports a 12.5% to 13.5% average increase in answer accuracy, rising to as much as 24% for its Composer model. Unlike Claude Code's per-session discovery, this upfront indexing acts as "cached compute," saving tokens and time. The shift is toward iterative, multi-step agentic retrieval, echoing Jeff Dean's principle: "you don't need a trillion at once, you need the right million."
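To make the Merkle tree idea concrete, here is a minimal sketch of how hashing a file tree lets an indexer skip unchanged directories and re-index only what changed. It is an illustration under stated assumptions, not Cursor's or Turbopuffer's implementation: the `Tree` type, `build_tree`, and `changed_files` are hypothetical names, files are held in an in-memory dict, and deleted files are not reported.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Tree:
    # Hypothetical container: node path -> hash, and directory -> child paths.
    hashes: dict[str, str] = field(default_factory=dict)
    children: dict[str, set[str]] = field(default_factory=dict)


def _sha(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_tree(files: dict[str, str]) -> Tree:
    """Hash every file, then hash each directory from its children's hashes.

    The root directory is the empty string "".
    """
    tree = Tree()
    tree.children[""] = set()
    for path, content in files.items():
        tree.hashes[path] = _sha(content)
        parts = path.split("/")
        for i in range(len(parts)):
            parent = "/".join(parts[:i])
            child = "/".join(parts[: i + 1])
            tree.children.setdefault(parent, set()).add(child)

    def hash_dir(node: str) -> str:
        if node not in tree.hashes:
            payload = "".join(
                f"{c}:{hash_dir(c)};" for c in sorted(tree.children[node])
            )
            tree.hashes[node] = _sha(payload)
        return tree.hashes[node]

    hash_dir("")
    return tree


def changed_files(old: Tree, new: Tree, node: str = "") -> list[str]:
    """Return files in `new` that are added or modified relative to `old`.

    Subtrees whose hashes match are skipped without visiting their files.
    Deleted files are not reported in this sketch.
    """
    if old.hashes.get(node) == new.hashes.get(node):
        return []
    if node not in new.children:  # a file, not a directory
        return [node]
    out: list[str] = []
    for child in sorted(new.children[node]):
        out.extend(changed_files(old, new, child))
    return out
```

Only the files returned by `changed_files` would need to be re-embedded, which is one way to read the "compute once, reuse across sessions" point in the summary.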

Key takeaway

If you are an AI engineer building sophisticated agentic systems, simple one-shot RAG is no longer sufficient for current demands. Prioritize hybrid retrieval strategies that combine vector and full-text search with iterative agent reasoning. Consider upfront indexing, like Cursor's Merkle tree approach, as a form of "cached compute" that can significantly improve answer accuracy and user retention, rather than relying solely on per-session discovery.
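One common way to combine a vector ranking with a full-text ranking is reciprocal rank fusion (RRF). The source does not say which fusion method Turbopuffer or Cursor use, so treat this as an assumed stand-in: the function name, the document IDs, and the constant `k=60` (a conventional default) are illustrative choices.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by document ID for determinism.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from two retrievers for the same query.
vector_hits = ["a", "b", "c"]     # e.g. nearest neighbours by embedding
full_text_hits = ["b", "d", "a"]  # e.g. BM25 keyword matches
fused = reciprocal_rank_fusion([vector_hits, full_text_hits])
```

Documents that rank well in both lists ("b" and "a" here) rise to the top, while documents found by only one retriever are kept but ranked lower.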

Key insights

Hybrid tool-rich retrieval and iterative agentic search are replacing simple RAG for advanced context understanding and performance gains.

Principles

Method

Agents progressively and iteratively find and reason over context using a suite of tools like vector search, full-text search (BM25), grepping, globbing, and regex, fetching only what's needed.
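The iterative loop described above can be sketched as follows. This is a toy under stated assumptions: the corpus is an in-memory dict, only glob and grep tools are included, and `scripted_policy` is a hypothetical stand-in for the model that would normally choose the next tool call. None of these names come from Turbopuffer or Cursor.

```python
import fnmatch
import re
from typing import Callable, Optional

Corpus = dict[str, str]                   # path -> file contents
Action = tuple[str, dict]                 # (tool name, keyword arguments)
Step = tuple[str, list]                   # (tool name, tool result)


def glob_tool(corpus: Corpus, pattern: str) -> list[str]:
    """Return paths matching a shell-style pattern."""
    return sorted(p for p in corpus if fnmatch.fnmatchcase(p, pattern))


def grep_tool(corpus: Corpus, pattern: str, paths: Optional[list[str]] = None) -> list[tuple[str, int, str]]:
    """Return (path, line number, line) for lines matching a regex."""
    rx = re.compile(pattern)
    hits = []
    for path in sorted(paths if paths is not None else corpus):
        for number, line in enumerate(corpus[path].splitlines(), start=1):
            if rx.search(line):
                hits.append((path, number, line.strip()))
    return hits


TOOLS: dict[str, Callable] = {"glob": glob_tool, "grep": grep_tool}


def agent_loop(corpus: Corpus, policy: Callable[[list[Step]], Optional[Action]], max_steps: int = 5) -> list[Step]:
    """Repeatedly ask the policy for a tool call until it stops or steps run out."""
    context: list[Step] = []
    for _ in range(max_steps):
        action = policy(context)
        if action is None:  # the policy decides it has enough context
            break
        tool, args = action
        context.append((tool, TOOLS[tool](corpus, **args)))
    return context


def scripted_policy(context: list[Step]) -> Optional[Action]:
    """Hypothetical stand-in for an LLM: narrow by glob, then grep the matches."""
    if len(context) == 0:
        return ("glob", {"pattern": "src/*.py"})
    if len(context) == 1:
        return ("grep", {"pattern": r"def retry", "paths": context[0][1]})
    return None
```

Each step narrows the search using the previous step's result, so the agent fetches only the lines it needs rather than loading the whole corpus into context.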

In practice

Topics

Best for: NLP Engineer, AI Engineer, Machine Learning Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by AI Engineer.