Is Grep All You Need? How Agent Harnesses Reshape Agentic Search
Summary
An empirical study investigates the interaction between retrieval strategies and agent architecture in Large Language Model (LLM) agentic search systems. The research compares "grep" (keyword-based) and vector retrieval methods across various agent harnesses, including a custom Chronos harness and provider-native CLIs like Claude Code, Codex, and Gemini CLI. Experiment 1 evaluates these retrieval methods on a 116-question sample from LongMemEval, considering both inline and file-based tool result presentation. Experiment 2 examines the performance of grep-only versus vector-only retrieval as irrelevant conversational history is progressively introduced. The findings indicate that grep generally achieves higher accuracy than vector retrieval, although overall performance is significantly influenced by the specific agent harness and tool-calling style employed.
Key takeaway
AI architects designing agentic search systems should critically evaluate their retrieval strategy: keyword-based "grep" methods achieved higher accuracy than vector retrieval in this study. The agent harness and tool-calling style also significantly influence overall performance, even with identical underlying data, so compare results across harnesses rather than relying on one. Test your chosen retrieval method's robustness as the amount of irrelevant conversational history grows.
Key insights
Grep-based retrieval often outperforms vector retrieval in LLM agentic search, but performance varies by agent harness.
Principles
- Retrieval strategy impacts agent accuracy.
- Agent harness design influences performance.
- Irrelevant context degrades search efficacy.
Method
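The study compares grep and vector retrieval on a 116-question sample from LongMemEval, using a custom harness (Chronos) and provider-native CLIs (Claude Code, Codex, and Gemini CLI). Experiment 1 tests inline versus file-based presentation of tool results. Experiment 2 compares grep-only and vector-only retrieval as irrelevant conversational history is progressively added.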
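To make the two retrieval styles concrete, here is a minimal Python sketch of how a grep tool and a vector tool differ in interface. It is an illustration under stated assumptions, not the study's implementation: the corpus, the function names `grep_search` and `vector_search`, and the bag-of-words cosine scorer (a stand-in for a learned embedding model) are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus: each string stands in for one stored conversation turn.
SESSIONS = [
    "I adopted a beagle named Biscuit last spring.",
    "The flight to Lisbon was delayed by three hours.",
    "We talked about dog training and puppy food brands.",
]

def grep_search(pattern, docs):
    """Keyword-style retrieval: indices of all docs matching a regex, case-insensitively."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [i for i, doc in enumerate(docs) if rx.search(doc)]

def _bow(text):
    """Bag-of-words term counts (assumption: a stand-in for a real embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def _cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def vector_search(query, docs, k=1):
    """Similarity-style retrieval: rank every doc by cosine score, return the top-k indices."""
    q = _bow(query)
    scored = sorted(((_cosine(q, _bow(d)), i) for i, d in enumerate(docs)), reverse=True)
    return [i for _, i in scored[:k]]

# The agent must choose an exact pattern for grep, but a free-text query for vector search.
grep_hits = grep_search(r"beagle|biscuit", SESSIONS)
vector_hits = vector_search("dog training tips", SESSIONS, k=1)
print(grep_hits, vector_hits)
```

The design difference worth noticing is that grep returns every match for a pattern the agent must formulate itself, while vector search always returns a fixed number of ranked results. Because the toy scorer here is purely lexical, this sketch says nothing about which method is more accurate.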
In practice
- Try grep first as the baseline retrieval tool for agentic search.
- Test retrieval across multiple agent harnesses.
- Monitor performance with increasing context noise.
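One way to monitor performance under growing context noise is a small sweep that pads a history with distractor turns and re-runs the retriever at each level. The sketch below is a simplified proxy under stated assumptions: the study measured end-to-end answer accuracy with real agents, whereas this measures retrieval recall only, and the data and the helper names `build_history` and `recall_under_noise` are hypothetical.

```python
import random

def grep_retrieve(keyword, history):
    """Return every turn containing the keyword (case-insensitive substring match)."""
    return [turn for turn in history if keyword.lower() in turn.lower()]

def build_history(relevant, distractors, n_noise, seed=0):
    """Mix the relevant turns with n_noise sampled distractor turns, shuffled deterministically."""
    rng = random.Random(seed)
    history = relevant + [rng.choice(distractors) for _ in range(n_noise)]
    rng.shuffle(history)
    return history

def recall_under_noise(retrieve, keyword, relevant, distractors, noise_levels):
    """For each noise level, record recall of relevant turns and how many turns came back."""
    results = {}
    for n in noise_levels:
        history = build_history(relevant, distractors, n)
        retrieved = retrieve(keyword, history)
        found = sum(1 for turn in relevant if turn in retrieved)
        results[n] = {"recall": found / len(relevant), "retrieved": len(retrieved)}
    return results

# Hypothetical data: one relevant turn, plus distractors (one shares the keyword).
relevant = ["My sister's wedding is on June 14."]
distractors = [
    "The weather was nice today.",
    "I need to buy groceries.",
    "That wedding cake recipe looked hard.",
]

report = recall_under_noise(grep_retrieve, "wedding", relevant, distractors, [0, 10, 100])
for level, stats in report.items():
    print(level, stats)
```

Swapping a different retriever in for `grep_retrieve` and running the same sweep gives a like-for-like comparison. Tracking the number of retrieved turns alongside recall shows how much irrelevant material the agent would still have to read.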
Topics
- LLM Agents
- Agentic Search
- Retrieval-Augmented Generation
- Grep Retrieval
- Vector Retrieval
Best for: AI Architect, AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.