Is Grep All You Need? How Agent Harnesses Reshape Agentic Search

2026-05-14 · Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Expert, quick

Summary

An empirical study investigates the interaction between retrieval strategies and agent architecture in Large Language Model (LLM) agentic search systems. The research compares "grep" (keyword-based) and vector retrieval methods across various agent harnesses, including a custom Chronos harness and provider-native CLIs like Claude Code, Codex, and Gemini CLI. Experiment 1 evaluates these retrieval methods on a 116-question sample from LongMemEval, considering both inline and file-based tool result presentation. Experiment 2 examines the performance of grep-only versus vector-only retrieval as irrelevant conversational history is progressively introduced. The findings indicate that grep generally achieves higher accuracy than vector retrieval, although overall performance is significantly influenced by the specific agent harness and tool-calling style employed.

Key takeaway

AI architects designing agentic search systems should critically evaluate their retrieval strategy: in this study, keyword-based "grep" retrieval was generally more accurate than vector retrieval. The agent harness and tool-calling style also matter, since they significantly influenced overall performance even with identical underlying data. Test how robust the chosen retrieval method remains as the amount of irrelevant conversational history grows.

Key insights

Grep-based retrieval often outperforms vector retrieval in LLM agentic search, but performance varies by agent harness.

Principles

Retrieval strategy impacts agent accuracy.
Agent harness design influences performance.
Irrelevant context degrades search efficacy.

Method

The study uses a 116-question LongMemEval sample to compare grep and vector retrieval. It tests inline versus file-based presentation of tool results and progressively adds irrelevant conversational history, across a custom harness (Chronos) and provider-native CLIs (Claude Code, Codex, Gemini CLI).
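To make the compared conditions concrete, here is a minimal sketch of a keyword tool, a similarity tool, and the two ways of presenting tool results. It is not the study's implementation: the corpus, the function names (`grep_search`, `vector_search`, `present`), and the bag-of-words cosine similarity standing in for a learned embedding model are all assumptions made for illustration.

```python
import math
import re
import tempfile
from collections import Counter

# Hypothetical toy corpus: each entry stands in for one conversation session.
SESSIONS = {
    "s1": "I adopted a beagle named Pixel last spring.",
    "s2": "My flight to Lisbon was delayed by three hours.",
    "s3": "Pixel needs her vaccination booster in June.",
}


def grep_search(pattern, sessions):
    """Keyword retrieval: ids of sessions whose text matches the regex."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [sid for sid, text in sessions.items() if rx.search(text)]


def _bow(text):
    """Bag-of-words counts; an assumed stand-in for an embedding vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def vector_search(query, sessions, k=2):
    """Similarity retrieval: top-k session ids ranked by cosine similarity."""
    q = _bow(query)
    ranked = sorted(
        sessions, key=lambda sid: _cosine(q, _bow(sessions[sid])), reverse=True
    )
    return ranked[:k]


def present(result_ids, sessions, mode="inline"):
    """Return results inline, or write them to a file and return a pointer."""
    body = "\n".join(f"{sid}: {sessions[sid]}" for sid in result_ids)
    if mode == "inline":
        return body
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as fh:
        fh.write(body)
        return f"results written to {fh.name}"
```

In file mode the agent sees only a path and has to issue a further read to get the content. That extra step is one plausible way a harness's tool-calling style could affect outcomes, though the summary does not state the mechanism.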

In practice

Start with grep as the baseline retrieval tool for agentic search.
Test retrieval across multiple agent harnesses.
Monitor performance with increasing context noise.
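One way to monitor the effect of context noise is a small sweep that adds distractor sessions step by step and records retrieval quality at each level. The sketch below is a simplified proxy, not the study's protocol: it scores retrieval recall and precision rather than end-to-end answer accuracy, and the names `keyword_retrieve` and `noise_sweep` and the sample data are hypothetical.

```python
import re


def keyword_retrieve(query_pattern, sessions):
    """Grep-style retriever: ids of sessions matching the regex."""
    rx = re.compile(query_pattern, re.IGNORECASE)
    return [sid for sid, text in sessions.items() if rx.search(text)]


def noise_sweep(retrieve, query, gold, distractors, steps):
    """For each noise level n, search gold sessions plus the first n distractors.

    Returns a list of (n, recall, precision) tuples, where recall is the share
    of gold sessions retrieved and precision is the share of hits that are gold.
    """
    rows = []
    distractor_items = list(distractors.items())
    for n in steps:
        haystack = dict(gold)
        haystack.update(distractor_items[:n])
        hits = retrieve(query, haystack)
        found = [sid for sid in hits if sid in gold]
        recall = len(found) / len(gold)
        precision = len(found) / len(hits) if hits else 0.0
        rows.append((n, recall, precision))
    return rows


# Hypothetical data: one relevant session and three irrelevant ones.
GOLD = {"g1": "The dentist appointment moved to Friday."}
DISTRACTORS = {
    "d1": "We talked about hiking trails near Denver.",
    "d2": "Friday is pizza night at my place.",
    "d3": "I need a new laptop charger.",
}

rows = noise_sweep(keyword_retrieve, "friday", GOLD, DISTRACTORS, steps=[0, 1, 2, 3])
for n, recall, precision in rows:
    print(f"distractors={n} recall={recall:.2f} precision={precision:.2f}")
```

Because `noise_sweep` accepts any callable with the same signature, the same loop can be rerun with a vector retriever, or wrapped around a full agent harness, to compare how each degrades.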

Topics

LLM Agents
Agentic Search
Retrieval-Augmented Generation
Grep Retrieval
Vector Retrieval

Best for: AI Architect, AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.