Is Grep All You Need? How Agent Harnesses Reshape Agentic Search

2026-05-14 · Source: Takara TLDR - Daily AI Papers · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Advanced, medium

Summary

"Is Grep All You Need? How Agent Harnesses Reshape Agentic Search" is an empirical study of how Large Language Model (LLM) agents perform under different retrieval methods and tool-calling styles. It reports two experiments. Experiment 1 compares `grep` and vector retrieval on 116 questions from LongMemEval, run through a custom agent harness (Chronos) and three provider-native CLIs (Claude Code, Codex, Gemini CLI), with both inline and file-based tool results assessed. Experiment 2 compares `grep`-only and vector-only retrieval while progressively adding unrelated conversation history, to test robustness against distracting material. In Experiment 1, `grep` generally achieves higher accuracy than vector retrieval across Chronos and the provider CLIs, but overall scores depend heavily on the specific harness and tool-calling style.
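To make the two retrieval styles concrete, here is a minimal Python sketch. It is not the study's implementation: `HISTORY`, `grep_search`, and `vector_search` are hypothetical names, and the "vector" side uses a toy bag-of-words cosine similarity as a stand-in for the learned embeddings a real vector store would use.

```python
import math
import re
from collections import Counter

# Hypothetical conversation history (illustrative only, not LongMemEval data).
HISTORY = [
    "user: I adopted a beagle named Pixel last spring.",
    "assistant: Congratulations on the new dog!",
    "user: Can you suggest a pasta recipe for tonight?",
    "user: My flight to Lisbon leaves on Friday morning.",
]


def grep_search(pattern: str, lines: list[str]) -> list[tuple[int, str]]:
    """Grep-style retrieval: every line matching a case-insensitive regex."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [(i, line) for i, line in enumerate(lines, start=1) if rx.search(line)]


def _bag(text: str) -> Counter:
    """Toy 'embedding': lowercase token counts (assumption, not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def vector_search(query: str, lines: list[str], k: int = 2) -> list[tuple[int, str]]:
    """Similarity retrieval: the k lines ranked highest by cosine similarity."""
    q = _bag(query)
    ranked = sorted(
        enumerate(lines, start=1),
        key=lambda pair: _cosine(q, _bag(pair[1])),
        reverse=True,
    )
    return ranked[:k]
```

With this data, `grep_search("beagle|dog", HISTORY)` returns lines 1 and 2, and `vector_search("Lisbon flight", HISTORY, k=1)` returns line 4. The structural difference is that grep returns every exact match and nothing else, while similarity search always returns the top `k` lines, whether or not they are relevant.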

Key takeaway

If you design LLM agentic workflows as an AI Architect or Research Scientist, consider starting with `grep`-based retrieval rather than vector retrieval, especially when accuracy is paramount. The choice of agent harness and the way tool outputs are presented significantly affect performance, even with identical underlying data. Test across different harnesses and tool output formats to optimize your agent's effectiveness and its resilience against irrelevant context.

Key insights

`grep` often outperforms vector retrieval in LLM agentic search, but harness and tool-calling style are critical.

Principles

Tool output presentation impacts agent performance.
Irrelevant context degrades search accuracy.
Harness choice strongly influences agent scores.
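The first principle concerns how a tool's output reaches the model. The summary distinguishes inline and file-based tool results but does not specify their exact formats, so the following is only a sketch under assumptions: `inline_result`, `file_result`, the truncation limit, and the file name `search_results.txt` are all hypothetical.

```python
from pathlib import Path


def inline_result(matches: list[str], max_chars: int = 2000) -> str:
    """Inline style: matches go directly into the tool message, truncated if long."""
    body = "\n".join(matches)
    if len(body) > max_chars:
        body = body[:max_chars] + "\n[truncated]"
    return body


def file_result(matches: list[str], out_dir: Path) -> str:
    """File-based style: write matches to disk and return only a pointer and a count."""
    path = out_dir / "search_results.txt"  # hypothetical file name
    path.write_text("\n".join(matches), encoding="utf-8")
    return f"{len(matches)} matches written to {path}"
```

In the inline style the agent sees the evidence immediately but pays for it in context length. In the file-based style the context stays short, but the agent must take a further step to read the file. Either could plausibly change agent behavior even when the retrieved data is identical.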

Method

The study used two experiments: one comparing `grep` and vector retrieval across multiple harnesses and tool-calling styles, and another assessing retrieval robustness against increasing irrelevant conversational noise.
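The second experiment's setup, progressively adding unrelated history, can be sketched as a noise sweep. This is an illustrative reconstruction, not the authors' protocol: `build_haystack`, `hit_at_k`, `grep_retrieve`, the noise levels, and the sample data are all assumptions.

```python
import random
import re


def build_haystack(relevant: list[str], distractors: list[str],
                   n_noise: int, seed: int = 0) -> list[str]:
    """Mix the relevant turns with the first n_noise distractors, then shuffle."""
    rng = random.Random(seed)  # fixed seed keeps the sweep reproducible
    haystack = relevant + distractors[:n_noise]
    rng.shuffle(haystack)
    return haystack


def hit_at_k(retrieve, query: str, haystack: list[str], gold: str, k: int = 3) -> bool:
    """True if the gold turn appears among the first k retrieved results."""
    return gold in retrieve(query, haystack)[:k]


def grep_retrieve(query: str, lines: list[str]) -> list[str]:
    """Grep-style retriever: all lines matching a case-insensitive regex."""
    rx = re.compile(query, re.IGNORECASE)
    return [line for line in lines if rx.search(line)]


# Hypothetical data: one relevant turn plus synthetic unrelated turns.
relevant = ["user: my dentist appointment is on March 3."]
distractors = [f"user: unrelated chat turn number {i}" for i in range(50)]

for n_noise in (0, 10, 50):
    haystack = build_haystack(relevant, distractors, n_noise)
    found = hit_at_k(grep_retrieve, "dentist", haystack, relevant[0])
    print(n_noise, len(haystack), found)
```

Passing a vector retriever with the same `(query, lines)` signature to `hit_at_k` gives a like-for-like comparison at each noise level. A real evaluation would score the agent's final answer rather than only whether the evidence was retrieved.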

In practice

Prioritize `grep` for initial agentic search implementations.
Carefully select agent harnesses for optimal performance.
Design tool output presentation for clarity.
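One way to act on these recommendations is to score every combination of harness, retriever, and result style on the same questions. The sketch below assumes a full factorial grid, which the summary does not confirm, and `run_question` is a hypothetical callback that you would implement to run one question through a configuration and return whether the answer was correct.

```python
from itertools import product
from statistics import mean

HARNESSES = ["Chronos", "Claude Code", "Codex", "Gemini CLI"]
RETRIEVERS = ["grep", "vector"]
RESULT_STYLES = ["inline", "file"]


def run_grid(run_question, questions: list[str]) -> dict[tuple[str, str, str], float]:
    """Return accuracy per (harness, retriever, result_style) configuration.

    run_question(harness, retriever, style, question) -> bool is supplied by
    the caller. questions must be non-empty, since mean() rejects empty input.
    """
    scores = {}
    for config in product(HARNESSES, RETRIEVERS, RESULT_STYLES):
        scores[config] = mean(
            1.0 if run_question(*config, q) else 0.0 for q in questions
        )
    return scores
```

Holding the question set fixed across all cells separates the effect of the retriever from the effect of the harness and the presentation style.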

Topics

LLM Agents
Agentic Search
Grep Retrieval
Vector Retrieval
Agent Harnesses

Best for: AI Architect, Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.