Exploration Structure in LLM Agents for Multi-File Change Localization

· Source: Artificial Intelligence · Field: Technology & Digital — Software Development & Engineering, Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, quick

Summary

A new study investigates exploration structures for LLM-based agents that localize multi-file changes for software issues, targeting the inefficiency of linear repository exploration when a change spans multiple subsystems. It compares linear sequential exploration against a novel non-linear approach in which domain-scoped agents are spawned in parallel. Evaluated on SWE-bench Pro and on an expanded benchmark that adds recent PRs from 2025 and 2026, domain-scoped parallel agent spawning with a small Haiku-class model achieved the highest micro F1 among the Haiku-class configurations, and on the expanded benchmark it was second only to the larger Codex 5.5 High. On the original 2020 SWE-bench Pro, a plain Sonnet LLM baseline scored a higher micro F1 by predicting fewer files, which yields high precision but low recall. Three further findings: documentation evolution remains an unresolved latent dependency, naive file system access degrades localization because agents overpredict test files, and forced multi-agent consultation is ineffective and expensive in tokens.
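
To make the precision and recall trade-off above concrete, here is a minimal sketch of micro-averaged F1 for file localization. It assumes the standard definition (true positives, false positives, and false negatives are pooled across all issues before computing the scores); the study's own evaluation script is not shown in this summary, and the function name `micro_prf` and the toy file names are hypothetical.

```python
def micro_prf(predictions, gold):
    """Micro-averaged precision, recall, F1 over per-issue file sets.

    predictions / gold: dict mapping an issue id to an iterable of file paths.
    Counts are pooled across issues (micro averaging), an assumed convention.
    """
    tp = fp = fn = 0
    for issue_id, gold_files in gold.items():
        pred_set = set(predictions.get(issue_id, ()))
        gold_set = set(gold_files)
        tp += len(pred_set & gold_set)   # correctly predicted files
        fp += len(pred_set - gold_set)   # predicted but not changed
        fn += len(gold_set - pred_set)   # changed but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data: a conservative predictor names one correct file per issue.
gold = {
    "issue-a": ["x.py", "y.py", "z.py", "w.py"],
    "issue-b": ["m.py", "n.py"],
}
conservative = {"issue-a": ["x.py"], "issue-b": ["m.py"]}

precision, recall, f1 = micro_prf(conservative, gold)
print(precision, recall, f1)
```

In this toy case every predicted file is correct, so precision is perfect, yet four of the six changed files are missed. A predictor that names few files can therefore post a respectable micro F1 while still leaving most of a multi-file change unlocalized.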

Key takeaway

For AI engineers building LLM agents that localize multi-file code changes, consider non-linear, domain-scoped parallel exploration before defaulting to linear exploration. In the study, this structure gave a Haiku-class model the highest micro F1 among the Haiku-class configurations. Avoid naive file system access, which degraded localization through test-file overprediction, and be wary of forced multi-agent consultation, which the study found ineffective and costly in tokens.

Key insights

In the study, non-linear, domain-scoped parallel exploration gave a Haiku-class model the highest micro F1 among Haiku-class configurations for multi-file change localization.

Principles

Method

Spawn parallel agents, each exploring a specific domain concurrently, to localize multi-file changes.
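
The method above can be sketched as a fan-out and merge over repository domains. This is an illustrative sketch, not the study's implementation: the domain map, the `explore_domain` stub (a keyword match standing in for an LLM agent call), and the file paths are all hypothetical assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def explore_domain(domain, files, issue_text):
    """Stand-in for one domain-scoped agent.

    A real agent would call an LLM restricted to this domain's files;
    here a simple keyword match on file paths plays that role.
    """
    keywords = {w.lower().strip(".,") for w in issue_text.split() if len(w) > 3}
    return [f for f in files if any(k in f.lower() for k in keywords)]


def localize(issue_text, repo_files, domains):
    """Spawn one agent per domain concurrently, then merge their candidates."""
    # Scope each agent to the files under its (assumed) path prefix.
    scoped = {
        name: [f for f in repo_files if f.startswith(prefix)]
        for name, prefix in domains.items()
    }
    with ThreadPoolExecutor(max_workers=len(scoped) or 1) as pool:
        futures = {
            name: pool.submit(explore_domain, name, files, issue_text)
            for name, files in scoped.items()
        }
        results = {name: fut.result() for name, fut in futures.items()}
    merged = sorted({f for hits in results.values() for f in hits})
    return merged, results


repo_files = [
    "api/auth/token.py",
    "api/users.py",
    "web/login_form.tsx",
    "docs/auth.md",
    "tests/test_token.py",
]
domains = {"backend": "api/", "frontend": "web/", "docs": "docs/"}

merged, results = localize("Token refresh fails after login", repo_files, domains)
print(merged)
```

Because each agent only sees files inside its own domain, paths outside every domain (here the `tests/` directory) are never proposed. That scoping is a choice made for this sketch; how the study defines and assigns domains is not described in this summary.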

Best for: Research Scientist, AI Scientist, AI Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.