How Much Static Structure Do Code Agents Need? A Study of Deterministic Anchoring
Summary
The study introduces "deterministic anchoring," a method that augments LLM-based code agents by injecting lightweight static-analysis facts into source files as plain-text comments. The goal is to make agent navigation more disciplined and reproducible, countering the stochastic behavior of current grep-first agents. On SWE-bench Lite and SWE-bench Verified, basic topological annotations (call and inheritance links) improved function-level localization by +2.2 pp Func@5 on Lite and +1.2 pp on Verified, and shortened interaction trajectories by 1.6 rounds on Lite. The benefit is scale-sensitive: large, hub-heavy repositories, for instance, benefit more from inverse-only links. The anchors also stabilize agent behavior, raising link-following rates from 0.15–0.18 to 0.21–0.24 and roughly halving run-to-run variance, leading to a +3.4 pp Pass@1 gain on medium-scale projects, at a cost of approximately 10% more input tokens.
Key takeaway
For AI engineers deploying LLM-based code agents: integrate lightweight static analysis that injects structural comments into the codebase. This "deterministic anchoring" makes agent navigation more predictable and reproducible, roughly halving run-to-run variance on medium-scale projects. For large, hub-heavy repositories, consider inverse-only links to avoid structural distractions. The approach improves reliability and inspectability, both crucial for production deployments, even when the static analysis is imperfect.
Key insights
Static structure injected as plain-text comments makes LLM code agent navigation disciplined and reproducible, not just "smarter."
Principles
- Deterministic anchors improve agent navigation predictability.
- Optimal structural granularity is repository-scale sensitive.
- Lightweight topology provides most localization benefit.
Method
CodeAnchor performs offline static analysis to extract structural relationships (calls, inheritance, data flow) and injects them as compact, machine-parsable plain-text comments (tags) into source files for grep-first agents.
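As a rough illustration of the idea, the sketch below uses Python's standard `ast` module to extract call and inheritance links from a single module and write them back as comment tags above each definition. It is a minimal sketch and not CodeAnchor's implementation: the `# @anchor` tag syntax, the function names `extract_links` and `annotate`, and the restriction to top-level, same-module definitions are all assumptions made for the example. Data-flow links are omitted.

```python
import ast
from collections import defaultdict

FUNC_DEF = (ast.FunctionDef, ast.AsyncFunctionDef)


def extract_links(tree: ast.Module):
    """Collect call and inheritance links among top-level definitions.

    Only direct calls by bare name to functions defined in the same
    module are resolved; this is a deliberate simplification.
    """
    defined = {n.name for n in tree.body if isinstance(n, FUNC_DEF)}
    calls, called_by, bases = defaultdict(set), defaultdict(set), {}
    for node in tree.body:
        if isinstance(node, FUNC_DEF):
            for sub in ast.walk(node):
                if (isinstance(sub, ast.Call)
                        and isinstance(sub.func, ast.Name)
                        and sub.func.id in defined):
                    calls[node.name].add(sub.func.id)       # forward edge
                    called_by[sub.func.id].add(node.name)   # inverse edge
        elif isinstance(node, ast.ClassDef):
            bases[node.name] = [b.id for b in node.bases
                                if isinstance(b, ast.Name)]
    return calls, called_by, bases


def annotate(source: str) -> str:
    """Return the source with hypothetical '# @anchor' tags inserted."""
    tree = ast.parse(source)
    calls, called_by, bases = extract_links(tree)
    inserts = {}  # 0-based line index -> comment lines to place before it
    for node in tree.body:
        tags = []
        if isinstance(node, FUNC_DEF):
            if calls[node.name]:
                tags.append("# @anchor calls: " + ", ".join(sorted(calls[node.name])))
            if called_by[node.name]:
                tags.append("# @anchor called_by: " + ", ".join(sorted(called_by[node.name])))
        elif isinstance(node, ast.ClassDef) and bases.get(node.name):
            tags.append("# @anchor inherits: " + ", ".join(bases[node.name]))
        if tags:
            # Place tags above any decorators so the definition stays intact.
            first = min([node.lineno] + [d.lineno for d in node.decorator_list])
            inserts[first - 1] = tags
    out = []
    for i, line in enumerate(source.splitlines()):
        out.extend(inserts.get(i, []))
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    demo = (
        "def helper():\n    return 1\n\n"
        "def main():\n    return helper()\n\n"
        "class Base:\n    pass\n\n"
        "class Child(Base):\n    pass\n"
    )
    print(annotate(demo))
```

Because the tags are ordinary comments, a grep-first agent can find them with the same text search it already uses, for example by searching for `@anchor called_by` to enumerate the callers of a function.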
In practice
- Default to lightweight topology for medium projects.
- Prune forward edges in large, hub-heavy repositories.
- Reserve dense tags for implicit-dependency cases.
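The three recommendations above can be read as a policy that picks a tag set per repository. The sketch below is one hypothetical way to encode that policy. The numeric thresholds, the `RepoStats` fields, the tag-kind names, and the rule that implicit dependencies take priority over repository size are placeholders chosen for illustration; the summary does not specify any of them.

```python
from dataclasses import dataclass

# Placeholder thresholds: the summary gives no numeric cut-offs.
LARGE_REPO_FUNCTIONS = 10_000
HUB_FAN_IN = 50

# Assumed tag kinds per policy. "inverse_only" drops forward edges.
ALLOWED_TAGS = {
    "topology": {"calls", "called_by", "inherits"},
    "inverse_only": {"called_by"},
    "dense": {"calls", "called_by", "inherits", "data_flow"},
}


@dataclass(frozen=True)
class RepoStats:
    num_functions: int
    max_fan_in: int  # largest number of callers of any one function
    has_implicit_dependencies: bool = False


def choose_policy(stats: RepoStats) -> str:
    """Pick a tag policy following the three rules of thumb."""
    if stats.has_implicit_dependencies:
        return "dense"
    if (stats.num_functions >= LARGE_REPO_FUNCTIONS
            and stats.max_fan_in >= HUB_FAN_IN):
        return "inverse_only"
    return "topology"


def filter_tags(tags: dict[str, list[str]], policy: str) -> dict[str, list[str]]:
    """Keep only the tag kinds the chosen policy allows."""
    allowed = ALLOWED_TAGS[policy]
    return {kind: targets for kind, targets in tags.items() if kind in allowed}


if __name__ == "__main__":
    stats = RepoStats(num_functions=20_000, max_fan_in=80)
    policy = choose_policy(stats)
    print(policy, filter_tags({"calls": ["a"], "called_by": ["b"]}, policy))
```

In a real deployment the thresholds would need to be calibrated against the target repositories rather than taken from this sketch.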
Topics
- LLM Code Agents
- Static Analysis
- Deterministic Anchoring
- Repository Navigation
- SWE-bench
- Trajectory Stability
- Program Comprehension
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.