How Much Static Structure Do Code Agents Need? A Study of Deterministic Anchoring

Source: cs.SE updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Software Development & Engineering) · Depth: Expert, extended

Summary

A study introduces "deterministic anchoring," a method that augments LLM-based code agents by injecting lightweight static-analysis facts into source files as plain-text comments. The goal is to make agent navigation more disciplined and reproducible, countering the stochastic behavior of current grep-first agents. On SWE-bench Lite and SWE-bench Verified, basic topological annotations (call and inheritance links) improved function-level localization by +2.2 pp Func@5 on Lite and +1.2 pp on Verified, and shortened interaction trajectories by 1.6 rounds on Lite. Effectiveness is scale-sensitive: large, hub-heavy repositories, for instance, benefit more from inverse-only links. The anchors also stabilize agent behavior, raising link-following rates from 0.15–0.18 to 0.21–0.24 and roughly halving run-to-run variance, leading to a +3.4 pp Pass@1 gain on medium-scale projects at a cost of approximately 10% more input tokens.

Key takeaway

AI engineers deploying LLM-based code agents should consider using lightweight static analysis to inject structural comments into the codebase. In the study, this "deterministic anchoring" made agent navigation more predictable and reproducible, roughly halving run-to-run variance on medium-scale projects. For large, hub-heavy repositories, consider inverse-only links to avoid structural distractions. The approach improves reliability and inspectability, both crucial for production deployments, even when the static analysis is imperfect.

Key insights

Static structure injected as plain-text comments makes LLM code agent navigation disciplined and reproducible, not just "smarter."

Method

CodeAnchor performs offline static analysis to extract structural relationships (calls, inheritance, data flow) and injects them as compact, machine-parsable plain-text comments (tags) into source files for grep-first agents.
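
The idea of extracting links offline and writing them back as comments can be sketched in a few lines. What follows is a minimal illustration, not CodeAnchor's implementation: the `# @anchor key=value` tag syntax, the function names `collect_links` and `annotate`, and the `inverse_only` switch are all hypothetical, and "inverse-only" is read here as "keep only callee-to-caller links," which is one plausible interpretation of the summary. The sketch covers call and inheritance links for a single Python file and leaves out data flow.

```python
import ast
from collections import defaultdict

FUNC_TYPES = (ast.FunctionDef, ast.AsyncFunctionDef)


def collect_links(tree: ast.AST):
    """Collect call edges between locally defined functions, plus class bases.

    Approximation: functions are matched by bare name only, so attribute
    calls (obj.method()) and same-named functions in different scopes are
    not distinguished.
    """
    defined = {n.name for n in ast.walk(tree) if isinstance(n, FUNC_TYPES)}
    calls = defaultdict(set)      # caller -> callees (forward links)
    called_by = defaultdict(set)  # callee -> callers (inverse links)
    bases = {}                    # class name -> base-class expressions
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            # ast.unparse requires Python 3.9+
            bases[node.name] = [ast.unparse(b) for b in node.bases]
        elif isinstance(node, FUNC_TYPES):
            for sub in ast.walk(node):
                if (isinstance(sub, ast.Call)
                        and isinstance(sub.func, ast.Name)
                        and sub.func.id in defined):
                    calls[node.name].add(sub.func.id)
                    called_by[sub.func.id].add(node.name)
    return calls, called_by, bases


def annotate(source: str, inverse_only: bool = False) -> str:
    """Return `source` with hypothetical '# @anchor' tags above each def/class."""
    tree = ast.parse(source)
    calls, called_by, bases = collect_links(tree)
    inserts = {}  # 1-based line number of a def/class -> tag lines placed above it
    for node in ast.walk(tree):
        tags = []
        if isinstance(node, ast.ClassDef):
            if bases[node.name] and not inverse_only:
                tags.append("inherits=" + ",".join(bases[node.name]))
        elif isinstance(node, FUNC_TYPES):
            if calls[node.name] and not inverse_only:
                tags.append("calls=" + ",".join(sorted(calls[node.name])))
            if called_by[node.name]:
                tags.append("called_by=" + ",".join(sorted(called_by[node.name])))
        if tags:
            indent = " " * node.col_offset
            inserts[node.lineno] = [f"{indent}# @anchor {t}" for t in tags]
    out = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        out.extend(inserts.get(lineno, []))  # tags go directly above the definition
        out.append(line)
    return "\n".join(out) + "\n"


SAMPLE = '''class Base:
    pass


class Child(Base):
    pass


def helper():
    return 1


def main():
    return helper()
'''

if __name__ == "__main__":
    print(annotate(SAMPLE))
```

Because the tags are ordinary comments, the annotated file still parses and runs unchanged, and an agent that navigates with grep can find callers of `helper` by searching for `called_by=` or `calls=helper` without any extra tooling. Passing `inverse_only=True` drops the forward `calls=` and `inherits=` tags, which keeps the annotation volume down around heavily used hub functions.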

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.