How Much Static Structure Do Code Agents Need? A Study of Deterministic Anchoring

2026-06-26 · Source: cs.SE updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Expert, extended

Summary

The study introduces "deterministic anchoring," a method that augments LLM-based code agents by injecting lightweight static-analysis facts into source files as plain-text comments. The goal is to make agent navigation more disciplined and reproducible, countering the stochastic behavior of current grep-first agents. On SWE-bench Lite and SWE-bench Verified, basic topological annotations (call and inheritance links) improved function-level localization by +2.2 pp Func@5 on Lite and +1.2 pp on Verified, and shortened interaction trajectories by 1.6 rounds on Lite. The effect is scale-sensitive: large, hub-heavy repositories, for instance, benefit more from inverse-only links. The anchors also stabilize agent behavior, raising link-following rates from 0.15–0.18 to 0.21–0.24 and roughly halving run-to-run variance, leading to a +3.4 pp Pass@1 gain on medium-scale projects at a cost of approximately 10% more input tokens.

Key takeaway

AI engineers deploying LLM-based code agents should consider integrating lightweight static analysis that injects structural comments into source files. In the study, this "deterministic anchoring" made agent navigation more predictable and reproducible, roughly halving run-to-run variance on medium-scale projects. For large, hub-heavy repositories, consider inverse-only links to avoid structural distractions. The approach improves reliability and inspectability, both important for production deployments, even when the static analysis is imperfect.

Key insights

Static structure injected as plain-text comments makes LLM code agent navigation disciplined and reproducible, not just "smarter."

Principles

Deterministic anchors improve agent navigation predictability.
Optimal structural granularity is repository-scale sensitive.
Lightweight topology provides most localization benefit.

Method

CodeAnchor performs offline static analysis to extract structural relationships (calls, inheritance, data flow) and injects them as compact, machine-parsable plain-text comments (tags) into source files for grep-first agents.
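
As a rough illustration of the idea, the following Python sketch extracts direct call links from a single file with the standard `ast` module and writes them back as comments above each function. It is not CodeAnchor's implementation: the `# @anchor` tag format, the helper names, and the single-file scope are all assumptions made for this example, and the sketch ignores inheritance, data flow, method calls through attributes, and functions that share a name.

```python
import ast
from collections import defaultdict


def build_call_graph(source):
    """Return (funcs, calls, called_by) for direct calls between functions in one file."""
    tree = ast.parse(source)
    # Simplification: functions are keyed by bare name, so same-named methods collide.
    funcs = {n.name: n for n in ast.walk(tree)
             if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))}
    calls = defaultdict(set)
    called_by = defaultdict(set)
    for name, node in funcs.items():
        for inner in ast.walk(node):
            # Only plain-name calls such as `load(x)`; `obj.load(x)` is skipped.
            if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                callee = inner.func.id
                if callee in funcs and callee != name:
                    calls[name].add(callee)
                    called_by[callee].add(name)
    return funcs, calls, called_by


def annotate(source):
    """Insert a hypothetical '# @anchor' comment above each function that has links."""
    funcs, calls, called_by = build_call_graph(source)
    lines = source.splitlines()
    # Insert bottom-up so the line numbers of earlier definitions stay valid.
    for name, node in sorted(funcs.items(), key=lambda kv: kv[1].lineno, reverse=True):
        parts = []
        if calls[name]:
            parts.append("calls=" + ",".join(sorted(calls[name])))
        if called_by[name]:
            parts.append("called_by=" + ",".join(sorted(called_by[name])))
        if parts:
            indent = " " * node.col_offset
            lines.insert(node.lineno - 1, indent + "# @anchor " + " ".join(parts))
    return "\n".join(lines)


SAMPLE = '''def load(path):
    return open(path).read()

def parse(path):
    return load(path).split()

def main():
    return parse("data.txt")
'''

if __name__ == "__main__":
    print(annotate(SAMPLE))
```

Because the tags are ordinary comments, the annotated file still parses, and an agent that navigates with grep can find a function's callers by searching for `called_by=` without any additional tooling.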

In practice

Default to lightweight topology for medium projects.
Prune forward edges in large, hub-heavy repositories.
Reserve dense tags for implicit-dependency cases.
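
The three guidelines above could be encoded as a small policy function. The sketch below is only one possible reading: the numeric cut-offs, the `RepoProfile` fields, the policy names, and the order in which the rules are checked are hypothetical and are not taken from the study. It also assumes that "forward edges" are `calls` links and that "inverse" links are `called_by` links.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepoProfile:
    num_functions: int               # rough size of the repository
    max_in_degree: int               # callers of the most-called function (hub measure)
    has_implicit_dependencies: bool  # e.g. links that plain topology would miss


# Hypothetical thresholds; the study does not give these values.
LARGE_REPO_FUNCTIONS = 20_000
HUB_IN_DEGREE = 200


def choose_tag_policy(profile):
    """Pick a tag density following the three guidelines (ordering is an assumption)."""
    if profile.has_implicit_dependencies:
        return "dense"
    if (profile.num_functions >= LARGE_REPO_FUNCTIONS
            and profile.max_in_degree >= HUB_IN_DEGREE):
        return "inverse_only"
    return "lightweight"


def prune_forward_edges(tags):
    """Drop forward ('calls') links and keep every other link kind, such as 'called_by'."""
    return {fn: {kind: targets for kind, targets in links.items() if kind != "calls"}
            for fn, links in tags.items()}


if __name__ == "__main__":
    medium = RepoProfile(num_functions=3_000, max_in_degree=40,
                         has_implicit_dependencies=False)
    hub_heavy = RepoProfile(num_functions=50_000, max_in_degree=900,
                            has_implicit_dependencies=False)
    print(choose_tag_policy(medium))
    print(choose_tag_policy(hub_heavy))
    print(prune_forward_edges({"parse": {"calls": ["load"], "called_by": ["main"]}}))
```

In practice the thresholds would need to be tuned per codebase, since the summary reports only that the best granularity depends on repository scale, not where the boundaries lie.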

Topics

LLM Code Agents
Static Analysis
Deterministic Anchoring
Repository Navigation
SWE-bench
Trajectory Stability
Program Comprehension

Code references

mathieu0905/Code-Anchor

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.