Agent trajectories as programs: fingerprinting and programming coding-agent behavior

· Source: cs.SE updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Expert, extended

Summary

This work introduces methods for comparing coding agents procedurally, defining "behavioral fingerprints" as the identifiable habits an agent shows while solving a task. The study analyzes ten agents across four scaffolds (SWE-agent, Agentless, DARS, Moatless) and models such as GPT, Claude, DeepSeek, and Qwen, and finds that agents can be identified from their fingerprints with 85.7% accuracy. Procedural representations are built with an emergent vocabulary induction technique and applied to the SWE-Bench dataset. The findings indicate behavioral similarity between models from similar release periods and between distilled pairs, with a student model and its teacher showing a Jensen-Shannon divergence of 0.25. The study also releases ProcGrep, a library for auditing agent traces that offers deterministic, programmable search over agent trajectories and outperforms LLMs at episodic search.
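To make the Jensen-Shannon figure concrete, here is a minimal sketch of how a divergence between two agents' action distributions could be computed. The action labels and the two example traces are hypothetical, and the summary does not state which representation or logarithm base the study uses; base 2 is assumed here, which bounds the value between 0 (identical distributions) and 1 (disjoint distributions).

```python
import math
from collections import Counter


def action_distribution(actions):
    """Turn a list of action labels into a relative-frequency dict."""
    counts = Counter(actions)
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two discrete distributions."""
    support = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2 over the joint support.
    m = {a: 0.5 * (p.get(a, 0.0) + q.get(a, 0.0)) for a in support}

    def kl_to_mixture(x):
        # KL(X || M); terms with zero probability contribute nothing.
        return sum(x[a] * math.log2(x[a] / m[a]) for a in x if x[a] > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


# Hypothetical action traces for a teacher and a distilled student.
teacher = action_distribution(["search", "open", "edit", "test"])
student = action_distribution(["search", "open", "open", "edit", "test"])
divergence = js_divergence(teacher, student)
```

A low value means the two agents spend their steps on a similar mix of actions. It says nothing about the order of those actions, which is why sequence-level representations are also needed.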

Key takeaway

For AI engineers shaping coding agents, procedural fingerprints inform model selection and configuration. Tools like ProcGrep let you audit agent traces with deterministic search and fine-grained behavioral analysis. That makes it possible to programmatically define and reward desired problem-solving styles, optimize for cost efficiency, and improve task-aware model routing, so that agents are evaluated holistically rather than by success rate alone.
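As an illustration of what deterministic search over a trajectory can look like, the sketch below matches a regular expression against a space-joined sequence of action labels. This is not ProcGrep's API, which the summary does not describe; the function name `find_pattern`, the action labels, and the pattern are all hypothetical, and action labels are assumed to contain no spaces.

```python
import re


def find_pattern(actions, pattern):
    """Return (start, end) action-index spans where the regex matches.

    `actions` is a list of space-free action labels; `end` is exclusive.
    """
    text = " ".join(actions)
    spans = []
    for m in re.finditer(pattern, text):
        # Each space before the match start corresponds to one preceding action.
        start = text.count(" ", 0, m.start())
        end = start + m.group(0).count(" ") + 1
        spans.append((start, end))
    return spans


# Hypothetical trace: find every "one or more edits followed by a test run".
trace = ["search", "open", "edit", "edit", "test", "edit", "test", "submit"]
edit_test_loops = find_pattern(trace, r"\bedit(?: edit)* test\b")
print(edit_test_loops)  # [(2, 5), (5, 7)]
```

Because the match is a plain regex over a fixed encoding, the same query always returns the same spans, which is the property that makes this style of search auditable in a way an LLM-based lookup is not.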

Key insights

Agent problem-solving styles are unique "fingerprints" recoverable from procedural traces, enabling new evaluation and programming methods.

Principles

Method

Develop procedural representations via emergent vocabulary induction using byte-pair encoding (BPE), and assess vocabulary stability with V-measure (K=192), the harmonic mean of homogeneity and completeness.
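The BPE step can be sketched as follows: repeatedly find the most frequent adjacent pair of actions across all trajectories and merge it into a single composite token. This is a generic BPE loop over action labels, not the study's implementation; the function names, the `+` joiner, the `min_count` threshold, and the example trajectories are assumptions, and the K=192 setting is not reproduced here.

```python
from collections import Counter


def merge_pair(seq, pair, merged):
    """Replace each non-overlapping occurrence of `pair` in `seq` with `merged`."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(merged)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def learn_bpe(sequences, num_merges, min_count=2):
    """Learn up to `num_merges` merges; return (merges, re-tokenized sequences)."""
    seqs = [list(s) for s in sequences]
    merges = []
    for _ in range(num_merges):
        counts = Counter()
        for seq in seqs:
            counts.update(zip(seq, seq[1:]))  # adjacent action pairs
        if not counts:
            break
        # Most frequent pair; ties broken lexicographically for determinism.
        pair = min(counts, key=lambda p: (-counts[p], p))
        if counts[pair] < min_count:
            break
        merges.append(pair)
        seqs = [merge_pair(s, pair, pair[0] + "+" + pair[1]) for s in seqs]
    return merges, seqs


# Hypothetical trajectories expressed as action labels.
trajectories = [
    ["search", "open", "edit", "test"],
    ["search", "open", "edit", "test", "edit", "test"],
    ["open", "edit", "test", "submit"],
]
merges, tokenized = learn_bpe(trajectories, num_merges=2)
print(merges)  # [('edit', 'test'), ('open', 'edit+test')]
```

Each learned merge is a recurring multi-step habit, so the resulting vocabulary acts as a set of candidate procedural units whose usage frequencies can then be compared across agents.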

Best for: Research Scientist, AI Engineer, Machine Learning Engineer, AI Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.