Neurosymbolic Repo-level Code Localization

· Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Expert, quick

Summary

Code localization is a key component of autonomous software engineering, but the benchmarks used to evaluate it suffer from a "Keyword Shortcut" bias. Their queries contain abundant keyword references such as file paths and function names, so models can post impressive scores through superficial lexical matching rather than genuine structural reasoning. To address this, researchers formalized the Keyword-Agnostic Logical Code Localization (KA-LCL) challenge and introduced KA-LogicQuery, a diagnostic benchmark whose queries require structural reasoning and contain no naming hints. State-of-the-art approaches suffer a catastrophic performance drop on KA-LogicQuery, which exposes their lack of deterministic reasoning. In response, LogicLoc, a novel agentic framework, combines large language models (LLMs) with Datalog's logical reasoning for precise localization. LogicLoc extracts program facts from the codebase, has an LLM synthesize Datalog programs under parser-gated validation and mutation-based feedback, and executes those programs with a high-performance engine, producing accurate and verifiable localization.

Key takeaway

Research scientists developing autonomous software engineering tools should prioritize evaluating code localization models on benchmarks such as KA-LogicQuery that demand structural reasoning over lexical matching. Relying solely on current issue-driven benchmarks risks deploying systems that fail catastrophically on real-world queries that lack explicit keyword hints. Consider integrating neurosymbolic approaches, such as LogicLoc's Datalog-based framework, for more robust, verifiable, and efficient localization.

Key insights

Existing code localization benchmarks exhibit a "Keyword Shortcut" bias: models can score well through lexical matching alone, so the benchmarks do not test genuine structural reasoning.
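
One way to see the shortcut is to check whether a query names its own answer. The sketch below is a minimal illustration, not the benchmark authors' methodology: the function names `identifier_tokens` and `leaked_keywords`, the example queries, and the example file path are all hypothetical. It reports which ground-truth names (file stem or symbol) appear verbatim in the query text.

```python
import re


def identifier_tokens(text: str) -> set[str]:
    """Return the lowercased identifier-like tokens found in free text."""
    return {tok.lower() for tok in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text)}


def leaked_keywords(query: str, target_path: str, target_symbol: str) -> set[str]:
    """Return ground-truth names that the query mentions verbatim.

    A non-empty result means a purely lexical matcher could find the
    target without reasoning about program structure.
    """
    # File stem, e.g. "session_cache" from "auth/session_cache.py".
    stem = target_path.rsplit("/", 1)[-1].rsplit(".", 1)[0].lower()
    target_names = {stem, target_symbol.lower()}
    return target_names & identifier_tokens(query)


# Hypothetical issue-style query: it names both the file and the function.
issue_query = "Crash in refresh_token inside session_cache.py when the cache is cold"
# Hypothetical keyword-agnostic query: it describes structure only.
logic_query = "Find functions reachable from an entry point that never return a value"

print(sorted(leaked_keywords(issue_query, "auth/session_cache.py", "refresh_token")))
print(sorted(leaked_keywords(logic_query, "auth/session_cache.py", "refresh_token")))
```

The first query leaks both names, while the second leaks none, which is the property a keyword-agnostic benchmark is meant to enforce.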

Principles

Method

LogicLoc extracts program facts from the codebase, uses an LLM to synthesize Datalog programs over those facts, validates the programs with a parser gate and mutation-based feedback, and then executes them with a high-performance engine for accurate, verifiable code localization.
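
The fact-extraction and execution steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not LogicLoc's implementation: the summary does not give the fact schema or name the engine, so the predicates `calls`, `raises`, and `may_raise` are hypothetical, the toy `SOURCE` is invented, and a hand-written fixpoint loop stands in for a real Datalog engine. The LLM synthesis and validation steps are omitted. The query, "which functions can propagate an exception", contains no naming hints.

```python
import ast

# Toy codebase used only for this illustration.
SOURCE = '''
def load(path):
    return parse(read(path))

def read(path):
    return open(path).read()

def parse(text):
    if not text:
        raise ValueError("empty")
    return text.split()

def helper():
    return 1
'''


def extract_facts(source):
    """Extract two hypothetical fact relations from Python source.

    calls:  set of (caller, callee) pairs for direct calls by plain name
    raises: set of function names whose body contains a raise statement
    """
    calls, raises = set(), set()
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, ast.FunctionDef):
            continue
        for node in ast.walk(func):
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                calls.add((func.name, node.func.id))
            elif isinstance(node, ast.Raise):
                raises.add(func.name)
    return calls, raises


def may_raise(calls, raises):
    """Naive fixpoint evaluation of the Datalog-style rules:

        may_raise(F) :- raises(F).
        may_raise(F) :- calls(F, G), may_raise(G).
    """
    derived = set(raises)
    changed = True
    while changed:
        changed = False
        for caller, callee in calls:
            if callee in derived and caller not in derived:
                derived.add(caller)
                changed = True
    return derived


calls, raises = extract_facts(SOURCE)
print(sorted(may_raise(calls, raises)))  # prints ['load', 'parse']
```

Because the answer is derived from explicit facts and rules, every localized function can be traced back to the call edges that justify it, which is the sense in which this style of localization is verifiable.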

Topics

Best for: Research Scientist, AI Scientist, AI Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.