VulnAgent-X: Evidence-Calibrated Multi-Agent Auditing for Repository-Level Vulnerability Detection

2026-06-04 · Source: cs.SE updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy, Software Development & Engineering · Depth: Expert, long

Summary

VulnAgent-X is a layered agentic framework for repository-level software vulnerability detection. It targets two limitations of existing methods: reliance on local code views and one-shot prediction. The framework integrates lightweight risk screening, bounded context expansion, specialized analysis agents, selective dynamic verification, and evidence fusion into a unified pipeline. Experiments on function-level (Devign, Big-Vul, PrimeVul) and just-in-time (JIT) vulnerability benchmarks show that VulnAgent-X outperforms static baselines, encoder-based models (CodeBERT, GraphCodeBERT), and simpler agentic variants. It also achieves better localization and a balanced performance-cost trade-off: compared with "always-full" execution, it uses 39.6% fewer tokens and 35.1% less time on average, with 62.4% of samples resolved via early exit. Key settings include a suspicious-region budget K=8 and a context budget B=6,000 tokens.

Key takeaway

For AI security engineers evaluating repository-level vulnerability detection tools, VulnAgent-X offers an evidence-calibrated approach that is reported to reduce false positives and improve localization. Consider adopting multi-agent frameworks that combine staged analysis with selective dynamic verification. This approach improves detection quality and yields interpretable results, which matters when managing security risk in complex software projects.

Key insights

Vulnerability detection is improved by a staged, evidence-driven auditing process using multi-agent collaboration and selective verification.

Principles

Decompose vulnerability detection into staged inference.
Fuse diverse evidence for robust security decisions.
Prioritize selective dynamic verification for high-risk findings.
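
The evidence-fusion principle can be illustrated with a minimal Python sketch. It assumes each analysis agent returns a support score in [0, 1] and that a sceptic-style agent returns a counter-evidence score in [0, 1]. The function name `fuse`, the per-agent weights, and the multiplicative discount are hypothetical choices made for illustration; the summary does not specify the paper's actual fusion rule.

```python
from typing import Dict


def fuse(support: Dict[str, float], counter: float, weights: Dict[str, float]) -> float:
    """Combine per-agent support scores and discount them by counter-evidence.

    Assumption: a weighted mean followed by a multiplicative discount.
    This is an illustrative rule, not the rule used by VulnAgent-X.
    """
    total_weight = sum(weights[name] for name in support)
    if total_weight == 0:
        return 0.0
    # Weighted mean of the supporting evidence.
    base = sum(weights[name] * score for name, score in support.items()) / total_weight
    # Strong counter-evidence (counter close to 1) pulls the score toward 0.
    fused = base * (1.0 - counter)
    # Clamp to [0, 1] in case inputs fall slightly outside the expected range.
    return max(0.0, min(1.0, fused))


if __name__ == "__main__":
    scores = {"semantic": 0.8, "security": 0.6}
    equal_weights = {"semantic": 1.0, "security": 1.0}
    print(fuse(scores, counter=0.5, weights=equal_weights))
```

With no counter-evidence the fused score equals the weighted mean, so the sceptic term can only lower a finding's score in this sketch, never raise it.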

Method

VulnAgent-X performs fast risk screening, expands context, dispatches specialized agents (Router, Semantic, Security, Logic Bug, Sceptic), selectively verifies high-risk findings, and fuses evidence for a final decision.
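
The staged flow described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the callables `screen`, `agents`, and `verify`, the thresholds `low` and `high`, and the mean-based fusion are all hypothetical, and context expansion is omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AuditResult:
    vulnerable: bool
    score: float
    trace: List[str] = field(default_factory=list)  # stages that actually ran


def audit(
    code: str,
    screen: Callable[[str], float],
    agents: Dict[str, Callable[[str], float]],
    verify: Callable[[str], float],
    low: float = 0.2,   # assumed early-exit threshold
    high: float = 0.8,  # assumed escalation threshold
) -> AuditResult:
    trace = ["screen"]
    risk = screen(code)
    if risk < low:
        # Early exit: cheap screening is confident the sample is benign.
        return AuditResult(False, risk, trace)

    scores = [risk]
    for name, agent in agents.items():
        trace.append(name)
        scores.append(agent(code))
    fused = sum(scores) / len(scores)  # assumed fusion: plain mean

    if fused >= high:
        # Selective verification: only high-risk findings pay this cost.
        trace.append("verify")
        fused = 0.5 * fused + 0.5 * verify(code)

    return AuditResult(fused >= 0.5, fused, trace)


if __name__ == "__main__":
    result = audit(
        "strcpy(buf, user_input);",
        screen=lambda c: 0.9,
        agents={"semantic": lambda c: 0.9, "security": lambda c: 0.9},
        verify=lambda c: 1.0,
    )
    print(result.vulnerable, result.trace)
```

The `trace` field records which stages ran, so the cost of a decision stays visible: a sample that exits after screening never invokes an agent or the verifier.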

In practice

Implement a Sceptic Agent to reduce false positives by seeking counter-evidence.
Use threshold-based escalation to balance detection reliability and runtime cost.
Limit context expansion to B=6,000 tokens for efficiency.
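
A bounded context expansion step might look like the following sketch, which keeps the K highest-risk regions and packs them greedily into a token budget. The defaults K=8 and B=6,000 come from the summary; the function names, the `(risk, snippet)` input format, and the whitespace-based token count are assumptions made for illustration. A real system would use the model's own tokenizer.

```python
from typing import List, Tuple


def count_tokens(text: str) -> int:
    # Crude proxy: whitespace-separated words. A real tokenizer would differ.
    return len(text.split())


def expand_context(
    regions: List[Tuple[float, str]],  # (risk score, code snippet) pairs
    k: int = 8,          # suspicious-region budget K
    budget: int = 6000,  # context budget B, in tokens
) -> List[str]:
    # Keep only the k highest-risk regions.
    top = sorted(regions, key=lambda r: r[0], reverse=True)[:k]
    chosen: List[str] = []
    used = 0
    for _, snippet in top:
        cost = count_tokens(snippet)
        if used + cost > budget:
            # Skip regions that would overflow the budget; smaller ones may still fit.
            continue
        chosen.append(snippet)
        used += cost
    return chosen


if __name__ == "__main__":
    candidates = [(0.9, "a b c"), (0.5, "d e f g"), (0.7, "h i")]
    print(expand_context(candidates, k=2, budget=5))
```

Skipping an oversized region instead of stopping at it lets smaller, lower-ranked regions use the remaining budget, at the price of not strictly preserving risk order in the selected context.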

Topics

Vulnerability Detection
Multi-Agent Systems
Repository Analysis
Code Security Auditing
Large Language Models
Dynamic Verification

Code references

xiaolu-666113/Vlun-Agent-X

Best for: Research Scientist, AI Scientist, AI Security Engineer, Software Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.