VulKey: Automated Vulnerability Repair Guided by Domain-Specific Repair Patterns

2026-06-30 · Source: cs.SE updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy, Software Development & Engineering · Depth: Expert, extended

Summary

VulKey is an LLM-based Automated Vulnerability Repair (AVR) framework that integrates structured security knowledge, addressing a gap: existing LLM approaches struggle to make effective use of sources such as CWE and NVD. It proposes a three-level abstraction for repair strategies (CWE type, syntactic actions, and semantic key elements), which the authors present as more general and semantically richer than prior methods. Implemented as a two-stage pipeline, VulKey first predicts an appropriate repair pattern using expert knowledge matching, then generates secure patches with a pattern-guided, fine-tuned LLM. On the real-world C/C++ dataset PrimeVul, VulKey achieves 31.5% repair accuracy, outperforming the best baseline by 7.6% and finishing ahead of tools such as VulMaster and GPT-5. It also demonstrates cross-language generalizability, with state-of-the-art performance on the Java benchmark Vul4J.

Key takeaway

AI scientists and machine learning engineers developing automated vulnerability repair tools should prioritize integrating structured, hierarchical expert security knowledge. VulKey's three-level abstraction (CWE type, syntactic actions, semantic key elements) is reported to improve both patch generation accuracy and generalizability across languages. Consider decoupling pattern matching from code generation so that each stage can be optimized independently and adapted to new vulnerability types.

Key insights

Structured expert knowledge, hierarchically abstracted, significantly enhances LLM-based automated vulnerability repair.

Principles

Hierarchical patterns improve repair generality.
Decouple pattern matching from patch generation.
Semantic key elements disambiguate repair strategies.
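To make the third principle concrete, here is a minimal sketch of how a three-level pattern might be represented. The class name, field names, and label strings (`add_if_check`, `array_index`, `buffer_length`) are illustrative assumptions; this summary does not give VulKey's actual pattern vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepairPattern:
    """Three-level repair pattern: CWE type, syntactic action, semantic key element."""
    cwe: str          # level 1: vulnerability class, e.g. "CWE-125"
    action: str       # level 2: syntactic edit (hypothetical label)
    key_element: str  # level 3: what the edit targets (hypothetical label)


# Two patterns that share a CWE type and an action.
# Only the key element tells the two repair strategies apart.
check_index = RepairPattern("CWE-125", "add_if_check", "array_index")
check_length = RepairPattern("CWE-125", "add_if_check", "buffer_length")


def same_up_to_action(a: RepairPattern, b: RepairPattern) -> bool:
    """Return True when two patterns agree on CWE type and action."""
    return (a.cwe, a.action) == (b.cwe, b.action)
```

In this sketch, `same_up_to_action(check_index, check_length)` is true while the two patterns themselves compare unequal, which is the ambiguity a two-level scheme would leave unresolved.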

Method

VulKey uses a two-stage pipeline: a CodeT5p-based matcher predicts (Action, Key Element) patterns from CWE type and vulnerable code, then a progressively fine-tuned LLM generates patches guided by these patterns.
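The two-stage shape described above can be sketched as two interchangeable callables. Everything below is a stand-in under stated assumptions: `rule_based_matcher` is a toy rule rather than the CodeT5p-based matcher, `template_generator` is a toy template rather than a fine-tuned LLM, and the pattern labels are hypothetical.

```python
from typing import Callable, Tuple

Pattern = Tuple[str, str]  # (action, key_element)
Matcher = Callable[[str, str], Pattern]
Generator = Callable[[str, str, Pattern], str]


def rule_based_matcher(cwe: str, code: str) -> Pattern:
    """Stage 1 stand-in: predict a pattern from CWE type and vulnerable code."""
    if cwe == "CWE-476" and "->" in code:
        return ("add_if_check", "null_pointer")
    return ("unknown", "unknown")


def template_generator(cwe: str, code: str, pattern: Pattern) -> str:
    """Stage 2 stand-in: produce a patch guided by the predicted pattern."""
    if pattern == ("add_if_check", "null_pointer"):
        # Take the identifier immediately before the first "->".
        pointer = code.split("->")[0].split()[-1]
        return f"if ({pointer} == NULL) return -1;\n{code}"
    return code  # no applicable pattern: leave the code unchanged


def repair(cwe: str, code: str, matcher: Matcher, generator: Generator) -> str:
    """Run the two stages in sequence; either stage can be swapped independently."""
    pattern = matcher(cwe, code)
    return generator(cwe, code, pattern)


patched = repair("CWE-476", "return node->value;",
                 rule_based_matcher, template_generator)
```

Because `repair` only depends on the two callable signatures, a learned matcher or a different generator could replace either stand-in without touching the other stage.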

In practice

Abstract security fixes into three levels: CWE type, syntactic action, and semantic key element.
Fine-tune LLMs with general bug-fix and security-specific data.
Use context-aware pattern matching for targeted fixes.
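As a rough illustration of the second and third points, the sketch below formats a pattern-guided training prompt and orders training data into stages. The prompt layout, the stage names, and the general-then-security ordering are assumptions made for illustration; the summary says only that fine-tuning is progressive and uses both kinds of data.

```python
from typing import Iterator, List, Tuple


def format_example(cwe: str, action: str, key_element: str, code: str) -> str:
    """Build a pattern-guided prompt (hypothetical layout)."""
    return (
        f"CWE: {cwe}\n"
        f"Action: {action}\n"
        f"Key element: {key_element}\n"
        f"Vulnerable code:\n{code}\n"
        "Fixed code:\n"
    )


def progressive_schedule(
    stages: List[Tuple[str, List[str]]]
) -> Iterator[Tuple[str, str]]:
    """Yield (stage_name, example) pairs, finishing each stage before the next."""
    for name, examples in stages:
        for example in examples:
            yield name, example


# Assumed ordering: general bug-fix data first, security-specific data second.
stages = [
    ("general_bug_fix", ["Buggy code:\nint x = y / z;\nFixed code:\n"]),
    ("security_specific", [
        format_example("CWE-476", "add_if_check", "null_pointer",
                       "return node->value;"),
    ]),
]
ordered_stage_names = [name for name, _ in progressive_schedule(stages)]
```

A real training loop would feed each yielded example to the fine-tuning framework of choice; that part is intentionally left out here.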

Topics

Automated Vulnerability Repair
Large Language Models
CWE
CodeT5p
PrimeVul Dataset
Hierarchical Abstraction
Software Security

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Security Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.