PracRepair: LLM-Empowered Automated Program Repair Inspired by Human-Like Debugging Practices

Source: cs.SE updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Expert, quick

Summary

PracRepair is an LLM-based automated program repair (APR) framework modeled on how human developers debug. It targets a limitation of existing LLM-based APR methods: they underuse dynamic information. Current approaches typically rely on static context, error messages, and coarse pass/fail validation, and so overlook what actually happens during the failing execution and during patch validation. PracRepair first builds an on-demand static-dynamic context from the buggy program and its failure executions. It then uses question-driven failure diagnosis to produce explicit repair hypotheses. Finally, it refines patches iteratively, guided by validation diagnostics and by trace-level behavioral changes between attempts. In evaluations on Defects4J V1.2 and V2.0, PracRepair is reported to outperform the compared approaches. With GPT-3.5, it fixed 139 bugs on V1.2 and 136 on V2.0. With GPT-4o, those numbers rose to 162 and 171 bugs, respectively. The framework is also reported to generalize to Real-World Bugs (RWB), where it achieved the top results across several foundation models.

Key takeaway

Machine learning engineers building automated program repair systems should consider feeding dynamic execution and validation feedback into their LLM-based frameworks. PracRepair's results indicate that on-demand static-dynamic context combined with iterative patch refinement increases the number of bugs fixed. This human-inspired debugging approach improved results on both Defects4J and real-world bugs, which suggests a path toward more robust repair systems.
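To make "trace-level behavioral changes" concrete, the following is a minimal Python sketch of one way such a signal could be collected: record which lines a failing input executes before and after a candidate patch, then compare the two traces. It is an illustration only, not PracRepair's instrumentation (the paper evaluates on Java bugs from Defects4J). The helpers `record_trace` and `trace_delta` and the toy functions `buggy_max` and `patched_max` are hypothetical names introduced here.

```python
import sys
from typing import Callable, Dict, List, Tuple


def record_trace(func: Callable, *args) -> Tuple[List[int], object]:
    # Run func(*args); return the line offsets executed inside func
    # (relative to its `def` line) and the returned value or raised exception.
    code = func.__code__
    lines: List[int] = []

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # ignore frames of any other function
        if event == "line":
            lines.append(frame.f_lineno - code.co_firstlineno)
        return tracer

    sys.settrace(tracer)
    try:
        outcome = func(*args)
    except Exception as exc:  # keep the failure as data instead of crashing
        outcome = exc
    finally:
        sys.settrace(None)
    return lines, outcome


def trace_delta(before: List[int], after: List[int]) -> Dict[str, List[int]]:
    # Summarize which line offsets appear only after, or only before, the patch.
    return {
        "newly_executed": sorted(set(after) - set(before)),
        "no_longer_executed": sorted(set(before) - set(after)),
    }


def buggy_max(values):
    best = 0  # bug: wrong starting value when every input is negative
    for v in values:
        if v > best:
            best = v
    return best


def patched_max(values):
    best = values[0]
    for v in values:
        if v > best:
            best = v
    return best


failing_input = [-3, -1]
buggy_trace, buggy_outcome = record_trace(buggy_max, failing_input)
patched_trace, patched_outcome = record_trace(patched_max, failing_input)
delta = trace_delta(buggy_trace, patched_trace)
print(buggy_outcome, patched_outcome, delta)
```

A delta like this tells a repair loop more than a bare pass/fail result: here it shows that the patched version reaches the assignment inside the `if`, which the buggy version never executed on the failing input.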

Key insights

PracRepair enhances LLM-based automated program repair by integrating dynamic execution and validation feedback, mimicking human debugging.

Method

PracRepair works in three steps: it constructs an on-demand static-dynamic context, performs question-driven failure diagnosis to produce explicit repair hypotheses, and iteratively refines patches using validation diagnostics and trace-level behavioral changes.
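The control flow described above can be sketched as a short loop. This is a simplified illustration under stated assumptions, not the authors' implementation: the `repair` function, the `Validation` record, the prompt layout, and the `ask_model` and `validate` callables are all hypothetical, and the model is replaced by a deterministic stub so the example runs without any API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Validation:
    passed: bool
    diagnostics: str  # e.g. a failing assertion message or a trace summary


def repair(
    buggy_code: str,
    failure_report: str,
    ask_model: Callable[[str, str], str],
    validate: Callable[[str], Validation],
    max_rounds: int = 3,
) -> Optional[str]:
    # Step 1: assemble context from the code and the observed failure.
    context = f"CODE:\n{buggy_code}\nFAILURE:\n{failure_report}"
    # Step 2: ask for an explicit hypothesis before asking for any patch.
    hypothesis = ask_model("diagnose", context)
    feedback = ""
    # Step 3: generate, validate, and refine using validation diagnostics.
    for _ in range(max_rounds):
        prompt = f"{context}\nHYPOTHESIS:\n{hypothesis}\nFEEDBACK:\n{feedback}"
        patch = ask_model("patch", prompt)
        result = validate(patch)
        if result.passed:
            return patch
        feedback = result.diagnostics
    return None


def fake_model(task: str, prompt: str) -> str:
    # Stand-in for an LLM call: improves its patch only once it sees feedback.
    if task == "diagnose":
        return "The starting value is wrong when all inputs are negative."
    if "expected -1" in prompt:
        return "def my_max(v): return max(v)"
    return "def my_max(v): return max(v + [0])"


def run_test(patch: str) -> Validation:
    # Stand-in for a test suite: load the patch and check one failing input.
    namespace: dict = {}
    exec(patch, namespace)
    got = namespace["my_max"]([-3, -1])
    if got == -1:
        return Validation(True, "")
    return Validation(False, f"expected -1, got {got}")


fixed = repair(
    "def my_max(v): ...",
    "my_max([-3, -1]) returned 0",
    fake_model,
    run_test,
)
print(fixed)
```

With the stub, the first patch fails validation, its diagnostic message is fed back into the next prompt, and the second patch passes. In a real system the `diagnostics` string is where richer signals, such as trace differences between attempts, would be carried.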

Best for: AI Scientist, Machine Learning Engineer, Software Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.