Automated Root-Cause Subclassification and No-Code Fix Generation for Invalid Bug Reports
Summary
A study by Mahmut Furkan Gön et al. introduces and evaluates IssueSupport, an automated framework for subclassifying invalid bug reports and generating no-code fixes. The research establishes a standardized, root-cause-oriented taxonomy for invalid bug report subclassification, with categories such as "External System & Dependency Issues" and "Faulty Configuration." Using a manually curated benchmark from the Brave browser repository, the authors compared three approaches: vanilla LLMs, Retrieval-Augmented Generation (RAG), and agentic web search. For subclassification, RAG achieved the highest overall weighted F1-score (0.66), slightly ahead of vanilla LLMs (0.65) and agentic web search (0.64). For no-code fix generation, agentic web search with Gemini 3.1 Pro achieved the highest overall Judge LLM success rate (68.9%), ahead of vanilla LLMs (64.9%) and RAG (64.4%). The study concludes that different approaches suit different tasks: RAG performs best for subclassification, and agentic web search performs best for fix generation.
Key takeaway
For AI Engineers developing bug triage systems, consider a hybrid approach: initially ground your LLM with project-specific RAG for robust invalid bug report subclassification, then integrate agentic web search for generating actionable no-code fixes. This strategy optimizes for both accurate categorization and effective resolution, particularly for complex issues like "Wrong Version" or "Faulty Configuration," which benefit from dynamic external context. Always validate generated fixes using a Judge LLM to ensure functional correctness, rather than relying solely on semantic similarity.
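The hybrid flow described above can be sketched as a small pipeline. This is a minimal illustration, not the IssueSupport implementation: every name here (`TriageResult`, `triage_invalid_report`, and the `toy_*` functions) is hypothetical, and the toy functions are stand-ins for a real retriever, a RAG-grounded classifier, an agentic web-search fixer, and a Judge LLM.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TriageResult:
    subclass: str   # root-cause subclass assigned to the invalid report
    fix: str        # proposed no-code fix
    accepted: bool  # verdict of the judge step


def triage_invalid_report(
    report: str,
    retrieve: Callable[[str], List[str]],
    classify: Callable[[str, List[str]], str],
    search_and_fix: Callable[[str, str], str],
    judge: Callable[[str, str], bool],
) -> TriageResult:
    """Hypothetical hybrid pipeline: RAG-style classification, then fix generation, then judging."""
    context = retrieve(report)               # project-specific grounding (RAG step)
    subclass = classify(report, context)     # subclassification with retrieved context
    fix = search_and_fix(report, subclass)   # agentic web search would go here
    accepted = judge(report, fix)            # Judge LLM validation would go here
    return TriageResult(subclass, fix, accepted)


# Toy stand-ins so the sketch runs without any model or network access.
def toy_retrieve(report: str) -> List[str]:
    return ["Past report: a changed setting broke page loading; resolved without code changes."]


def toy_classify(report: str, context: List[str]) -> str:
    if "setting" in report.lower():
        return "Faulty Configuration"
    return "External System & Dependency Issues"


def toy_search_and_fix(report: str, subclass: str) -> str:
    return f"[{subclass}] Reset the changed setting to its default and retry."


def toy_judge(report: str, fix: str) -> bool:
    return bool(fix.strip())


result = triage_invalid_report(
    "Site breaks after I changed a Shields setting",
    toy_retrieve, toy_classify, toy_search_and_fix, toy_judge,
)
print(result)
```

Passing each stage in as a callable keeps the stages swappable, so a vanilla LLM, RAG, or agentic search variant can be compared per task without changing the pipeline.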
Key insights
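Automating invalid bug report subclassification and no-code fix generation can reduce software maintenance overhead. In the study, the best configurations reached a weighted F1-score of 0.66 for subclassification and a Judge LLM success rate of 68.9% for fix generation.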
Principles
- Context engineering improves LLM performance.
- Different LLM approaches suit different tasks.
- Semantic similarity metrics can be misleading.
Method
The IssueSupport framework uses LLMs, RAG, and agentic web search to classify invalid bug reports by root cause and generate no-code fixes, evaluated via F1-Score, BERTScore, and Judge LLM success rates.
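For readers unfamiliar with the weighted F1-score used for subclassification, the following is a minimal sketch of the standard definition: per-class F1 averaged with weights proportional to each class's share of the true labels. The function name `weighted_f1` and the short class labels are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from typing import List


def weighted_f1(y_true: List[str], y_pred: List[str]) -> float:
    """Support-weighted average of per-class F1 scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0

    for cls, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        f1 = (2 * tp / denom) if denom else 0.0
        score += (n / total) * f1  # weight by the class's share of true labels

    return score


# Toy labels, invented purely for illustration.
y_true = ["config", "config", "version", "external"]
y_pred = ["config", "version", "version", "external"]
print(weighted_f1(y_true, y_pred))
```

Because each class is weighted by its support, frequent subclasses dominate the score, which is worth keeping in mind when the taxonomy is imbalanced.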
In practice
- Use RAG for bug report subclassification.
- Employ agentic web search for no-code fix generation.
- Prioritize Judge LLM evaluation over BERTScore.
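The last point can be illustrated with a toy example of why a similarity score alone can mislead. This sketch uses token-level Jaccard overlap as a crude stand-in for a similarity metric; it is not BERTScore, and the function names and example strings are invented for illustration. The Judge LLM verdicts are likewise hypothetical booleans supplied by hand.

```python
from typing import List


def token_jaccard(a: str, b: str) -> float:
    """Crude lexical similarity: shared tokens over distinct tokens (a stand-in, not BERTScore)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def judge_success_rate(verdicts: List[bool]) -> float:
    """Fraction of generated fixes that a judge marked as successful."""
    if not verdicts:
        raise ValueError("verdicts must not be empty")
    return sum(verdicts) / len(verdicts)


reference_fix = "Enable hardware acceleration in settings"
candidate_fix = "Disable hardware acceleration in settings"

# 4 shared tokens out of 6 distinct tokens, yet the candidate advises the opposite action.
similarity = token_jaccard(reference_fix, candidate_fix)
print(f"similarity={similarity:.2f}")

# Hypothetical judge verdicts for four generated fixes.
print(judge_success_rate([True, False, True, True]))
```

A fix that reverses the required action still scores high on overlap, which is the kind of failure a functional judge is meant to catch.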
Topics
- Invalid Bug Reports
- Root Cause Subclassification
- No-Code Fix Generation
- Large Language Models
- Retrieval-Augmented Generation
Best for: Research Scientist, AI Engineer, NLP Engineer, AI Scientist, Machine Learning Engineer, Software Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.MA updates on arXiv.org.