Understanding, Detecting, and Repairing Real-World In-Context-Learning-Based Text-to-SQL Errors
Summary
A study of text-to-SQL generation with in-context learning (ICL) finds that Large Language Models (LLMs) frequently produce erroneous SQL: 37.3% of generated queries contain errors across four ICL-based techniques, two benchmarks (Spider and Bird), and two LLMs (GPT-3.5-Turbo-0125 and GPT-4o-2024-05-13). The researchers grouped 29 error types into 7 categories, finding that 26.0% are format-related and 30.9% are semantic. Existing repair methods offer limited correctness improvement (10.9-23.3% fixed) at high computational cost (1.03-3.82x latency) and introduce 5.3-40.1% new errors. To address this, the authors developed MapleRepair, a detection and repair framework. Compared with existing repair methods, it repairs 13.8% more queries with 84.9% fewer mis-repairs and 67.4% less overhead, processing queries in 1.2 seconds.
Key takeaway
If you are an AI Scientist or Machine Learning Engineer building text-to-SQL solutions, treat LLM-generated SQL as prone to specific, classifiable errors. Prioritize integrating rule-based error detection and repair mechanisms such as MapleRepair. In the study, this approach produced fewer mis-repairs and less computational overhead than relying solely on LLM self-correction, which improves the reliability and efficiency of text-to-SQL systems.
Key insights
Errors in LLM-generated SQL queries are widespread but fall into identifiable categories, which calls for targeted, efficient repair rather than generic re-generation.
Principles
- LLMs struggle with SQL syntax and schema comprehension.
- Execution feedback significantly aids SQL error repair.
- Untargeted LLM-based repair can worsen SQL errors.
Method
MapleRepair uses a multi-stage, rule-based detection and repair system, selectively invoking LLMs for complex errors. It prioritizes fixing syntax, schema, logic, and convention errors before addressing semantic issues.
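The staged design described above can be illustrated with a minimal Python sketch. This is not MapleRepair's actual implementation: the `Stage` class, `run_pipeline`, the two toy rules, and the `KNOWN_TABLES` set are all hypothetical names introduced here. It only shows the general idea of running cheap rule-based detectors in a fixed order and calling an LLM solely when a detected error has no rule-based fix.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Stage:
    """One detection/repair stage (hypothetical structure, for illustration)."""
    name: str
    detect: Callable[[str], Optional[str]]       # error description, or None if clean
    repair: Callable[[str, str], Optional[str]]  # repaired SQL, or None if no rule applies


def run_pipeline(
    sql: str,
    stages: List[Stage],
    llm_repair: Callable[[str, str], str],
) -> Tuple[str, List[str]]:
    """Run stages in order; fall back to the LLM only when a rule cannot fix an error."""
    log: List[str] = []
    for stage in stages:
        error = stage.detect(sql)
        if error is None:
            continue  # nothing to repair at this stage, so no LLM call is made
        fixed = stage.repair(sql, error)
        if fixed is None:
            fixed = llm_repair(sql, error)  # selective LLM invocation
            log.append(f"{stage.name}: llm")
        else:
            log.append(f"{stage.name}: rule")
        sql = fixed
    return sql, log


# Toy syntax rule (assumption): close unbalanced parentheses at the end of the query.
def detect_unbalanced(sql: str) -> Optional[str]:
    missing = sql.count("(") - sql.count(")")
    return f"missing {missing} closing parenthesis" if missing > 0 else None


def repair_unbalanced(sql: str, error: str) -> Optional[str]:
    return sql + ")" * (sql.count("(") - sql.count(")"))


# Toy schema rule (assumption): flag table names that are not in a known set.
KNOWN_TABLES = {"singer", "concert"}


def detect_unknown_table(sql: str) -> Optional[str]:
    tokens = sql.replace("(", " ").replace(")", " ").rstrip(";").split()
    for i, tok in enumerate(tokens[:-1]):
        if tok.upper() in ("FROM", "JOIN") and tokens[i + 1].lower() not in KNOWN_TABLES:
            return f"unknown table {tokens[i + 1]}"
    return None


def no_rule_fix(sql: str, error: str) -> Optional[str]:
    return None  # this toy stage has no rule-based repair, so the LLM is consulted


STAGES = [
    Stage("syntax", detect_unbalanced, repair_unbalanced),
    Stage("schema", detect_unknown_table, no_rule_fix),
]
```

In this sketch a query that passes every detector never reaches the LLM, and each LLM call receives one specific error description instead of a generic request to regenerate the query.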
In practice
- Implement rule-based checks for common SQL syntax errors.
- Provide execution results and value specifications to LLMs.
- Avoid blanket LLM re-generation for all SQL queries.
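The three practices above can be combined in a small sketch built on Python's standard `sqlite3` module. It is an illustration under stated assumptions, not the paper's tooling: the toy `singer` schema, the prompt wording, and the helper names (`make_db`, `check_sql`, `sample_values`, `build_repair_prompt`, `repair_if_needed`) are hypothetical, and `llm` stands for any caller-supplied function that maps a prompt string to a SQL string. The sketch also assumes the input is a single SQL statement.

```python
import sqlite3
from typing import Callable, List, Optional

# Toy schema (assumption), used only to make the example self-contained.
SCHEMA = "CREATE TABLE singer (singer_id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO singer (name, age) VALUES (?, ?)",
        [("Ann", 30), ("Bob", 25)],
    )
    return conn


def check_sql(conn: sqlite3.Connection, sql: str) -> Optional[str]:
    """Rule-based check: let SQLite compile the query and report any error text."""
    try:
        # EXPLAIN forces SQLite to parse and plan the statement, so syntax errors
        # and unknown tables or columns surface as exceptions.
        conn.execute("EXPLAIN " + sql)
        return None
    except (sqlite3.Error, sqlite3.Warning) as exc:
        return str(exc)


def sample_values(conn: sqlite3.Connection, table: str, column: str, limit: int = 3) -> List:
    """Collect a few distinct stored values to show the LLM the real value format."""
    rows = conn.execute(
        f'SELECT DISTINCT "{column}" FROM "{table}" LIMIT {int(limit)}'
    ).fetchall()
    return [row[0] for row in rows]


def build_repair_prompt(question: str, sql: str, error: str, values: List) -> str:
    """Assemble a targeted repair prompt that carries the execution feedback."""
    return (
        f"Question: {question}\n"
        f"Schema: {SCHEMA}\n"
        f"Example values of singer.name: {values}\n"
        f"Faulty SQL: {sql}\n"
        f"Database error: {error}\n"
        "Return a corrected SQL query only."
    )


def repair_if_needed(
    conn: sqlite3.Connection,
    question: str,
    sql: str,
    llm: Callable[[str], str],
) -> str:
    """Call the LLM only when the rule-based check actually finds a problem."""
    error = check_sql(conn, sql)
    if error is None:
        return sql  # no blanket re-generation for queries that already compile
    values = sample_values(conn, "singer", "name")
    return llm(build_repair_prompt(question, sql, error, values))
```

A check like this only catches errors the database engine itself reports. Semantic errors, where the query runs but answers the wrong question, would need separate detection.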
Topics
- Text-to-SQL
- Large Language Models
- In-Context Learning
- SQL Error Detection
- SQL Error Repair
- MapleRepair
- Database Query Generation
Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.