Why NL-to-SQL Systems Fail: Lessons from Building a Semantic SQL Validation Pipeline
Summary
Building robust Natural Language to SQL (NL-to-SQL) systems takes more than generating syntactically correct SQL; the harder problem is ensuring semantic correctness over large relational schemas. Initial approaches often fail because of semantically incorrect joins, hallucinated column names, invalid relationship traversal, schema ambiguity, and context retrieval failures. A key realization was that a query can execute successfully and still be logically wrong. This experience led the author to develop a layered semantic retrieval and SQL validation system built around PostgreSQL. The system evolved to represent relational metadata as a graph for dynamic join-path discovery, to format prompts carefully so that the model does not hallucinate columns, and to add a multi-stage SQL refinement layer that combines AST-based validation, schema-aware checks, and query plan analysis to detect performance issues.
Key takeaway
For AI Engineers building NL-to-SQL solutions, prioritize semantic validation and architectural layering over simple prompt engineering. Your systems must explicitly guide LLMs on relational topology and validate generated SQL against schema and performance metrics before execution. Never allow raw LLM output to directly query production databases; instead, implement a robust refinement pipeline to prevent dangerous, semantically incorrect queries that appear valid.
Key insights
Reliable NL-to-SQL systems depend on semantic correctness and robust validation, not just valid syntax.
Principles
- LLMs require explicit relationship topology guidance.
- Prompt formatting significantly impacts generation quality.
- Context quality outweighs context quantity for retrieval.
Method
Represent the relational schema as a graph, use breadth-first search (BFS) to discover join paths between tables, and inject explicit join guidance into the prompt. Implement a multi-stage SQL refinement pipeline with AST validation, schema-aware checks, and query plan analysis.
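The graph-and-BFS step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the original author's implementation: the tables, the foreign keys in `FOREIGN_KEYS`, and the helper names `build_graph`, `find_join_path`, and `join_guidance` are all hypothetical. Tables are nodes, foreign keys are undirected edges, and BFS returns the shortest chain of joins between two tables.

```python
from collections import deque

# Hypothetical foreign keys: (child_table, child_column, parent_table, parent_column).
FOREIGN_KEYS = [
    ("orders", "customer_id", "customers", "id"),
    ("order_items", "order_id", "orders", "id"),
    ("order_items", "product_id", "products", "id"),
]


def build_graph(foreign_keys):
    """Build an undirected adjacency list; each edge carries its join condition."""
    graph = {}
    for src, src_col, dst, dst_col in foreign_keys:
        condition = f"{src}.{src_col} = {dst}.{dst_col}"
        graph.setdefault(src, []).append((dst, condition))
        graph.setdefault(dst, []).append((src, condition))
    return graph


def find_join_path(graph, start, goal):
    """Return the shortest list of (table, join_condition) steps, or None if unreachable."""
    if start == goal:
        return []
    parents = {start: None}  # table -> (previous table, join condition)
    queue = deque([start])
    while queue:
        table = queue.popleft()
        for neighbor, condition in graph.get(table, []):
            if neighbor in parents:
                continue  # already visited
            parents[neighbor] = (table, condition)
            if neighbor == goal:
                # Walk back from the goal to the start to rebuild the path.
                path, node = [], goal
                while parents[node] is not None:
                    previous, cond = parents[node]
                    path.append((node, cond))
                    node = previous
                path.reverse()
                return path
            queue.append(neighbor)
    return None


def join_guidance(start, path):
    """Render a join path as text that could be placed in a prompt."""
    lines = [f"FROM {start}"]
    lines += [f"JOIN {table} ON {condition}" for table, condition in path]
    return "\n".join(lines)


if __name__ == "__main__":
    graph = build_graph(FOREIGN_KEYS)
    path = find_join_path(graph, "customers", "products")
    print(join_guidance("customers", path))
```

Because BFS explores tables in order of distance, the first path it finds uses the fewest joins. In a real schema, several shortest paths may exist between two tables, so a production system would likely need an additional rule for choosing among them.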
In practice
- Use `pglast` for AST-based SQL validation.
- Analyze query plans with `EXPLAIN (FORMAT JSON)`.
- Provide column names without table prefixes in prompts.
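As one illustration of the query plan step, the sketch below inspects the output of `EXPLAIN (FORMAT JSON)` and flags plans that look expensive. It is a hedged example rather than the article's pipeline: the thresholds, the function names `walk_plan` and `find_plan_issues`, and the sample plan are assumptions made up for this sketch. It relies only on the documented shape of PostgreSQL's JSON plan output (a one-element array whose `Plan` object nests child nodes under `Plans`). AST validation with `pglast` would run before this step and is not shown here.

```python
import json


def walk_plan(node):
    """Yield a plan node and, recursively, all of its child nodes."""
    yield node
    for child in node.get("Plans", []):
        yield from walk_plan(child)


def find_plan_issues(explain_json, max_total_cost=10_000.0, max_seq_scan_rows=100_000):
    """Return human-readable warnings for a plan produced by EXPLAIN (FORMAT JSON).

    Both thresholds are illustrative assumptions and would need tuning per database.
    """
    root = json.loads(explain_json)[0]["Plan"]
    issues = []
    if root.get("Total Cost", 0.0) > max_total_cost:
        issues.append(f"estimated total cost {root['Total Cost']} exceeds {max_total_cost}")
    for node in walk_plan(root):
        is_seq_scan = node.get("Node Type") == "Seq Scan"
        if is_seq_scan and node.get("Plan Rows", 0) > max_seq_scan_rows:
            relation = node.get("Relation Name", "<unknown>")
            issues.append(f"sequential scan on {relation} (~{node['Plan Rows']} rows)")
    return issues


# Hand-written sample plan for illustration only; not output from a real database.
SAMPLE_PLAN = json.dumps([{
    "Plan": {
        "Node Type": "Hash Join",
        "Total Cost": 25000.0,
        "Plan Rows": 500000,
        "Plans": [
            {"Node Type": "Seq Scan", "Relation Name": "order_items",
             "Total Cost": 18000.0, "Plan Rows": 1000000},
            {"Node Type": "Hash", "Total Cost": 30.0, "Plan Rows": 1000,
             "Plans": [
                 {"Node Type": "Seq Scan", "Relation Name": "products",
                  "Total Cost": 20.0, "Plan Rows": 1000},
             ]},
        ],
    }
}])

if __name__ == "__main__":
    for issue in find_plan_issues(SAMPLE_PLAN):
        print("WARNING:", issue)
```

Plain `EXPLAIN` only plans the query and does not execute it, which makes it suitable as a pre-execution gate for generated SQL. The cost figures are planner estimates, so a check like this is a heuristic filter, not a guarantee of runtime behavior.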
Topics
- NL-to-SQL Systems
- Semantic SQL Validation
- Large Language Models
- Graph Traversal
- Abstract Syntax Tree
Best for: AI Engineer, Machine Learning Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by Naturallanguageprocessing on Medium.