DecoSearch: Complexity-Aware Routing and Plan-Level Repair for Text-to-SQL

2026-06-16 · Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, quick

Summary

DecoSearch is a training-free framework that improves how Large Language Models (LLMs) translate natural language to SQL, particularly for complex queries that require multi-step, data-aware reasoning. A lightweight Schema Selector prunes the database schema, and an LLM Judger routes each question: simple questions go straight to SQL generation, while complex ones are escalated and decomposed into a Directed Acyclic Graph (DAG) of sub-questions. A RAG component grounds the decomposer, and a Topology Refiner repairs flawed reasoning plans. With a DeepSeek backbone, DecoSearch reaches 70.53% execution accuracy on BIRD and 88.31% on Spider while consuming significantly fewer tokens than competing methods. It also works as a model-agnostic wrapper around existing SQL generation backbones.

Key takeaway

For Machine Learning Engineers and NLP Engineers building Text-to-SQL systems, DecoSearch offers a training-free way to improve accuracy on complex queries. Consider using it as a model-agnostic wrapper around an existing fine-tuned SQL generation backbone; it may cut token consumption by an order of magnitude while achieving higher execution accuracy on benchmarks such as BIRD and Spider.

Key insights

DecoSearch improves Text-to-SQL accuracy for complex queries via complexity-aware routing and plan-level repair.

Principles

Route queries by complexity for optimal reasoning effort.
Decompose complex Text-to-SQL questions into sub-questions.
Repair flawed reasoning plans at the topology level.
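To make the DAG idea concrete, here is a minimal sketch of how a decomposed plan could be ordered and sanity-checked before any SQL is generated. The plan layout, the sub-question ids, and both helper names are illustrative assumptions, not DecoSearch's actual data structures. A cycle check is only one possible signal that a plan's topology is flawed.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical plan: each sub-question id maps to the ids it depends on.
plan = {
    "q1": set(),            # e.g. "Which schools are in the target county?"
    "q2": set(),            # e.g. "What is each school's average score?"
    "q3": {"q1", "q2"},     # combines the answers to q1 and q2
}

def execution_order(plan):
    """Return sub-question ids so that every dependency comes first."""
    return list(TopologicalSorter(plan).static_order())

def is_valid_plan(plan):
    """A plan containing a dependency cycle cannot be executed as a DAG."""
    try:
        execution_order(plan)
        return True
    except CycleError:
        return False
```

A plan that fails this check would be a candidate for plan-level repair rather than for regenerating individual SQL statements.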

Method

DecoSearch prunes the schema, judges query complexity, generates SQL directly for simple queries, decomposes complex queries into a DAG of sub-questions grounded with RAG, and refines the plan if execution fails.
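The control flow described above can be sketched as a single routing function. This is a hedged illustration under stated assumptions: every callable passed in (prune, judge, direct_sql, retrieve, decompose, compose_sql, execute, refine), the "simple" label, and the max_repairs parameter are hypothetical placeholders, not names from DecoSearch.

```python
def answer(question, schema, *, prune, judge, direct_sql, retrieve,
           decompose, compose_sql, execute, refine, max_repairs=1):
    """Route a question by complexity and repair the plan on failure.

    All callables are assumed stand-ins for the components in the summary:
    Schema Selector, LLM Judger, SQL generator, RAG retriever, decomposer,
    and Topology Refiner.
    """
    pruned = prune(question, schema)              # shrink the LLM context

    if judge(question, pruned) == "simple":       # cheap path
        return direct_sql(question, pruned)

    examples = retrieve(question)                 # ground the decomposer
    plan = decompose(question, pruned, examples)  # DAG of sub-questions

    for attempt in range(max_repairs + 1):
        sql = compose_sql(plan, pruned)
        ok, feedback = execute(sql)
        if ok or attempt == max_repairs:
            return sql                            # best effort once budget is spent
        plan = refine(plan, feedback)             # repair the plan, not just the SQL
```

Passing the components in as arguments keeps the sketch model-agnostic: any SQL generation backbone can be supplied as direct_sql or compose_sql without changing the routing logic.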

In practice

Implement schema pruning to reduce LLM context.
Break down complex natural language queries into atomic sub-questions.
Integrate RAG to provide relevant training examples for decomposition.
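As a rough starting point for the first and third items, the sketch below uses plain token overlap for both schema pruning and example retrieval. This lexical heuristic is a stand-in chosen for simplicity; the summary does not say how the Schema Selector or the RAG component score candidates. The function names, the schema layout (table name mapped to column names), and the example pool format are all assumptions.

```python
import re

def tokens(text):
    """Lowercased alphanumeric tokens; underscores act as separators."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def prune_schema(question, schema, keep=2):
    """Keep the `keep` tables whose names and columns best overlap the question."""
    q = tokens(question)

    def score(item):
        table, columns = item
        return len(q & tokens(table + " " + " ".join(columns)))

    ranked = sorted(schema.items(), key=score, reverse=True)
    return dict(ranked[:keep])

def retrieve_examples(question, pool, k=1):
    """Return the k pool entries whose questions are most similar (Jaccard)."""
    q = tokens(question)

    def jaccard(example):
        e = tokens(example["question"])
        union = q | e
        return len(q & e) / len(union) if union else 0.0

    return sorted(pool, key=jaccard, reverse=True)[:k]
```

In a real system an embedding-based similarity would likely replace the token overlap, but the interface (question in, smaller schema or a few examples out) can stay the same.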

Topics

Text-to-SQL
Large Language Models
Query Decomposition
Schema Pruning
Retrieval-Augmented Generation
Execution Accuracy

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.