To Describe or Not to Describe? Benchmarking Database Representations for Schema Linking in Text-to-SQL

Source: Paper Index on ACL Anthology · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Data Science & Analytics) · Depth: Expert, medium

Summary

A study presented at PROPOR 2026 by Daiane Ucceli Kreitlow and Hilário Tomaz Alves de Oliveira investigates schema linking for Text-to-SQL systems that answer questions in Brazilian Portuguese. It compares two ways of representing a database schema: natural-language descriptions generated by Large Language Models (LLMs), and representations derived from Data Definition Language (DDL) and Data Manipulation Language (DML) commands. Experiments were run on a Brazilian Portuguese adaptation of the Spider dataset, which includes over 200 databases, using several LLMs and embedding models and reporting Hit@k metrics. In these experiments, the LLM-generated natural-language descriptions consistently outperformed the DDL/DML-based representations for schema linking.

Key takeaway

AI engineers building Text-to-SQL systems, especially for non-English languages such as Brazilian Portuguese, should prioritize using Large Language Models to generate natural-language descriptions of their database schemas. In the study's experiments, this approach consistently outperformed DDL/DML-based representations in schema linking, yielding more accurate identification of the relevant databases, tables, and columns.
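To make the two representation strategies concrete, here is a minimal Python sketch that renders one toy table as a DDL string and builds a prompt that could be sent to an LLM to obtain a natural-language description. The `Column` and `Table` classes, the `to_ddl` and `description_prompt` helpers, the example table, and the prompt wording are all hypothetical illustrations; the paper's actual prompts and schema serialization are not given in this summary.

```python
from dataclasses import dataclass


@dataclass
class Column:
    name: str
    sql_type: str


@dataclass
class Table:
    name: str
    columns: list[Column]


def to_ddl(table: Table) -> str:
    """Render the table as a CREATE TABLE statement (DDL-style representation)."""
    cols = ",\n".join(f"    {c.name} {c.sql_type}" for c in table.columns)
    return f"CREATE TABLE {table.name} (\n{cols}\n);"


def description_prompt(table: Table) -> str:
    """Build a prompt asking an LLM for a natural-language description.

    The wording is hypothetical, not the prompt used in the study.
    """
    return (
        "Describe in plain language what the following table stores "
        "and what each column means.\n\n" + to_ddl(table)
    )


# Toy table, invented for illustration.
singer = Table(
    "singer",
    [Column("singer_id", "INTEGER"), Column("name", "TEXT"), Column("country", "TEXT")],
)

ddl_text = to_ddl(singer)                 # DDL-style representation
prompt_text = description_prompt(singer)  # input for an LLM of your choice
print(ddl_text)
```

In a pipeline of this kind, either the DDL text or the LLM's answer to the prompt would be stored as the schema's textual representation and compared against the user's question.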

Key insights

LLM-generated natural-language descriptions of a database schema support Text-to-SQL schema linking more effectively than DDL/DML-based representations.

Principles

Method

The study compared LLM-generated natural-language descriptions against DDL/DML-based representations for schema linking on a Brazilian Portuguese adaptation of the Spider dataset, scoring each representation with Hit@k metrics.
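As a rough illustration of how a Hit@k score can be computed for schema linking, the Python sketch below ranks candidate schemas by cosine similarity to a question vector and counts how often the gold schema lands in the top k. Cosine-similarity ranking, the function names, the schema ids, and the two-dimensional toy vectors are assumptions made for this example; the summary does not specify the retrieval setup or the exact Hit@k definition used in the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_schemas(question_vec, schema_vecs):
    """Return schema ids sorted by descending similarity to the question."""
    return sorted(
        schema_vecs,
        key=lambda sid: cosine(question_vec, schema_vecs[sid]),
        reverse=True,
    )


def hit_at_k(rankings, gold_ids, k):
    """Fraction of questions whose gold schema id is among the top-k ranked ids."""
    hits = sum(1 for ranked, gold in zip(rankings, gold_ids) if gold in ranked[:k])
    return hits / len(gold_ids)


# Toy embeddings standing in for the output of a real embedding model.
schema_vecs = {
    "concert_singer": [1.0, 0.0],
    "pets": [0.0, 1.0],
    "flights": [0.7, 0.7],
}
question_vecs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.5]]
gold_ids = ["concert_singer", "pets", "concert_singer"]

rankings = [rank_schemas(q, schema_vecs) for q in question_vecs]
print(hit_at_k(rankings, gold_ids, 1))  # gold schema ranked first for 2 of 3 questions
print(hit_at_k(rankings, gold_ids, 2))  # gold schema within the top 2 for all 3
```

Swapping the schema vectors between embeddings of DDL text and embeddings of natural-language descriptions, while keeping the questions fixed, would be one way to compare the two representations under the same metric.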

Best for: AI Engineer, Research Scientist, AI Scientist, NLP Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Paper Index on ACL Anthology.