Language Effects in Text-to-SQL Across English and Portuguese

2026-04-12 · Source: Paper Index on ACL Anthology · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Expert, quick

Summary

A study evaluated several Large Language Models (LLMs) on Text-to-SQL using the BIRD benchmark and its Portuguese translation, BIRD_PT. The researchers translated the questions and external knowledge into Portuguese while keeping the original English database schema and values. The evaluation compared four scenarios that vary the use of internal reasoning and guided reasoning during SQL generation. Accuracy consistently decreased when switching from English to Portuguese, and robustness to the language change varied considerably by model. Reasoning alone did not reliably improve execution accuracy and sometimes reduced performance in Portuguese. Combining reasoning with a guided plan gave the most stable improvements, although Portuguese results still trailed English. These findings underscore persistent challenges in multilingual Text-to-SQL.

Key takeaway

Research scientists developing multilingual Text-to-SQL systems should prioritize combining guided planning with reasoning mechanisms. Simply activating reasoning in an LLM may not improve performance in non-English languages such as Portuguese, and can even degrade it. Task-aligned planning helps mitigate the accuracy decrease observed when a system is adapted to a new language.

Key insights

Text-to-SQL accuracy drops consistently when questions move from English to Portuguese, even with advanced LLMs.
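
One way to quantify this drop for a single model is a paired comparison over the same questions in both languages. The sketch below is illustrative only: the function names and the per-question boolean flags are assumptions, and the toy values in the usage note are made up, not results from the study.

```python
def accuracy(flags):
    """Share of questions answered correctly; flags is a list of booleans."""
    return sum(flags) / len(flags) if flags else 0.0


def language_gap(en_flags, pt_flags):
    """Compare per-question correctness for the same questions in two languages.

    en_flags[i] and pt_flags[i] must refer to the same underlying question.
    """
    if len(en_flags) != len(pt_flags):
        raise ValueError("both lists must cover the same questions")
    pairs = list(zip(en_flags, pt_flags))
    return {
        "en_accuracy": accuracy(en_flags),
        "pt_accuracy": accuracy(pt_flags),
        "gap": accuracy(en_flags) - accuracy(pt_flags),
        # Questions solved in English but missed in Portuguese, and the reverse.
        "lost_in_pt": sum(e and not p for e, p in pairs),
        "gained_in_pt": sum(p and not e for e, p in pairs),
    }
```

Calling `language_gap([True, True, True, False], [True, False, True, False])` on these made-up flags reports a gap of 0.25 with one question lost in Portuguese. Tracking lost and gained questions separately, rather than the gap alone, shows whether a model is merely shifting errors around or losing ground outright.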

Principles

Language choice impacts Text-to-SQL accuracy.
Reasoning alone doesn't guarantee performance gains.
Guided planning stabilizes reasoning improvements.

Method

LLMs were evaluated on BIRD and BIRD_PT (questions and external knowledge translated into Portuguese; database schema and values kept in English) across four scenarios that vary internal reasoning and guided reasoning.
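
The summary reports execution accuracy, which scores a prediction by running it against the database and comparing its result with that of the reference query. A minimal sketch of that idea on SQLite follows. It is a simplified comparison and not necessarily the benchmark's official scorer: the order-insensitive multiset comparison, the function names, and the toy `schools` table are all assumptions made for illustration.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Return the result rows as a multiset, or None if the query fails."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return Counter(rows)


def execution_match(conn, predicted_sql, gold_sql):
    """True when both queries run and return the same rows, ignoring order."""
    gold = run_query(conn, gold_sql)
    pred = run_query(conn, predicted_sql)
    return gold is not None and pred == gold


def execution_accuracy(conn, pairs):
    """Fraction of (predicted_sql, gold_sql) pairs whose results match."""
    if not pairs:
        return 0.0
    hits = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return hits / len(pairs)


def build_toy_db():
    """Hypothetical in-memory database with English identifiers and values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE schools (name TEXT, city TEXT, enrollment INTEGER)")
    conn.executemany(
        "INSERT INTO schools VALUES (?, ?, ?)",
        [("Alpha", "Lisbon", 800), ("Beta", "Porto", 300), ("Gamma", "Lisbon", 650)],
    )
    conn.commit()
    return conn
```

A prediction that refers to a translated identifier, such as `SELECT nome FROM schools`, fails to execute against the English schema and counts as a miss. This is one plausible failure mode when the question is in Portuguese but the schema is not.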

In practice

Prioritize guided reasoning for multilingual Text-to-SQL.
Expect accuracy drops when localizing Text-to-SQL.
Jointly consider language, reasoning, and planning.
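
The recommendations above can be sketched as a small experiment grid. The code below assumes the four scenarios form a 2x2 combination of internal reasoning (on or off) and a guided plan (present or absent); the summary does not spell out the exact design, so treat this layout, the plan wording, and every name in the code as hypothetical.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical plan instructions; the study's actual wording is not given here.
PLAN_INSTRUCTIONS = (
    "Before writing SQL, follow this plan:\n"
    "1. Identify the tables and columns the question refers to.\n"
    "2. Map any Portuguese terms to the English schema names and values.\n"
    "3. Decide on joins, filters, and aggregations.\n"
    "4. Write the final query."
)


@dataclass(frozen=True)
class Scenario:
    reasoning: bool    # whether the model's internal reasoning is switched on
    guided_plan: bool  # whether the prompt includes the explicit plan


def all_scenarios():
    """Enumerate the assumed 2x2 grid of evaluation scenarios."""
    return [Scenario(r, g) for r, g in product([False, True], repeat=2)]


def build_request(schema, question, knowledge, scenario):
    """Assemble a prompt plus a reasoning flag for a hypothetical model client."""
    parts = ["Database schema (English identifiers and values):", schema]
    if knowledge:
        parts += ["External knowledge:", knowledge]
    parts += ["Question:", question]
    if scenario.guided_plan:
        parts.append(PLAN_INSTRUCTIONS)
    parts.append("Return a single SQLite query.")
    return {"prompt": "\n".join(parts), "reasoning": scenario.reasoning}
```

Running the same question set through every scenario in both languages keeps language, reasoning, and planning in one comparison, instead of tuning each factor in isolation.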

Topics

Text-to-SQL
Large Language Models
Multilingual NLP
BIRD benchmark
Portuguese Language

Best for: Research Scientist, AI Scientist, NLP Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Paper Index on ACL Anthology.