Language Effects in Text-to-SQL Across English and Portuguese
Summary
A study evaluated several large language models (LLMs) on Text-to-SQL using the BIRD benchmark and BIRD_PT, its Portuguese translation. The researchers translated the questions and external knowledge into Portuguese while keeping the original English database schema and values. The evaluation compared four scenarios that vary whether the model uses internal reasoning and whether SQL generation is guided by a plan. Accuracy consistently decreased when switching from English to Portuguese, and robustness to the switch varied considerably by model. Reasoning alone did not reliably improve execution accuracy and sometimes reduced it in Portuguese. Combining reasoning with a guided plan gave the most stable improvements, although Portuguese results still trailed English. These findings underscore persistent challenges in multilingual Text-to-SQL.
Key takeaway
Research scientists developing multilingual Text-to-SQL systems should prioritize combining guided planning with reasoning. Simply enabling reasoning in an LLM may not improve performance in non-English languages such as Portuguese, and can even degrade it. Focus on task-aligned planning to reduce the accuracy drop observed when adapting a system to a new language.
Key insights
Multilingual Text-to-SQL accuracy drops significantly from English to Portuguese, even with advanced LLMs.
Principles
- Language choice impacts Text-to-SQL accuracy.
- Reasoning alone doesn't guarantee performance gains.
- Guided planning stabilizes reasoning improvements.
Method
LLMs were evaluated on BIRD and BIRD_PT (questions and external knowledge translated into Portuguese, schema and values kept in English) across four scenarios that vary internal reasoning and guided reasoning, with execution accuracy as the metric.
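To make the metric concrete, here is a minimal sketch of execution accuracy: a predicted query counts as correct when it returns the same rows as the gold query on the same database. The function names, the toy `schools` table, and the order-insensitive set comparison are assumptions of this sketch, not the study's evaluation code.

```python
import sqlite3


def run_query(conn, sql):
    """Return the query's rows as a set, or None if the SQL fails to execute."""
    try:
        return set(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_accuracy(conn, pairs):
    """Fraction of (predicted_sql, gold_sql) pairs whose result sets match."""
    if not pairs:
        return 0.0
    hits = 0
    for predicted_sql, gold_sql in pairs:
        gold_rows = run_query(conn, gold_sql)
        predicted_rows = run_query(conn, predicted_sql)
        # A failing gold query never counts as a hit, even if both fail.
        if gold_rows is not None and predicted_rows == gold_rows:
            hits += 1
    return hits / len(pairs)


# Toy database (hypothetical data, for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE schools (name TEXT, county TEXT, enrollment INTEGER);
    INSERT INTO schools VALUES ('Alpha High', 'Alameda', 1200);
    INSERT INTO schools VALUES ('Beta High', 'Fresno', 800);
    """
)

gold = "SELECT name FROM schools WHERE enrollment > 1000"
pairs = [
    (gold, gold),                                                # identical query
    ("SELECT name FROM schools WHERE county = 'Alameda'", gold),  # different SQL, same rows
    ("SELECT nome FROM schools", gold),                          # invalid column, execution error
]
score = execution_accuracy(conn, pairs)
```

Because the comparison is on results rather than SQL text, the second pair counts as correct even though the queries differ. A real evaluation would run each pair against the database that belongs to the question.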
In practice
- Prioritize guided reasoning for multilingual Text-to-SQL.
- Expect accuracy drops when localizing Text-to-SQL.
- Jointly consider language, reasoning, and planning.
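As one way to treat language, reasoning, and planning as joint experimental factors, the sketch below builds a prompt from a Portuguese question, an English schema, and an optional step-by-step plan, then enumerates a 2x2 grid of settings. Reading the four scenarios as reasoning on/off crossed with plan on/off is an assumption here, and `build_prompt`, `scenarios`, the prompt wording, and the example question are all hypothetical.

```python
from itertools import product


def build_prompt(schema, question, evidence, plan=None):
    """Assemble a Text-to-SQL prompt; the plan section is added only when given."""
    parts = [
        "Database schema (English):",
        schema,
        "External knowledge:",
        evidence,
        "Question:",
        question,
    ]
    if plan:
        parts.append("Plan to follow:")
        parts.extend(f"{i}. {step}" for i, step in enumerate(plan, start=1))
    parts.append("Return a single SQLite query.")
    return "\n".join(parts)


def scenarios(plan):
    """Yield (label, reasoning_enabled, plan_or_None) for an assumed 2x2 grid."""
    for reasoning, guided in product([False, True], repeat=2):
        label = (
            f"reasoning={'on' if reasoning else 'off'}, "
            f"plan={'on' if guided else 'off'}"
        )
        yield label, reasoning, (plan if guided else None)


# Hypothetical example: Portuguese question and knowledge, English schema.
schema = "CREATE TABLE schools (name TEXT, county TEXT, enrollment INTEGER);"
question = "Quais escolas têm mais de 1000 alunos matriculados?"
evidence = "Alunos matriculados refere-se à coluna enrollment."
plan = [
    "Identify the table that stores enrollment.",
    "Filter rows where enrollment exceeds 1000.",
    "Select the school name.",
]

grid = [
    (label, reasoning, build_prompt(schema, question, evidence, plan=p))
    for label, reasoning, p in scenarios(plan)
]
```

The `reasoning` flag is only carried along in this sketch. In a real harness it would be passed to whatever model client is used, since how reasoning is switched on or off depends on the model.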
Topics
- Text-to-SQL
- Large Language Models
- Multilingual NLP
- BIRD benchmark
- Portuguese Language
Best for: Research Scientist, AI Scientist, NLP Engineer, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Paper Index on ACL Anthology.