IndicDB -- Benchmarking Multilingual Text-to-SQL Capabilities in Indian Languages
Summary
IndicDB is a new multilingual Text-to-SQL benchmark for evaluating cross-lingual semantic parsing across diverse Indic languages. It addresses a gap in existing benchmarks, which focus primarily on Western contexts and simplified schemas. Its relational schemas are derived from real-world administrative data on open-data platforms such as the National Data and Analytics Platform (NDAP) and the India Data Portal (IDP). IndicDB comprises 20 databases with 237 tables in total, an average of 11.85 tables per database, and join depths of up to six. An iterative three-agent framework (Architect, Auditor, Refiner) was used to convert denormalized government data into rich relational structures. The benchmark includes 15,617 tasks in English, Hindi, and five other Indic languages. Evaluations of models such as DeepSeek v3.2, MiniMax 2.7, LLaMA 3.3, and Qwen3 revealed a 9.00% performance drop from English to Indic languages. This drop, termed the "Indic Gap," is attributed to harder schema linking, increased structural ambiguity, and limited external knowledge.
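As a rough illustration of how an English-to-Indic gap of this kind could be computed from per-language scores, here is a minimal sketch. The summary does not say whether the 9.00% figure is an absolute or a relative drop, so the hypothetical helper `indic_gap` returns both. The scores and the `LanguageX` label are placeholders chosen for the arithmetic, not results from the benchmark.

```python
from statistics import mean


def indic_gap(accuracy_by_language, english_key="English"):
    """Return (absolute_drop, relative_drop) from English to the mean of all other languages."""
    english = accuracy_by_language[english_key]
    others = [acc for lang, acc in accuracy_by_language.items() if lang != english_key]
    if not others:
        raise ValueError("need at least one non-English language")
    absolute = english - mean(others)
    relative = absolute / english if english else float("nan")
    return absolute, relative


# Placeholder accuracies (assumption: NOT taken from the IndicDB paper).
scores = {"English": 0.60, "Hindi": 0.54, "LanguageX": 0.50}
absolute, relative = indic_gap(scores)
print(f"absolute drop: {absolute:.2%}, relative drop: {relative:.2%}")
```

Reporting both numbers avoids ambiguity when comparing against a published gap whose definition is not stated.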
Key takeaway
Research scientists developing or fine-tuning large language models for Text-to-SQL applications should evaluate their models against multilingual benchmarks such as IndicDB. The "Indic Gap" points to three challenges that need specific attention for non-Western languages: schema linking, structural ambiguity, and external knowledge. Addressing these areas should improve real-world performance and applicability in diverse linguistic contexts.
Key insights
IndicDB benchmarks multilingual Text-to-SQL for Indic languages, revealing a 9.00% performance drop from English to Indic languages.
Principles
- Real-world data increases benchmark complexity.
- Multilingual LLM performance varies by language.
- Schema linking is critical for Text-to-SQL.
Method
An iterative three-agent framework (Architect, Auditor, Refiner) converts denormalized government data into rich relational schemas, ensuring structural rigor and high relational density for Text-to-SQL benchmarks.
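The control flow of such a propose, audit, and revise loop can be sketched as follows. This is an assumption-heavy illustration: the summary does not describe the agents' internals, so the three agents are replaced by toy rule-based stand-ins, and every name here (`refine_schema`, `toy_architect`, `toy_auditor`, `toy_refiner`, the column names) is hypothetical.

```python
def refine_schema(flat_columns, architect, auditor, refiner, max_rounds=5):
    """Propose a schema, then audit and revise it until no issues remain or rounds run out."""
    schema = architect(flat_columns)
    for round_index in range(max_rounds):
        issues = auditor(schema)
        if not issues:
            return schema, round_index
        schema = refiner(schema, issues)
    return schema, max_rounds


# Toy stand-ins (assumption: the real agents are not rule-based functions like these).
def toy_architect(flat_columns):
    # First draft: keep the denormalized data as one wide table.
    return {"records": list(flat_columns)}


def toy_auditor(schema):
    # Flag repeated descriptive text columns that belong in a lookup table.
    return [col for col in schema["records"] if col.endswith("_name")]


def toy_refiner(schema, issues):
    # Fix one flagged column per round: move it to its own table and keep a foreign key.
    revised = {name: list(cols) for name, cols in schema.items()}
    column = issues[0]
    entity = column[: -len("_name")]
    revised[entity] = [f"{entity}_id", column]
    revised["records"] = [f"{entity}_id" if c == column else c for c in revised["records"]]
    return revised


flat = ["state_name", "district_name", "year", "population"]
final_schema, rounds = refine_schema(flat, toy_architect, toy_auditor, toy_refiner)
print(rounds, final_schema)
```

Separating the auditor from the refiner keeps the stopping condition explicit: the loop ends when an audit returns no issues, or when the round budget is exhausted.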
In practice
- Evaluate LLMs on IndicDB for Text-to-SQL.
- Focus on schema linking for Indic languages.
- Consider data complexity in benchmark design.
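One common way to score Text-to-SQL output is execution matching: run the predicted query and the reference query against the same database and compare the returned rows. The sketch below uses Python's standard `sqlite3` module. The summary does not state which metric or database engine IndicDB uses, so this is an assumption, and the toy tables, queries, and language labels are invented for illustration.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Return the result rows as a multiset, or None if the query fails."""
    try:
        return Counter(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_match(conn, predicted_sql, gold_sql):
    """True when both queries run and return the same rows, ignoring row order."""
    gold = run_query(conn, gold_sql)
    predicted = run_query(conn, predicted_sql)
    return gold is not None and predicted == gold


# Toy database (assumption: not an IndicDB schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE district (district_id INTEGER PRIMARY KEY, district_name TEXT);
CREATE TABLE census (district_id INTEGER, year INTEGER, population INTEGER);
INSERT INTO district VALUES (1, 'A'), (2, 'B');
INSERT INTO census VALUES (1, 2011, 100), (2, 2011, 250);
""")

gold_sql = (
    "SELECT d.district_name FROM district d "
    "JOIN census c ON c.district_id = d.district_id WHERE c.population > 200"
)
# Made-up model outputs for the same question asked in two languages.
predictions = {
    "English": "SELECT district_name FROM district WHERE district_id IN "
               "(SELECT district_id FROM census WHERE population > 200)",
    "Hindi": "SELECT district_name FROM district",  # drops the population filter
}
results = {lang: execution_match(conn, sql, gold_sql) for lang, sql in predictions.items()}
print(results)
```

Comparing rows as a multiset accepts differently written but equivalent queries, at the cost of ignoring ordering, so queries with `ORDER BY` would need a stricter comparison.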
Topics
- Multilingual Text-to-SQL
- Indic Languages
- LLM Benchmarking
- Semantic Parsing
- Relational Databases
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.