IndicDB -- Benchmarking Multilingual Text-to-SQL Capabilities in Indian Languages

Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, quick

Summary

IndicDB is a new multilingual Text-to-SQL benchmark for evaluating cross-lingual semantic parsing across diverse Indic languages. It addresses a gap in existing benchmarks, which focus primarily on Western contexts and simplified schemas. Its relational schemas are derived from real-world administrative data published on open-data platforms such as the National Data and Analytics Platform (NDAP) and the India Data Portal (IDP). IndicDB comprises 237 tables across 20 databases, an average of 11.85 tables per database, with join depths of up to six. An iterative three-agent framework (Architect, Auditor, Refiner) was used to convert denormalized government data into rich relational structures. The benchmark includes 15,617 tasks in English, Hindi, and five other Indic languages. Evaluations of models such as DeepSeek v3.2, MiniMax 2.7, LLaMA 3.3, and Qwen3 revealed a 9.00% performance drop from English to Indic languages, termed the "Indic Gap," which is attributed to harder schema linking, increased structural ambiguity, and limited external knowledge.

Key takeaway

Research scientists developing or fine-tuning Large Language Models for Text-to-SQL applications should prioritize evaluating their models against multilingual benchmarks like IndicDB. The "Indic Gap" shows that schema linking, structural ambiguity, and external knowledge need specific attention for non-Western languages. Addressing these three areas should improve real-world performance and applicability in diverse linguistic contexts.
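A per-language evaluation of this kind could be sketched as follows. This is a minimal illustration under stated assumptions, not IndicDB's evaluation harness: the summary does not name the metric, so the sketch assumes execution accuracy (a predicted query counts as correct when it returns the same rows as the gold query), and it assumes the gap is an absolute difference between English accuracy and mean accuracy over the other languages. The toy `district` table, the language codes, and all function names are hypothetical.

```python
import sqlite3
from collections import defaultdict


def execution_match(conn, predicted_sql, gold_sql):
    """Return True when both queries run and yield the same multiset of rows."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # an unexecutable prediction counts as a miss
    gold_rows = conn.execute(gold_sql).fetchall()
    # Sort by repr so row order is ignored and mixed types (e.g. NULLs) compare safely.
    return sorted(predicted_rows, key=repr) == sorted(gold_rows, key=repr)


def accuracy_by_language(conn, tasks):
    """tasks: iterable of (language, predicted_sql, gold_sql) tuples."""
    hits, totals = defaultdict(int), defaultdict(int)
    for language, predicted_sql, gold_sql in tasks:
        totals[language] += 1
        hits[language] += execution_match(conn, predicted_sql, gold_sql)
    return {language: hits[language] / totals[language] for language in totals}


def english_to_indic_gap(accuracy, english="en"):
    """Assumed definition: English accuracy minus mean accuracy of the other languages."""
    others = [acc for language, acc in accuracy.items() if language != english]
    return accuracy[english] - sum(others) / len(others)


# Hypothetical toy database and model outputs, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE district (id INTEGER PRIMARY KEY, name TEXT, state TEXT);
INSERT INTO district VALUES
    (1, 'Pune', 'Maharashtra'), (2, 'Nagpur', 'Maharashtra'), (3, 'Patna', 'Bihar');
""")
gold = "SELECT COUNT(*) FROM district WHERE state = 'Maharashtra'"
tasks = [
    ("en", "SELECT COUNT(id) FROM district WHERE state = 'Maharashtra'", gold),
    ("en", "SELECT COUNT(*) FROM district WHERE state = 'Maharashtra'", gold),
    ("hi", "SELECT COUNT(*) FROM district WHERE state = 'Maharashtra'", gold),
    ("hi", "SELECT COUNT(*) FROM district WHERE state = 'Bihar'", gold),  # wrong filter
]
accuracy = accuracy_by_language(conn, tasks)
gap = english_to_indic_gap(accuracy)
print(accuracy, gap)
```

Reporting accuracy per language rather than as one pooled number keeps a strong English score from masking weaker results on the Indic subsets.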

Key insights

IndicDB benchmarks multilingual Text-to-SQL for Indic languages and reveals a 9.00% performance drop from English to Indic languages across the evaluated models.

Method

An iterative three-agent framework (Architect, Auditor, Refiner) converts denormalized government data into rich relational schemas, ensuring structural rigor and high relational density for Text-to-SQL benchmarks.
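The propose, audit, and refine loop described above can be sketched roughly as below. This is only an illustration of the control flow: the summary does not say how the Architect, Auditor, and Refiner actually work (they are presumably LLM-driven), so each role is replaced here by a simple deterministic stand-in. The `Schema` class, the function names, the `fact` table, the `max_rounds` limit, and the example column names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Schema:
    tables: dict  # table name -> list of column names
    foreign_keys: list = field(default_factory=list)  # (child_table, column, parent_table)


def architect(flat_columns, entity_columns):
    """Stand-in Architect: split entity columns out of the flat table into their own tables."""
    tables = {"fact": [c for c in flat_columns if c not in entity_columns]}
    for column in entity_columns:
        tables[column] = [f"{column}_id", f"{column}_name"]
    return Schema(tables=tables)


def auditor(schema):
    """Stand-in Auditor: list entity tables that the fact table cannot join to."""
    linked = {parent for _, _, parent in schema.foreign_keys}
    return [t for t in schema.tables if t != "fact" and t not in linked]


def refiner(schema, issues):
    """Stand-in Refiner: add a foreign-key column and constraint for each reported table."""
    for table in issues:
        schema.tables["fact"].append(f"{table}_id")
        schema.foreign_keys.append(("fact", f"{table}_id", table))
    return schema


def normalize(flat_columns, entity_columns, max_rounds=3):
    """Iterate audit and refinement until the audit is clean or the round limit is hit."""
    schema = architect(flat_columns, entity_columns)
    for _ in range(max_rounds):
        issues = auditor(schema)
        if not issues:
            break
        schema = refiner(schema, issues)
    return schema


# A denormalized table with repeated state and district names, for illustration only.
schema = normalize(["state", "district", "year", "literacy_rate"], ["state", "district"])
print(schema.tables)
print(schema.foreign_keys)
```

Separating the audit step from the proposal step means each round yields an explicit list of structural issues, which gives the loop a clear stopping condition.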

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.