Evaluating Cross-lingual Knowledge Consistency in Code-Mixed vis-a-vis Indian Languages using IndicKLAR

· Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

IndicKLAR, an Indic extension of the KLAR-CLC benchmark, evaluates cross-lingual knowledge consistency in large language models across 18 of the 22 scheduled Indian languages and 11 code-mixed variants. The benchmark reveals a knowledge recall gap of up to ~0.50 between native Indian languages and English. Code-mixed inputs close most of this gap, bringing performance within ~0.05 of English without any model-level intervention. The study also explores prompting strategies for language conversion: two-stage translate-then-answer, one-stage joint translation-and-answer, and Translate-in-Thought (TinT). It identifies a consistent "flip point" at which predictions switch from incorrect to correct. This point lies between the native and code-mixed settings, regardless of whether the language shift is applied to the input or performed internally by the model.

Key takeaway

For NLP engineers deploying large language models in multilingual contexts, particularly with Indian languages, code-mixed inputs deserve priority. In this benchmark they narrow the knowledge recall gap relative to English from up to ~0.50 to within ~0.05, without model-level intervention such as fine-tuning. Prompting strategies such as Translate-in-Thought (TinT), which the study also evaluates, may be worth testing in low-resource language applications.

Key insights

Code-mixed inputs substantially improve large language model knowledge recall for Indian languages, bringing it close to English performance.
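To make the "gap to English" figures concrete, here is a minimal Python sketch of how such a gap could be computed. It assumes recall is measured as case-insensitive exact-match accuracy, which may differ from the metric the benchmark actually uses. The function names and the `"english"` key are hypothetical and not taken from the paper.

```python
from typing import Dict, List


def recall_accuracy(predictions: List[str], gold: List[str]) -> float:
    """Fraction of predictions that match the gold answer.

    Assumption: case-insensitive exact match after trimming whitespace.
    """
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("need at least one example")
    hits = sum(
        p.strip().lower() == g.strip().lower()
        for p, g in zip(predictions, gold)
    )
    return hits / len(gold)


def gap_to_english(
    accuracy_by_setting: Dict[str, float],
    english_key: str = "english",  # hypothetical label for the English setting
) -> Dict[str, float]:
    """English accuracy minus the accuracy of every other setting."""
    base = accuracy_by_setting[english_key]
    return {
        name: base - acc
        for name, acc in accuracy_by_setting.items()
        if name != english_key
    }
```

Under this reading, a gap of ~0.50 for a native-language setting means its accuracy sits about 0.50 below English on the same facts, and a gap of ~0.05 for a code-mixed setting means it is nearly level.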

Principles

Method

The study evaluates three prompting strategies for language conversion: two-stage translate-then-answer, one-stage joint translation-and-answer, and Translate-in-Thought (TinT).
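The three strategies differ mainly in how many model calls they make and where the translation happens. The Python sketch below illustrates that difference only. The prompt wording is invented for illustration and is not the paper's. The `LLM` callable is a hypothetical stand-in for whatever model client is in use, and the TinT prompt reflects one plausible reading of the name (translate within the reasoning, output only the answer).

```python
from typing import Callable

# Hypothetical interface: takes a prompt string, returns the model's text output.
LLM = Callable[[str], str]


def two_stage(question: str, llm: LLM) -> str:
    """Translate-then-answer: two separate model calls."""
    translation = llm(
        "Translate the following question into English. "
        "Return only the translation.\n\n" + question
    )
    return llm(
        "Answer the following question concisely.\n\n" + translation.strip()
    )


def one_stage(question: str, llm: LLM) -> str:
    """Joint translation-and-answer: one call, translation visible in the output."""
    return llm(
        "First translate the following question into English, "
        "then answer it. Format:\nTranslation: ...\nAnswer: ...\n\n" + question
    )


def translate_in_thought(question: str, llm: LLM) -> str:
    """TinT (assumed reading): one call, translation kept in the reasoning."""
    return llm(
        "Think step by step. In your reasoning, restate the question in "
        "English before solving it. Output only the final answer.\n\n"
        + question
    )
```

Passing the model as a plain callable keeps the sketch independent of any particular client library, so each strategy can be compared against the same questions.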

Best for: Research Scientist, AI Scientist, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.