Location Not Found: Exposing Implicit Local and Global Biases in Multilingual LLMs

2026-04-21 · Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

LocQA is a new test set for quantifying implicit inter- and intra-lingual biases in multilingual large language models (LLMs). It comprises 2,156 locale-ambiguous questions across 12 languages, covering facts such as laws, dates, and measurements, with no explicit locale indicators in the question. Evaluating 32 LLMs on LocQA, the researchers identified two structural biases. Inter-lingually, models show a global bias toward US-locale answers even when queried in non-English languages, and this bias is stronger in instruction-tuned models than in their base versions. Intra-lingually, when several locales are plausible for a single language, models favor the locales with larger populations, effectively acting as demographic probability engines. The findings give developers a basis for shaping how models handle locale and for assessing how each training phase affects bias.

Key takeaway

For research scientists and engineers developing multilingual LLMs, understanding these implicit biases is crucial. Your models likely default to US-centric answers and prioritize locales by population size, especially if they are instruction-tuned. Integrate locale-ambiguous test sets such as LocQA into your evaluation pipelines to detect and mitigate these structural biases, so that responses are more culturally neutral and accurate across diverse linguistic contexts.

Key insights

Multilingual LLMs exhibit implicit US-centric and demographic biases when answering locale-ambiguous questions.

Principles

Instruction tuning exacerbates global biases.
LLMs act as demographic probability engines.

Method

LocQA is a test set of 2,156 locale-ambiguous questions in 12 languages, designed to expose implicit LLM biases related to laws, dates, and measurements by observing responses to queries lacking explicit locale indicators.
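The measurement idea can be sketched as a simple tally: for each locale-ambiguous question, check which locale's reference answer the model's reply matches, then report the share of answers per locale. The snippet below is a minimal illustration under assumptions, not the LocQA authors' code or data format. The record fields (`model_answer`, `locale_answers`), the substring matching rule, and the two example records are all hypothetical.

```python
from collections import Counter


def locale_of_answer(answer, locale_answers):
    """Return the single locale whose reference answer appears in `answer`, else None.

    `locale_answers` maps a locale code to its reference answer string.
    Substring matching is a deliberately naive, assumed matching rule.
    """
    normalized = answer.casefold()
    matches = [
        locale
        for locale, reference in locale_answers.items()
        if reference.casefold() in normalized
    ]
    # Answers matching zero or several locales are left unattributed, not guessed.
    return matches[0] if len(matches) == 1 else None


def locale_distribution(records):
    """Share of answers attributed to each locale (plus an 'unattributed' bucket)."""
    counts = Counter()
    for record in records:
        locale = locale_of_answer(record["model_answer"], record["locale_answers"])
        counts[locale or "unattributed"] += 1
    total = sum(counts.values())
    return {locale: n / total for locale, n in counts.items()}


# Invented records for illustration only; these are not LocQA items.
example_records = [
    {
        "model_answer": "Die Notrufnummer ist 911.",
        "locale_answers": {"DE": "112", "US": "911"},
    },
    {
        "model_answer": "Das Standardformat ist A4.",
        "locale_answers": {"DE": "A4", "US": "Letter"},
    },
]

print(locale_distribution(example_records))
```

Substring matching breaks down when one reference answer is contained in another (for example "4 July" inside "14 July"), so a real pipeline would likely need stricter answer normalization.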

In practice

Evaluate LLMs for US-locale bias.
Assess instruction tuning's bias impact.
Consider demographic priors in LLM outputs.
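The three checks above could be turned into numbers along the following lines, given a per-locale answer distribution for each model. This is a rough sketch under assumptions: the function names are hypothetical, total variation distance is our choice of comparison rather than a metric confirmed by the source, and all figures in the usage lines are invented placeholders.

```python
def us_share(distribution):
    """Fraction of answers attributed to the US locale (0.0 if absent)."""
    return distribution.get("US", 0.0)


def instruction_tuning_shift(base_distribution, tuned_distribution):
    """Change in US share from base to instruction-tuned model.

    A positive value means the tuned model picks US-locale answers more often.
    """
    return us_share(tuned_distribution) - us_share(base_distribution)


def population_prior(populations):
    """Normalize raw population counts into a probability distribution."""
    total = sum(populations.values())
    return {locale: count / total for locale, count in populations.items()}


def total_variation(p, q):
    """Total variation distance between two distributions over locales.

    0.0 means identical; values near 0 suggest the model's locale choices
    track the reference distribution closely.
    """
    locales = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in locales)


# Placeholder numbers for illustration only; not results from the paper.
base = {"US": 0.40, "DE": 0.60}
tuned = {"US": 0.55, "DE": 0.45}
print(instruction_tuning_shift(base, tuned))

prior = population_prior({"locale_a": 80, "locale_b": 20})
observed = {"locale_a": 0.75, "locale_b": 0.25}
print(total_variation(observed, prior))
```

Comparing the same question set across a base model and its instruction-tuned variant isolates the effect of the tuning phase, while the distance to a population prior gives one way to test the "demographic probability engine" reading for languages spoken in several locales.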

Topics

Multilingual LLMs
Implicit Bias
LocQA Benchmark
Inter-lingual Bias
Intra-lingual Bias

Best for: Research Scientist, CTO, VP of Engineering/Data, AI Scientist, NLP Engineer, AI Ethicist

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.