AI’s fluency in other languages hides a Western worldview that can mislead users − a scholar of Indonesian society explains
Summary
Large language models (LLMs) like ChatGPT, Claude, and Gemini are fluent in many languages, yet they exhibit "epistemological persistence": they retain a Western worldview rooted in American cultural assumptions. The phenomenon, identified in research published in the International Review of Modern Sociology, arises because LLMs are trained predominantly on English-language data. LLaMA 2's training data was 89.7% English, and LLaMA 3's about 95%. Even when prompted in other languages, LLMs often conduct their core reasoning in English and then translate the output, creating an illusion of local understanding. In experiments with Indonesian concepts such as "pendidikan" (education) and "malu" (social awareness/shame), AI responses consistently prioritized individual autonomy and psychological framing over the collective, relational, and ethical dimensions emphasized in Indonesian traditions. The bias persists largely because region-specific models are expensive to develop, while translation-based approaches are cheaper.
Key takeaway
If you are an AI product manager building global applications, recognize that multilingual LLMs often embed a Western worldview even when they are fluent in local languages. Your systems may deliver culturally misaligned advice or information, particularly in sensitive areas such as family dynamics or education. Implement rigorous cultural validation testing that goes beyond linguistic accuracy, so that your product does not unintentionally propagate foreign cultural norms and gives users appropriate guidance.
Key insights
LLMs exhibit "epistemological persistence": because their training data is English-centric, they retain a Western worldview despite their multilingual fluency.
Principles
- Fluency does not equate to cultural understanding.
- Training data shapes underlying worldview and reasoning.
- Cost drives reliance on translation over localized models.
Method
Experiments involved asking ChatGPT, Claude, and Gemini questions in English and Indonesian about concepts like education, responsibility, well-being, and untranslatable Indonesian terms, then analyzing responses for cultural alignment.
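The paired-prompt procedure described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the researchers' actual tooling: `ask_model` is a hypothetical stand-in for whatever client calls a given chatbot, the model names are placeholders, and the prompt wording is an assumed example rather than the wording used in the study.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class PromptPair:
    """One concept asked in both languages (wording is illustrative)."""
    concept: str
    english: str
    indonesian: str


def collect_responses(
    pairs: List[PromptPair],
    models: List[str],
    ask_model: Callable[[str, str], str],
) -> List[Dict[str, str]]:
    """Ask every model every prompt in both languages.

    `ask_model(model, prompt)` is a hypothetical callable supplied by the
    caller; it is not part of any real vendor SDK.
    """
    rows = []
    for model in models:
        for pair in pairs:
            for language, prompt in (("en", pair.english), ("id", pair.indonesian)):
                rows.append({
                    "model": model,
                    "concept": pair.concept,
                    "language": language,
                    "prompt": prompt,
                    "response": ask_model(model, prompt),
                })
    return rows


def fake_ask_model(model: str, prompt: str) -> str:
    """Placeholder so the sketch runs without network access."""
    return f"[{model}] placeholder answer to: {prompt}"


PAIRS = [
    PromptPair("pendidikan", "What is education?", "Apa itu pendidikan?"),
    PromptPair("malu", "What does 'malu' mean?", "Apa arti malu?"),
]

rows = collect_responses(PAIRS, ["model-a", "model-b"], fake_ask_model)
```

Keeping the English and Indonesian answers side by side in one table makes it easier for a reviewer to compare how the same concept is framed in each language.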
In practice
- Test AI responses for cultural bias in critical applications.
- Be aware of hidden Western assumptions in multilingual AI.
- Prioritize localized models for culturally sensitive tasks.
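As one possible starting point for the testing recommended above, the sketch below tallies individual-centered versus relational vocabulary in a response and flags lopsided answers for human review. The term lists, the function names, and the flagging rule are all assumptions made for illustration; they are not a validated lexicon and not the method used in the original research.

```python
import re
from typing import Dict, Iterable

# Illustrative term lists only (assumptions, not a validated lexicon).
INDIVIDUAL_TERMS = ("autonomy", "independence", "self-esteem", "personal growth")
RELATIONAL_TERMS = ("family", "community", "obligation", "respect", "harmony")


def count_terms(text: str, terms: Iterable[str]) -> int:
    """Count whole-word, case-insensitive occurrences of each term."""
    lowered = text.lower()
    return sum(
        len(re.findall(r"\b" + re.escape(term) + r"\b", lowered))
        for term in terms
    )


def framing_report(response: str) -> Dict[str, object]:
    """Tally both framings; flag the response when individual terms dominate."""
    individual = count_terms(response, INDIVIDUAL_TERMS)
    relational = count_terms(response, RELATIONAL_TERMS)
    return {
        "individual": individual,
        "relational": relational,
        "needs_review": individual > relational,
    }


individual_sample = (
    "Education builds autonomy and independence "
    "so you can pursue personal growth."
)
relational_sample = "Education is an obligation to family and community."

individual_report = framing_report(individual_sample)
relational_report = framing_report(relational_sample)
```

A word count like this is only a crude screen. It can route suspicious responses to reviewers who know the local culture, but it cannot judge cultural alignment on its own.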
Topics
- Large Language Models
- Cultural Bias
- Epistemological Persistence
- Multilingual AI
- Training Data
Best for: NLP Engineer, AI Product Manager, AI Scientist, AI Ethicist, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial intelligence (AI) – The Conversation.