ShapleyLaw: A Game-Theoretic Approach to Multilingual Scaling Laws

2026-03-18 · Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, quick

Summary

ShapleyLaw is a game-theoretic approach to multilingual scaling laws. It targets a limitation of existing methods, which do not quantify cross-lingual transfer effects in pretraining. The model treats multilingual pretraining as a cooperative game in which each language is a player contributing to a joint reduction in test loss, and it uses cooperative game theory to quantify the cross-lingual transfer from each language according to that contribution. The method aims to predict test loss more accurately under varying language mixture ratios and, from those predictions, to estimate optimal ratios for pretraining data. Experimental results indicate that ShapleyLaw outperforms baseline methods in both model performance prediction and language mixture optimization.

Key takeaway

For AI Scientists and Research Scientists developing multilingual models, ShapleyLaw offers a way to improve pretraining efficiency. By quantifying cross-lingual transfer, you can optimize the language mixture ratios in your datasets, which can lead to better model performance and more efficient resource allocation. Consider integrating game-theoretic approaches like ShapleyLaw into your data preparation and model training pipelines when tuning multilingual mixtures.

Key insights

ShapleyLaw uses game theory to quantify cross-lingual transfer for optimizing multilingual model pretraining.

Principles

Cross-lingual transfer is a quantifiable contribution.
Multilingual pretraining is a cooperative game.

Method

ShapleyLaw models multilingual pretraining as a cooperative game, quantifying each language's cross-lingual transfer contribution to test loss reduction using game theory principles.
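
As a rough illustration of the cooperative-game framing, the sketch below computes exact Shapley values, the standard way cooperative game theory credits each player for a joint payoff. It is not the paper's implementation: the summary does not give ShapleyLaw's formulation, so the function names, the three languages, and the loss-reduction numbers are all made-up assumptions.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a cooperative game.

    players: list of hashable player names (here, languages).
    value:   function mapping a frozenset of players to a number
             (here, a hypothetical reduction in test loss).
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that p joins.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p to coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


# HYPOTHETICAL numbers: test-loss reduction achieved by pretraining
# on each subset of languages. Not taken from the paper.
TOY_LOSS_REDUCTION = {
    frozenset(): 0.0,
    frozenset({"en"}): 0.50,
    frozenset({"de"}): 0.30,
    frozenset({"sw"}): 0.10,
    frozenset({"en", "de"}): 0.90,
    frozenset({"en", "sw"}): 0.65,
    frozenset({"de", "sw"}): 0.45,
    frozenset({"en", "de", "sw"}): 1.10,
}


def toy_value(coalition):
    return TOY_LOSS_REDUCTION[frozenset(coalition)]


phi = shapley_values(["en", "de", "sw"], toy_value)
for lang, contribution in phi.items():
    print(f"{lang}: {contribution:.3f}")
```

The per-language values always sum to the loss reduction of the full language set, so the joint gain is fully divided among the languages. Exact computation enumerates every coalition, which is only practical for a handful of languages.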

In practice

Optimize language mixture ratios for pretraining.
Improve multilingual model performance prediction.
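
To make the first item concrete, here is a minimal sketch of how a test-loss predictor could drive mixture selection: enumerate candidate ratios on a grid and keep the one with the lowest predicted loss. The predictor here is a toy stand-in, not ShapleyLaw's fitted law, and every name and number in it is an assumption for illustration.

```python
from itertools import product


def simplex_grid(n_langs, steps):
    """Yield mixture ratios on a 1/steps grid whose entries sum to 1."""
    for counts in product(range(steps + 1), repeat=n_langs):
        if sum(counts) == steps:
            yield tuple(c / steps for c in counts)


def best_mixture(predict_loss, n_langs, steps=10):
    """Return the grid point with the lowest predicted test loss."""
    return min(simplex_grid(n_langs, steps), key=predict_loss)


def toy_predict_loss(ratios):
    # HYPOTHETICAL predictor: a quadratic bowl with its minimum at
    # (0.5, 0.3, 0.2). A real scaling law would be fitted to
    # measured pretraining runs instead.
    target = (0.5, 0.3, 0.2)
    return sum((r - t) ** 2 for r, t in zip(ratios, target))


best = best_mixture(toy_predict_loss, n_langs=3, steps=10)
print(best)
```

Grid search is used only because it is easy to read. The number of grid points grows quickly with the number of languages, so a larger setup would need a proper constrained optimizer.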

Topics

Multilingual Scaling Laws
Game Theory
Cross-lingual Transfer
Language Mixture Optimization
Pretraining Data

Best for: AI Scientist, Research Scientist, AI Researcher, NLP Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.