LLM Parameters for Math Across Languages: Shared or Separate?

2026-01-26 · Source: cs.CL updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, extended

Summary

This study investigates whether the parameters that support mathematical reasoning in multilingual Large Language Models (LLMs) are shared across languages or language-specific. The researchers analyzed Llama 1B, Qwen3 4B, and Llama 8B across English, German, French, and Hindi, using the MathNeurosurgery framework to locate math-associated parameters and Jaccard similarity to compare the resulting sets. They found that these parameters overlap only partially across languages, with the overlap concentrated in intermediate model layers. English consistently had the largest set of math-relevant parameters, in line with its stronger reasoning performance, while lower-resource languages had smaller sets. The findings suggest that math capabilities are neither fully language-invariant nor entirely language-specific: partial sharing coexists with systematic language-dependent differences. Intervention experiments confirmed the collective influence of these parameters, with scaling primarily correcting arithmetic errors.

Key takeaway

AI Scientists and Machine Learning Engineers developing multilingual LLMs should treat mathematical reasoning parameters as language-dependent. English-centric models may not generalize efficiently to other languages, especially those written in a different script, such as Hindi. Focus on optimizing intermediate model layers for cross-lingual math capabilities, and explore targeted parameter interventions, such as pruning, to refine output formatting and in-context learning for specific language tasks.

Key insights

Multilingual LLMs exhibit partial, layer-dependent parameter overlap for math, with English dominating.

Principles

Math-specific parameters show partial cross-lingual overlap.
Overlap is strongest in intermediate LLM layers.
English-centric pathways often dominate multilingual reasoning.

Method

Identify math-specific parameters with the MathNeurosurgery framework, which compares weight-activation products computed on math and non-math datasets. Then measure cross-lingual overlap between the per-language parameter sets with the Jaccard coefficient, |A ∩ B| / |A ∪ B|.
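The two steps of the method can be illustrated with a minimal Python sketch. It is not the authors' code: the scoring rule (absolute weight times a per-input activation norm), the top-fraction cutoff, and the set-difference rule for "math-specific" are assumptions made for illustration, and every function name here is hypothetical. Only the Jaccard coefficient is used exactly as defined.

```python
from typing import Dict, List, Set, Tuple

Index = Tuple[int, int]  # (row, column) position inside one weight matrix


def importance_scores(weights: List[List[float]],
                      activation_norms: List[float]) -> Dict[Index, float]:
    """Assumed scoring rule: |w_ij| times the activation norm of input j."""
    return {
        (i, j): abs(w) * activation_norms[j]
        for i, row in enumerate(weights)
        for j, w in enumerate(row)
    }


def top_fraction(scores: Dict[Index, float], fraction: float) -> Set[Index]:
    """Keep the highest-scoring fraction of positions (at least one)."""
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def math_specific(weights: List[List[float]],
                  math_norms: List[float],
                  general_norms: List[float],
                  fraction: float) -> Set[Index]:
    """Assumed rule: important on math data but not on non-math data."""
    math_top = top_fraction(importance_scores(weights, math_norms), fraction)
    general_top = top_fraction(importance_scores(weights, general_norms), fraction)
    return math_top - general_top


def jaccard(a: Set[Index], b: Set[Index]) -> float:
    """Jaccard coefficient |A ∩ B| / |A ∪ B|; defined as 0.0 for two empty sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Toy per-language parameter sets (made-up indices, not results from the study).
english = {(0, 0), (0, 1), (1, 1)}
german = {(0, 1), (1, 1), (2, 0)}
print(jaccard(english, german))  # 0.5: two shared positions out of four in total
```

Running the same comparison separately for each layer would give the kind of layer-wise overlap profile the summary describes, under the same assumptions.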

In practice

Focus optimization efforts on intermediate layers for multilingual math.
Consider language-specific parameter tuning for lower-resource languages.
Investigate pruning for output formatting improvements in specific tasks.
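Pruning and scaling interventions of the kind mentioned above both amount to multiplying a selected set of parameters by a constant factor. The sketch below is a hedged illustration in plain Python, not the procedure used in the study: the function name, the nested-list weight representation, and the example factors are all hypothetical.

```python
from typing import List, Set, Tuple

Index = Tuple[int, int]  # (row, column) position inside one weight matrix


def scale_parameters(weights: List[List[float]],
                     selected: Set[Index],
                     factor: float) -> List[List[float]]:
    """Return a copy of `weights` with the selected positions multiplied by `factor`.

    factor = 0.0 prunes the selected parameters; factor > 1.0 amplifies them.
    The input matrix is left unchanged so interventions can be compared side by side.
    """
    return [
        [w * factor if (i, j) in selected else w for j, w in enumerate(row)]
        for i, row in enumerate(weights)
    ]


weights = [[1.0, 2.0], [3.0, 4.0]]
math_positions = {(0, 1)}  # hypothetical math-associated position

pruned = scale_parameters(weights, math_positions, 0.0)   # [[1.0, 0.0], [3.0, 4.0]]
boosted = scale_parameters(weights, math_positions, 1.5)  # [[1.0, 3.0], [3.0, 4.0]]
```

Returning a copy rather than editing in place keeps the baseline model available for a before-and-after comparison on the same prompts.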

Topics

Multilingual LLMs
Mathematical Reasoning
Parameter Localization
Cross-lingual Transfer
Model Interpretability
Weight Pruning
Jaccard Similarity

Code references

luisavictor/math-across-languages

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.