mAceReason-Math: A Dataset of High-Quality Multilingual Math Problems Ready For RLVR

2026-03-13 · Source: Apple Machine Learning Research · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Speech and Natural Language Processing · Depth: Advanced, quick

Summary

Apple researchers have released mAceReason-Math, a dataset that addresses the scarcity of high-quality multilingual math problems suitable for Reinforcement Learning with Verifiable Rewards (RLVR). Published in March 2026, it contains more than 10,000 challenging math problems per language across 14 languages. The problems were translated from AceReason-Math, an English corpus curated specifically for RLVR, and the translations were then cleaned and improved. Existing multilingual datasets are often too easy to give current large language models a useful training signal; mAceReason-Math is intended to close that gap and to support multilingual RLVR research and benchmarking.
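Why low difficulty weakens the training signal can be illustrated with a small sketch. It assumes a GRPO-style setup in which each rollout's reward is normalized against the other rollouts for the same problem; the source does not say which RLVR algorithm is used, and the function name `group_advantages` is hypothetical.

```python
from statistics import mean, pstdev

def group_advantages(rewards, eps=1e-6):
    """Group-normalized advantages: (reward - group mean) / (group std + eps).

    Assumption: a GRPO-style objective. The source does not specify the
    RLVR algorithm, so this is only an illustration.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# A problem the model always solves: all rewards equal, so every advantage is zero.
easy = group_advantages([1.0, 1.0, 1.0, 1.0])

# A problem the model solves only sometimes: advantages separate good and bad rollouts.
hard = group_advantages([1.0, 0.0, 0.0, 1.0])

print(easy)
print(hard)
```

Under this assumption, a problem that every rollout solves contributes no gradient signal, which is one way to read the claim that easy datasets do not train current models effectively.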

Key takeaway

For NLP engineers and research scientists developing or evaluating multilingual large language models, mAceReason-Math fills a specific gap: challenging, verifiable math problems for RLVR in 14 languages. Adding it to a training or evaluation pipeline can help reduce English-centric bias and improve math and logic reasoning in non-English contexts.

Key insights

mAceReason-Math is a high-quality multilingual dataset for RLVR on challenging math and logic problems.

Principles

Multilingual LLMs require high-quality, diverse training data.
RLVR benefits from challenging, verifiable problem sets.
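What "verifiable" means in practice can be sketched as a rule-based reward that compares a model's final answer with a reference answer. This is a minimal illustration, not the dataset's or Apple's actual verifier: the `\boxed{...}` answer convention, the function names, and the exact-match rule are all assumptions.

```python
import re
from fractions import Fraction

def extract_final_answer(text):
    """Return the content of the last \\boxed{...} in text, or None if absent."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1].strip() if matches else None

def normalize(answer):
    """Parse the answer as an exact rational if possible, else keep the string."""
    cleaned = answer.replace(" ", "")
    try:
        return Fraction(cleaned)  # accepts "3", "0.5", "1/2"
    except (ValueError, ZeroDivisionError):
        return cleaned

def verifiable_reward(response, reference):
    """Hypothetical binary reward: 1.0 on a matching final answer, else 0.0."""
    predicted = extract_final_answer(response)
    if predicted is None:
        return 0.0
    return 1.0 if normalize(predicted) == normalize(reference) else 0.0

# The response language does not matter; only the final answer is checked.
print(verifiable_reward("Die Antwort ist \\boxed{0.5}.", "1/2"))
print(verifiable_reward("La respuesta es 4.", "4"))
```

Because the check depends only on the final answer, a translated problem can plausibly reuse the reference answer of its English original, provided the translation preserves the problem's meaning.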

Method

The dataset was created by translating challenging math problems from an English RLVR corpus (AceReason-Math), followed by meticulous cleaning and improvement of translations across 14 languages.

In practice

Use mAceReason-Math for multilingual LLM training.
Benchmark RLVR models across 14 languages.
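Benchmarking across languages mostly comes down to reporting accuracy per language rather than one pooled number. The sketch below assumes each evaluation record carries a language code and a correctness flag; the field names `lang` and `correct` are hypothetical and not taken from the dataset's schema, and the records are toy values rather than real results.

```python
from collections import defaultdict

def accuracy_by_language(records):
    """Aggregate per-language accuracy from graded evaluation records.

    Each record is a dict with hypothetical keys:
      "lang"    : language code of the problem
      "correct" : True if the model's answer was verified as correct
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for record in records:
        totals[record["lang"]] += 1
        correct[record["lang"]] += int(record["correct"])
    return {lang: correct[lang] / totals[lang] for lang in totals}

# Toy records for illustration only; these are not measured results.
toy_records = [
    {"lang": "en", "correct": True},
    {"lang": "en", "correct": True},
    {"lang": "de", "correct": True},
    {"lang": "de", "correct": False},
]
scores = accuracy_by_language(toy_records)
print(scores)
```

Reporting the per-language breakdown makes gaps between English and other languages visible, which a single averaged score would hide.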

Topics

Reinforcement Learning with Verifiable Rewards
Multilingual Language Models
Math Reasoning Datasets
Natural Language Processing
Dataset Curation

Code references

apple/ml-macereason-math

Best for: NLP Engineer, Research Scientist, AI Researcher, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Apple Machine Learning Research.