TokaMark: A Comprehensive Benchmark for MAST Tokamak Plasma Models

2026-06-12 · Source: cs.AI updates on arXiv.org · Field: Science & Research — Physical Sciences & Chemistry, Engineering & Applied Sciences, Research Methodology & Innovation · Depth: Expert, extended

Summary

TokaMark is a new, open-source benchmark for evaluating AI models that predict plasma dynamics in fusion devices, built on real experimental data from the Mega Ampere Spherical Tokamak (MAST). It addresses the lack of curated datasets and standardized benchmarks in fusion AI. TokaMark unifies access to multi-modal, heterogeneous fusion data, harmonizes formats, and defines 14 tasks across four groups: equilibrium reconstruction, magnetics dynamics, profile dynamics, and MHD activity. The benchmark ships with a multi-branch convolutional encoder-decoder baseline, developed on 11,573 shots from the FAIR-MAST dataset divided with an 80%/10%/10% split. It evaluates models with a hierarchical protocol and provides Python tools for data loading and processing.
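
To make the split arithmetic concrete, here is a minimal sketch of an 80%/10%/10% partition at the shot level. It assumes the three parts are train, validation, and test sets and that shots are assigned by random shuffling; the summary states neither, and `split_shots` is a hypothetical helper, not part of the TokaMark tools.

```python
import numpy as np

def split_shots(shot_ids, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle shot IDs and cut them into three disjoint groups.

    Assumption: a random shot-level split; the benchmark's real
    assignment rule is not described in this summary.
    """
    rng = np.random.default_rng(seed)
    ids = rng.permutation(np.asarray(shot_ids))
    n_train = int(fractions[0] * len(ids))
    n_val = int(fractions[1] * len(ids))
    # The last group takes whatever remains, so no shot is dropped.
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]

# Placeholder IDs 0..11572 stand in for real MAST shot numbers.
train, val, test = split_shots(range(11573))
print(len(train), len(val), len(test))  # 9258 1157 1158
```

Splitting by shot rather than by window keeps every window of a given shot in a single partition, which avoids leaking near-identical overlapping windows between training and evaluation.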

Key takeaway

For AI scientists and machine learning engineers developing models for fusion energy, TokaMark offers a standardized platform for comparing new models against an established baseline. Profile dynamics and MHD activity deserve particular attention: the baseline leaves significant room for improvement in both groups, with NRMSE scores exceeding 0.17, and task 4-5 even exceeds unity. Shared, reproducible comparisons of this kind should help accelerate development of the robust, data-driven plasma models that commercially viable fusion will require.

Key insights

TokaMark provides a unified, open benchmark with 14 tasks and a baseline for AI models in fusion plasma modeling.

Principles

Fusion data is multi-modal, multi-rate, and often incomplete.
AI models can learn latent plasma representations from raw data.
Standardized benchmarks accelerate progress and ensure reproducibility.
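
The first principle above can be illustrated with a small sketch: two diagnostics sampled at different rates and over different time ranges are placed on one common time base, with a validity mask marking where a signal has no data. The signals, rates, and the helper `resample_to_grid` are invented for illustration and do not reflect how TokaMark's own loaders harmonize data.

```python
import numpy as np

def resample_to_grid(t_src, values, t_grid):
    """Linearly interpolate onto t_grid; NaN where the signal has no coverage."""
    return np.interp(t_grid, t_src, values, left=np.nan, right=np.nan)

# Common time base in milliseconds (hypothetical): 0, 1, ..., 9 ms.
t_grid = np.arange(10.0)

# A fast diagnostic covering the whole interval at 0.1 ms resolution.
fast_t = np.linspace(0.0, 10.0, 101)
fast = np.sin(2 * np.pi * fast_t / 10.0)

# A slow diagnostic that only reports between 2 ms and 6 ms.
slow_t = np.array([2.0, 6.0])
slow = np.array([1.0, 3.0])

aligned = np.stack([
    resample_to_grid(fast_t, fast, t_grid),
    resample_to_grid(slow_t, slow, t_grid),
])
mask = ~np.isnan(aligned)  # True where a signal is actually observed

print(aligned.shape)  # (2, 10)
print(mask[1])        # slow signal is valid only for 2 ms <= t <= 6 ms
```

Carrying the mask alongside the data lets a model or a loss function ignore missing samples instead of treating filled-in values as measurements.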

Method

TokaMark defines 14 tasks, each specified by its input and output windows. Windows are extracted with a sliding-window approach at a 0.001-second stride, and performance is scored with NRMSE, aggregated through a hierarchical evaluation protocol (samples -> windows -> signals -> tasks -> shots).
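
The method above can be sketched as follows. This is a minimal illustration, not TokaMark's implementation: the 10 kHz sampling rate, the 0.01-second window, the choice to normalize RMSE by the target's standard deviation, and the function names are all assumptions, and only the first three aggregation levels (samples, windows, signals) are shown.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(signal, sample_rate_hz, window_s, stride_s=0.001):
    """Cut a 1-D signal into overlapping windows taken every stride_s seconds."""
    win = int(round(window_s * sample_rate_hz))
    step = max(1, int(round(stride_s * sample_rate_hz)))
    return sliding_window_view(signal, win)[::step]

def nrmse(pred, target, eps=1e-12):
    """RMSE over the samples of one window, divided by the target's std.

    Assumption: std normalization; the benchmark may normalize differently.
    """
    rmse = np.sqrt(np.mean((pred - target) ** 2))
    return rmse / (np.std(target) + eps)

def hierarchical_nrmse(preds, targets):
    """Average samples -> windows -> signals.

    preds and targets map a signal name to an array [n_windows, window_len].
    """
    per_signal = []
    for name in targets:
        per_window = [nrmse(p, t) for p, t in zip(preds[name], targets[name])]
        per_signal.append(np.mean(per_window))
    return float(np.mean(per_signal))

# Synthetic 0.1 s trace sampled at 10 kHz (hypothetical values).
t = np.arange(1000) / 10_000
windows = make_windows(np.sin(2 * np.pi * 50 * t), 10_000, window_s=0.01)
print(windows.shape)  # (91, 100)

targets = {"demo_signal": windows}
# A trivial predictor that outputs each window's mean value.
mean_pred = {"demo_signal": np.broadcast_to(
    windows.mean(axis=1, keepdims=True), windows.shape)}
score = hierarchical_nrmse(mean_pred, targets)
```

Under this particular normalization, a predictor that outputs each window's mean scores approximately 1, which gives a rough reference point for reading NRMSE values; whether the same interpretation holds for TokaMark's reported scores depends on the normalization the benchmark actually uses.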

In practice

Evaluate AI models on 14 diverse fusion plasma tasks.
Utilize the provided multi-branch convolutional baseline.
Develop models robust to multi-fidelity and missing data.

Topics

TokaMark
Fusion Energy
Tokamak Plasma Modeling
AI Benchmarking
Multi-modal Data
MAST Tokamak
Plasma Diagnostics

Best for: AI Scientist, Research Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.