Science Needs AI Data Stocktakes
Summary
The article advocates for "AI data stocktakes" to accelerate scientific discovery, particularly in fusion energy. It notes that projects such as the Joint European Torus (JET), which concluded in 2023 after reaching temperatures above 150 million degrees Celsius, generated vast amounts of data that remain largely raw and inaccessible. This data, which the authors call a "stranded asset," could train AI models to improve fusion simulations, experiments, data quality, and underlying technologies. The UK government's AI for Science strategy and the US Genesis Mission are noted as promising initiatives. The authors conducted a proof-of-concept stocktake for fusion by interviewing 25 experts. They identify technical, bureaucratic, and human/cultural "debts" that hinder data use, and propose eight recommendations.
Key takeaway
Research scientists and policymakers aiming to accelerate fusion energy development should prioritize funding and implementing "AI data stocktakes." These structured assessments identify actionable data gaps and opportunities, enabling AI models to improve plasma control and simulation accuracy. Focus on initiatives such as open-sourcing legacy data from projects like JET and modernizing simulation codes to unlock AI's full potential in the field.
Key insights
Scientific progress requires structured "AI data stocktakes" to identify and address data obstacles for AI model training.
Principles
- Raw scientific data often becomes a "stranded asset."
- Data stocktakes prioritize projects that can be funded within 1-2 years.
- AI can predict, control, and understand complex plasma behavior.
Method
Conduct expert interviews in a scientific field to identify AI opportunities, data obstacles, and high-impact interventions fundable within 1-2 years.
In practice
- Open-source experimental data, as with FAIR MAST.
- Modernize simulation codes for AI surrogate models.
- Fund competitions for AI plasma disruption prediction.
Topics
- AI for Science
- Fusion Energy
- Data Stocktakes
- Plasma Physics
- Experimental Data
- Simulation Data
Best for: AI Scientist, Research Scientist, Policy Maker
Editorial summary, takeaway, and curation by AIssential. Original article published by AI Policy Perspectives.