Using Transformers to Forecast Incredibly Rare Solar Flares

· Source: Towards Data Science · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Space Science & Astronomy · Depth: Advanced, long

Summary

Forecasting rare events such as solar flares with machine learning requires a fundamental shift in modeling approach, starting with a move beyond standard metrics like accuracy. The 2003 Halloween storms, which included an X-45 class solar flare 450 times stronger than an M-1 flare, show what is at stake: they caused satellite malfunctions, GPS disruptions, and power outages such as the Malmö blackout. Because such events are so infrequent, predicting them calls for specialized techniques. The data for solar flare prediction comes from NASA's Solar Dynamics Observatory (SDO), whose Helioseismic and Magnetic Imager (HMI) observes the Sun's photosphere, even though flares themselves occur in the corona and chromosphere. This data undergoes localization and feature engineering to derive magnetic parameters such as flux, current, twist, and helicity, which describe the Sun's magnetic structures. The model's target is the probability of an M-1 class event within 24 hours, evaluated with the True Skill Statistic (TSS), a more appropriate metric than accuracy for imbalanced datasets. A "tail model" focuses on extreme events, using the Generalized Pareto Distribution (GPD) to model exceedances beyond a defined threshold. The tail model is then combined with a full-distribution model in a transformer with multiple output heads that predicts both flare occurrence and severity.
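To make the accuracy problem concrete, here is a minimal Python sketch of the True Skill Statistic, computed as recall minus false-alarm rate, TP/(TP+FN) - FP/(FP+TN). The function names and the toy counts (1,000 forecast windows, 10 of which contain a flare) are illustrative assumptions, not figures from the original article.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fn, fp, tn) for binary labels given as 0/1 sequences."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fn, fp, tn


def accuracy(y_true, y_pred):
    tp, fn, fp, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fn + fp + tn)


def true_skill_statistic(y_true, y_pred):
    """TSS = recall - false-alarm rate; ranges from -1 to 1, 0 means no skill."""
    tp, fn, fp, tn = confusion_counts(y_true, y_pred)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return recall - false_alarm_rate


# Toy data (assumed): 10 flare windows followed by 990 quiet windows.
y_true = [1] * 10 + [0] * 990

# Forecaster A never predicts a flare.
always_quiet = [0] * 1000

# Forecaster B catches 8 of 10 flares and raises 50 false alarms.
some_skill = [1] * 8 + [0] * 2 + [1] * 50 + [0] * 940

for name, y_pred in [("always quiet", always_quiet), ("some skill", some_skill)]:
    print(name, accuracy(y_true, y_pred), true_skill_statistic(y_true, y_pred))
```

On this toy data, the forecaster that never predicts a flare reaches 99% accuracy with a TSS of 0, while the one that catches 8 of 10 flares falls to 94.8% accuracy but reaches a TSS of about 0.75. Accuracy rewards ignoring the rare class; TSS does not.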

Key takeaway

For machine learning engineers building models for rare, high-impact events like solar flares, relying solely on accuracy is insufficient and misleading. Prioritize metrics like the True Skill Statistic (TSS) and implement a "tail model" using a distribution such as the Generalized Pareto Distribution (GPD) to capture extreme-event behavior specifically. Integrated with a multi-head transformer, this approach aims to improve both forecast skill for rare occurrences and the estimation of event severity, leading to more robust and actionable forecasts.
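The tail-model idea can be sketched in a few lines of Python. The example below fits a GPD to exceedances over a threshold using the method of moments, chosen here only because it needs no external libraries. The original article does not say which estimator it uses, and maximum likelihood is the more common choice in practice. All function names, the threshold, and the synthetic data are hypothetical.

```python
import math


def fit_gpd_moments(values, threshold):
    """Method-of-moments GPD fit to exceedances over `threshold`.

    Returns (xi, sigma): shape and scale. This estimator is only valid
    when the true shape is below 0.5, so that the variance is finite.
    """
    excesses = [v - threshold for v in values if v > threshold]
    n = len(excesses)
    if n < 2:
        raise ValueError("need at least two exceedances to fit the tail")
    mean = sum(excesses) / n
    var = sum((e - mean) ** 2 for e in excesses) / (n - 1)
    ratio = mean * mean / var
    xi = 0.5 * (1.0 - ratio)            # from mean^2 / var = 1 - 2*xi
    sigma = 0.5 * mean * (ratio + 1.0)  # from mean = sigma / (1 - xi)
    return xi, sigma


def gpd_survival(x, threshold, xi, sigma):
    """P(X > x | X > threshold) under the fitted GPD."""
    y = x - threshold
    if y <= 0:
        return 1.0
    if abs(xi) < 1e-9:                  # xi -> 0 is the exponential tail
        return math.exp(-y / sigma)
    base = 1.0 + xi * y / sigma
    if base <= 0:                       # past the finite endpoint when xi < 0
        return 0.0
    return base ** (-1.0 / xi)


# Synthetic example (assumed data): exponential-like excesses above u = 5.0,
# plus two values below the threshold that the fit ignores.
u = 5.0
n = 2000
sample = [u - math.log(1 - (i - 0.5) / n) for i in range(1, n + 1)] + [1.0, 2.0]

xi_hat, sigma_hat = fit_gpd_moments(sample, u)
print("shape:", xi_hat, "scale:", sigma_hat)
print("P(X > u + 3 | X > u):", gpd_survival(u + 3.0, u, xi_hat, sigma_hat))
```

Only observations above the threshold enter the fit, which is what lets the tail model concentrate on extreme events instead of being dominated by the quiet majority of the data.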

Key insights

Predicting rare, high-impact events requires specialized ML techniques, including tail modeling and appropriate metrics.

Principles

Method

The proposed method has five steps: engineering features from magnetic data, defining a precise target (the probability of an M-1 class event), evaluating with TSS, building a GPD tail model for exceedances, and integrating the tail and full-distribution models in a transformer with multiple output heads.
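One way such a multi-head design could look is sketched below in Python with PyTorch. This is an assumption-laden illustration, not the article's architecture: the class name FlareTransformer, the layer sizes, the mean-pooling step, the choice of two heads (one flare-occurrence logit, one pair of GPD parameters), the positive-shape constraint, and the loss weighting are all hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FlareTransformer(nn.Module):
    """Shared transformer encoder with two output heads (hypothetical layout)."""

    def __init__(self, n_features, d_model=32, n_heads=4, n_layers=2, max_len=64):
        super().__init__()
        self.embed = nn.Linear(n_features, d_model)
        # Learned positional embedding; sequences must not exceed max_len.
        self.pos = nn.Parameter(torch.zeros(1, max_len, d_model))
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=64, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.flare_head = nn.Linear(d_model, 1)  # logit of P(flare within 24 h)
        self.tail_head = nn.Linear(d_model, 2)   # raw GPD shape and scale

    def forward(self, x):
        # x: (batch, time, n_features) of magnetic parameters
        h = self.embed(x) + self.pos[:, : x.size(1)]
        h = self.encoder(h).mean(dim=1)          # mean-pool over time
        flare_logit = self.flare_head(h).squeeze(-1)
        raw_shape, raw_scale = self.tail_head(h).unbind(dim=-1)
        xi = F.softplus(raw_shape) + 1e-3        # assumes a heavy tail (xi > 0)
        sigma = F.softplus(raw_scale) + 1e-3     # scale must be positive
        return flare_logit, xi, sigma


def combined_loss(flare_logit, xi, sigma, is_flare, excess, tail_weight=1.0):
    """Binary cross-entropy on occurrence plus GPD negative log-likelihood.

    `excess` is severity above the threshold; only samples with
    is_flare == 1 contribute to the tail term.
    """
    bce = F.binary_cross_entropy_with_logits(flare_logit, is_flare)
    # GPD NLL per sample: log(sigma) + (1 + 1/xi) * log(1 + xi * y / sigma)
    nll = torch.log(sigma) + (1.0 + 1.0 / xi) * torch.log1p(
        xi * excess.clamp(min=0) / sigma
    )
    tail = (nll * is_flare).sum() / is_flare.sum().clamp(min=1.0)
    return bce + tail_weight * tail


# Usage on random stand-in data: 8 samples, 12 time steps, 4 features.
torch.manual_seed(0)
model = FlareTransformer(n_features=4)
x = torch.randn(8, 12, 4)
is_flare = torch.tensor([1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0])
excess = torch.tensor([0.5, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0])

flare_logit, xi, sigma = model(x)
loss = combined_loss(flare_logit, xi, sigma, is_flare, excess)
loss.backward()
print(float(loss))
```

Sharing one encoder means the occurrence head and the severity head learn from the same representation, while the masked tail term keeps the many quiet samples from swamping the GPD parameters.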

Best for: AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Towards Data Science.