Learning Probabilistic Filters with Strictly Proper Scoring Rules
Summary
The proper scoring ensemble filter (PSEF) is an ensemble data assimilation method designed to approximate the Bayesian filtering distribution for partially and noisily observed dynamical systems. PSEF trains a permutation-invariant, transformer-based analysis map on synthetic state-observation trajectories. Its training objective uses strictly proper scoring rules, specifically the energy score, which reward probabilistic accuracy across the entire distribution rather than only at the mean. The authors prove that, under a realizability assumption, the population objective is minimized by the true Bayesian filtering distribution. Numerical experiments show that PSEF accurately approximates challenging filtering distributions, including nonlinear, non-Gaussian, and multi-modal posteriors. On data assimilation tasks it outperforms both classical methods and learning-based methods trained with mean-squared-error objectives. For close-to-Gaussian problems, learning a correction to the EnKF works best, while highly non-Gaussian problems benefit from an end-to-end approach.
Key takeaway
For research scientists developing Bayesian filtering solutions for complex dynamical systems, PSEF is worth considering when posteriors are nonlinear, non-Gaussian, or multi-modal. If your system is close to Gaussian, prioritize learning a correction to the EnKF. For highly non-Gaussian problems, an end-to-end learned analysis map without that inductive bias gives better data assimilation performance.
Key insights
PSEF uses strictly proper scoring rules and a transformer-based analysis map to learn accurate Bayesian filtering distributions from synthetic data.
Principles
- Probabilistic accuracy is rewarded over the whole distribution.
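- Under a realizability assumption, the true Bayesian filtering distribution minimizes the population objective.
- An EnKF inductive bias helps on close-to-Gaussian problems; highly non-Gaussian problems favor an end-to-end approach.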
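The first two principles can be made concrete with the standard definition of the energy score. The notation below is an assumption for illustration and is not taken from the original article: P is the predicted distribution, x the realized state, Q the true filtering distribution, and β an exponent in (0, 2).

```latex
% Energy score of a predicted distribution P at the realized state x
% (X and X' are independent draws from P; lower is better)
\mathrm{ES}_\beta(P, x)
  = \mathbb{E}_{X \sim P}\,\lVert X - x \rVert^{\beta}
  - \tfrac{1}{2}\,\mathbb{E}_{X, X' \sim P}\,\lVert X - X' \rVert^{\beta},
  \qquad \beta \in (0, 2).

% Strict propriety: if x is drawn from Q, the expected score is
% minimized only by predicting Q itself
\mathbb{E}_{x \sim Q}\,\mathrm{ES}_\beta(Q, x)
  < \mathbb{E}_{x \sim Q}\,\mathrm{ES}_\beta(P, x)
  \qquad \text{for all } P \neq Q .
```

The first term rewards samples that land near the realized state; the second rewards spread among samples, so collapsing the ensemble onto its mean is penalized. Strict propriety is what lets an expected-score objective single out the true distribution when the model class can represent it.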
Method
PSEF trains a permutation-invariant, transformer-based analysis map using synthetic state-observation trajectories and strictly proper scoring rules (energy score).
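As a rough illustration of the training signal, the sketch below computes a sample-based energy score for one analysis ensemble against one reference state. It is a minimal NumPy version under stated assumptions, not the authors' implementation: the function name `energy_score`, the exponent `beta`, and the unbiased pairwise estimator are choices made here for illustration.

```python
import numpy as np

def energy_score(ensemble, truth, beta=1.0):
    """Sample estimate of the energy score (lower is better).

    ensemble: array of shape (N, d), N >= 2 ensemble members (assumed layout)
    truth:    array of shape (d,), the reference state
    beta:     exponent in (0, 2); illustrative default
    """
    ensemble = np.asarray(ensemble, dtype=float)
    truth = np.asarray(truth, dtype=float)
    n = ensemble.shape[0]

    # Accuracy term: mean distance from each member to the reference state.
    term_truth = np.mean(np.linalg.norm(ensemble - truth, axis=1) ** beta)

    # Spread term: mean pairwise distance between distinct members.
    # The diagonal contributes zero, so dividing the full sum by n * (n - 1)
    # averages over ordered pairs of distinct members.
    diffs = ensemble[:, None, :] - ensemble[None, :, :]
    pairwise = np.linalg.norm(diffs, axis=2) ** beta
    term_spread = pairwise.sum() / (2.0 * n * (n - 1))

    return term_truth - term_spread
```

Both terms are averages over members, so the score does not depend on the order of the ensemble, which matches the permutation-invariant design of the analysis map. In training, a differentiable version of this quantity would be averaged over many synthetic trajectories.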
In practice
- Apply PSEF for nonlinear, non-Gaussian, multi-modal posteriors.
- Learn a correction to the EnKF for close-to-Gaussian problems.
- Employ end-to-end learning for highly non-Gaussian systems.
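To make the "correction to the EnKF" option more tangible, here is a minimal sketch of a stochastic (perturbed-observation) EnKF analysis step with a hook for a learned correction. The summary does not say how the correction is parameterized, so the additive form and the `correction` callable are hypothetical; the linear observation operator `H` and noise covariance `R` are also assumptions of this sketch.

```python
import numpy as np

def enkf_analysis(forecast, y, H, R, rng):
    """Stochastic EnKF analysis step with perturbed observations.

    forecast: (N, d) forecast ensemble; y: (m,) observation;
    H: (m, d) linear observation operator; R: (m, m) noise covariance.
    """
    n = forecast.shape[0]
    anomalies = forecast - forecast.mean(axis=0)
    C = anomalies.T @ anomalies / (n - 1)        # sample covariance, (d, d)
    S = H @ C @ H.T + R                          # innovation covariance, (m, m)
    K = np.linalg.solve(S, H @ C).T              # Kalman gain C H^T S^{-1}, (d, m)
    # Each member assimilates its own noisy copy of the observation.
    y_pert = y + rng.multivariate_normal(np.zeros(len(y)), R, size=n)
    innovations = y_pert - forecast @ H.T        # (N, m)
    return forecast + innovations @ K.T          # (N, d)

def corrected_analysis(forecast, y, H, R, rng, correction=None):
    """EnKF analysis plus an optional additive correction (hypothetical form).

    `correction` stands in for a trained network mapping (forecast, y)
    to an (N, d) adjustment; with None this is the plain EnKF.
    """
    analysis = enkf_analysis(forecast, y, H, R, rng)
    if correction is None:
        return analysis
    return analysis + correction(forecast, y)
```

An end-to-end variant would instead replace `enkf_analysis` entirely with a learned map from the forecast ensemble and observation to the analysis ensemble, giving up the Gaussian structure built into the Kalman gain.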
Topics
- Bayesian Filtering
- Ensemble Data Assimilation
- Proper Scoring Rules
- Transformers
- Non-Gaussian Systems
- Dynamical Systems
Best for: AI Scientist, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.