Learning Probabilistic Filters with Strictly Proper Scoring Rules

Source: Machine Learning · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Mathematics & Computational Sciences, Data Science & Analytics · Depth: Expert, quick

Summary

The proper scoring ensemble filter (PSEF) is an ensemble data assimilation method designed to approximate the Bayesian filtering distribution for partially and noisily observed dynamical systems. PSEF trains a permutation-invariant, transformer-based analysis map on synthetic state-observation trajectories. Its training objective is a strictly proper scoring rule, specifically the energy score, which rewards probabilistic accuracy across the entire distribution. The authors prove that, under a realizability assumption, the population objective is minimized by the true Bayesian filtering distribution. Numerical experiments show that PSEF accurately approximates challenging filtering distributions, including nonlinear, non-Gaussian, and multi-modal posteriors, and that it outperforms both classical methods and learning-based methods trained with mean-squared-error objectives on data assimilation tasks. For close-to-Gaussian problems, learning a correction to the ensemble Kalman filter (EnKF) performs best, while highly non-Gaussian problems benefit from an end-to-end approach.

Key takeaway

Research scientists developing Bayesian filtering solutions for complex dynamical systems should consider PSEF for its ability to accurately model nonlinear, non-Gaussian, and multi-modal posteriors. If your system is close to Gaussian, prioritize learning a correction to the EnKF. For highly non-Gaussian problems, an end-to-end learning approach without that inductive bias, like PSEF, is reported to yield better data assimilation performance.

Key insights

PSEF uses strictly proper scoring rules and a transformer-based analysis map to learn accurate Bayesian filtering distributions from synthetic data.

Principles

Method

PSEF trains a permutation-invariant, transformer-based analysis map using synthetic state-observation trajectories and strictly proper scoring rules (energy score).
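
To make the training objective concrete, here is a minimal NumPy sketch of an ensemble estimate of the energy score. It is not the authors' implementation. The function name `energy_score`, the default exponent `beta=1.0`, and the unbiased pairwise normalization are assumptions chosen for illustration. Lower scores are better, and the value does not depend on the order of the ensemble members, which fits naturally with a permutation-invariant analysis map.

```python
import numpy as np


def energy_score(ensemble, truth, beta=1.0):
    """Ensemble estimate of the energy score (lower is better).

    Hypothetical helper for illustration only.
    ensemble: array of shape (n_members, dim), samples from the forecast.
    truth:    array of shape (dim,), the verifying state.
    beta:     exponent in (0, 2); the score is strictly proper in that range.
    """
    ensemble = np.asarray(ensemble, dtype=float)
    truth = np.asarray(truth, dtype=float)
    n = ensemble.shape[0]
    if n < 2:
        raise ValueError("need at least two ensemble members")

    # Accuracy term: mean distance from each member to the truth.
    accuracy = np.mean(np.linalg.norm(ensemble - truth, axis=1) ** beta)

    # Spread term: mean distance between distinct members.
    # The diagonal (i == j) contributes zero, so dividing by n * (n - 1)
    # averages over ordered pairs of distinct members.
    diffs = ensemble[:, None, :] - ensemble[None, :, :]
    spread = np.sum(np.linalg.norm(diffs, axis=2) ** beta) / (n * (n - 1))

    return accuracy - 0.5 * spread


if __name__ == "__main__":
    truth = np.array([1.0, 0.0])
    centered = np.array([[0.0, 0.0], [2.0, 0.0]])
    shifted = centered + np.array([5.0, 0.0])
    print(energy_score(centered, truth))  # ensemble straddling the truth
    print(energy_score(shifted, truth))   # same spread, biased location
```

In a training loop, a score of this kind would be averaged over many synthetic state-observation pairs and minimized with respect to the parameters of the analysis map; an automatic-differentiation framework would replace NumPy for that purpose.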

Topics

Best for: AI Scientist, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.