Bayesian Optimization with Gaussian Processes to Accelerate Stationary Point Searches

Source: stat.ML updates on arXiv.org · Field: Science & Research (Physical Sciences & Chemistry, Mathematics & Computational Sciences, Research Methodology & Innovation) · Depth: Expert, extended

Summary

This paper introduces a Bayesian Optimization (BO) framework built on Gaussian Process Regression (GPR) to accelerate stationary point searches on potential energy surfaces (PES), a task central to understanding chemical reactions and material properties. A single six-step surrogate loop covers minimization, single-point saddle searches (Dimer method), and double-ended saddle searches (Nudged Elastic Band, NEB). The surrogate is a GPR model with derivative observations and inverse-distance kernels, refined by active learning. Key extensions include Optimal Transport GP (OT-GP) with farthest point sampling (FPS) based on the Earth Mover's Distance (EMD), MAP regularization via variance barriers, oscillation detection, and an adaptive trust radius. Random Fourier features (RFF) are integrated to improve scaling for high-dimensional systems. The accompanying Rust code, "chemgp-core," connects the theoretical formulation to an executable implementation and shows substantial reductions in expensive electronic structure evaluations (e.g., 5-10x for NEB, 10x for Dimer) while maintaining accuracy.

Key takeaway

For computational chemists and materials scientists exploring a PES, this GPR-accelerated framework offers a way to cut computational cost substantially. By combining local surrogates with active learning, it is reported to need roughly 5-10x fewer expensive electronic structure calculations for saddle point searches and minimizations. Consider adopting this unified Bayesian optimization approach, especially for large systems or high-throughput workflows, to speed up stationary point searches without sacrificing accuracy.

Key insights

GPR with active learning and inverse-distance kernels significantly accelerates stationary point searches on potential energy surfaces.
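
To make "inverse-distance kernel" concrete, here is a minimal sketch that maps atomic coordinates to the inverse interatomic distances 1/r_ij and applies a squared-exponential kernel in that feature space. The function names, the single shared length scale, and the squared-exponential form are assumptions chosen for illustration; they are not taken from chemgp-core, and the paper's kernel may differ in detail.

```python
import math
from itertools import combinations

def inverse_distances(coords):
    """Map (x, y, z) atom positions to the features 1/r_ij for all pairs i < j."""
    return [1.0 / math.dist(a, b) for a, b in combinations(coords, 2)]

def inverse_distance_kernel(coords_a, coords_b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel on inverse-distance features (illustrative form)."""
    fa = inverse_distances(coords_a)
    fb = inverse_distances(coords_b)
    sq = sum((u - v) ** 2 for u, v in zip(fa, fb))
    return signal_var * math.exp(-0.5 * sq / length_scale ** 2)

# A bent triatomic, a rigidly rotated and translated copy, and a stretched copy.
mol = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.2, 0.0)]
moved = [(5.0, 5.0, 5.0), (5.0, 6.0, 5.0), (3.8, 5.0, 5.0)]
stretched = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.2, 0.0)]

k_rigid = inverse_distance_kernel(mol, moved)        # rigid motion leaves the features unchanged
k_stretch = inverse_distance_kernel(mol, stretched)  # a bond stretch changes them
```

Because the features depend only on interatomic distances, the kernel is invariant to rigid translations and rotations of the whole structure, and the 1/r form makes it most sensitive to changes at short range.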

Method

The unified six-step Bayesian surrogate loop involves training a GP, optimizing on the surrogate, checking trust constraints, evaluating the oracle, selecting the next query point, and updating the training set.
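
The loop can be illustrated with a toy one-dimensional minimization. The sketch below is a deliberately simplified stand-in, not the chemgp-core implementation: it assumes an energy-only GP (no derivative observations), a fixed squared-exponential kernel, a fixed rather than adaptive trust radius, and a grid search on the surrogate. The `oracle` function and all parameter values are hypothetical.

```python
import math

def oracle(x):
    """Hypothetical 1D 'energy'; stands in for an expensive electronic structure call."""
    return (x - 1.0) ** 2 + 0.3 * math.sin(3.0 * x)

def kernel(a, b, ell=0.7):
    """Squared-exponential covariance with an assumed fixed length scale."""
    return math.exp(-0.5 * ((a - b) / ell) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_gp(xs, ys, jitter=1e-6):
    """Return the GP posterior mean as a function, using a constant prior mean."""
    pts = list(xs)
    mean = sum(ys) / len(ys)
    K = [[kernel(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(pts)]
         for i, a in enumerate(pts)]
    alpha = solve(K, [y - mean for y in ys])
    return lambda x: mean + sum(w * kernel(x, xi) for w, xi in zip(alpha, pts))

def surrogate_minimize(x0=-1.0, trust_radius=0.5, tol=5e-3, max_calls=30):
    xs = [x0, x0 + 0.1]                 # two initial oracle calls
    ys = [oracle(x) for x in xs]
    grid = [-2.0 + 0.001 * i for i in range(5001)]
    while len(xs) < max_calls:
        mu = fit_gp(xs, ys)                                   # train the GP
        x_best = xs[ys.index(min(ys))]
        # Trust constraint: only search within trust_radius of the best known point.
        region = [g for g in grid if abs(g - x_best) <= trust_radius]
        query = min(region, key=mu)                           # optimize on the surrogate
        if min(abs(query - x) for x in xs) < tol:             # nothing new to learn: stop
            break
        xs.append(query)                                      # selected query point
        ys.append(oracle(query))                              # evaluate oracle, update data
    i = ys.index(min(ys))
    return xs[i], ys[i], len(xs)

x_min, e_min, n_calls = surrogate_minimize()
print(f"minimum near x = {x_min:.3f}, energy = {e_min:.4f}, oracle calls = {n_calls}")
```

The point of the design is that the surrogate is only queried inside a region where it has nearby data, and every accepted step costs exactly one oracle call. In the paper's setting the oracle is an electronic structure calculation, and the inner optimization is a minimizer, Dimer, or NEB run on the surrogate.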

Best for: Research Scientist, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.