Boundary Variance Inflation Causes Acquisition Bias in Gaussian Processes

Source: cs.LG updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Data Science & Analytics) · Depth: Expert, extended

Summary

Gaussian processes with stationary kernels on bounded domains exhibit inflated posterior variance near the boundary, causing acquisition bias in Bayesian optimization and experimental design. This artifact stems from the truncation of the kernel correlation neighborhood at the domain boundary, an effect that intensifies with dimensionality. The research characterizes how this distortion impacts three acquisition classes: variance maximization (VM) concentrates selections at corners, while negative integrated posterior variance (NIPV) and expected predictive information gain (EPIG) shift selections inward to axis-aligned interior shells. These patterns emerge independently of any objective function. The authors introduce a function-free "selection-profile diagnostic" to quantify this bias across arbitrary acquisitions, kernels, and domain geometries, validating it against sequential acquisition. A Neumann (mirror-image) kernel partially mitigates this bias, reducing VM's boundary concentration by 7% at D=2 and 28% at D=6.

Key takeaway

Machine learning engineers who design Bayesian optimization or experimental design systems on bounded domains should account for boundary variance inflation. This geometric artifact can cause variance maximization to over-explore corners, and NIPV and EPIG to favor interior shells, irrespective of the objective function. Use the selection-profile diagnostic to check the acquisition functions you plan to deploy, and consider a Neumann kernel to reduce boundary bias, especially in higher dimensions, where the reported reduction is larger (28% at D=6 versus 7% at D=2 for VM).
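
The summary does not give the formula for the Neumann (mirror-image) kernel. One common reading is the method of images: reflect the second argument across each face of the unit cube and sum the base kernel over the reflections, which forces a zero normal derivative at the boundary. The sketch below applies that construction to an RBF kernel on [0, 1]^D. The function name `neumann_rbf`, the truncation parameter `n_images`, and the unnormalized form are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def rbf_1d(diff, lengthscale):
    """One-dimensional RBF profile evaluated at a coordinate difference."""
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def neumann_rbf(a, b, lengthscale=0.2, n_images=2):
    """Method-of-images RBF kernel on [0, 1]^D (assumed construction).

    a has shape (n, D) and b has shape (m, D). In each dimension the base
    kernel is summed over reflections of b's coordinate across 0 and 1,
    truncated to n_images periods; the per-dimension sums are multiplied.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    minus = a[:, None, :] - b[None, :, :]  # x - x'
    plus = a[:, None, :] + b[None, :, :]   # x + x', i.e. x' reflected about 0
    per_dim = np.zeros_like(minus)
    for m in range(-n_images, n_images + 1):
        per_dim += rbf_1d(minus + 2.0 * m, lengthscale)
        per_dim += rbf_1d(plus + 2.0 * m, lengthscale)
    return per_dim.prod(axis=-1)

# The kernel is even about each face, so its normal derivative vanishes there.
x = np.array([[0.3, 0.7], [0.9, 0.1]])
h = 0.05
just_outside = neumann_rbf(np.array([[-h, 0.5]]), x)
just_inside = neumann_rbf(np.array([[h, 0.5]]), x)
print(np.allclose(just_outside, just_inside))
```

In this unnormalized form the prior variance is not constant: it roughly doubles per dimension at a face because a point coincides with its own mirror image. Whether the paper rescales the kernel is not stated in the summary.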

Key insights

Gaussian process posterior variance inflates at domain boundaries because the kernel's correlation neighborhood is truncated there, and this causes acquisition bias independent of the objective function.
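
A minimal one-dimensional sketch of this effect, assuming an RBF kernel, uniformly drawn training inputs, and illustrative values for the lengthscale, noise, and training-set size (none of these settings come from the paper). GP posterior variance depends only on the input locations, so no objective function is needed.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel between point sets a (n, d) and b (m, d)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale**2)

def posterior_variance(x_train, x_test, lengthscale=0.2, noise=1e-4):
    """GP posterior variance at x_test for a unit-variance RBF prior."""
    k_tt = rbf_kernel(x_train, x_train, lengthscale) + noise * np.eye(len(x_train))
    k_st = rbf_kernel(x_test, x_train, lengthscale)
    solved = np.linalg.solve(k_tt, k_st.T)
    # Diagonal of k_st @ solved, subtracted from the prior variance of 1.
    return 1.0 - np.einsum("ij,ji->i", k_st, solved)

rng = np.random.default_rng(0)
x_test = np.linspace(0.0, 1.0, 101)[:, None]
n_reps = 200
mean_var = np.zeros(len(x_test))
for _ in range(n_reps):
    # Fresh training inputs each replicate, uniform on [0, 1].
    x_train = rng.uniform(0.0, 1.0, size=(10, 1))
    mean_var += posterior_variance(x_train, x_test) / n_reps

edge_var = mean_var[[0, -1]].mean()   # the two boundary points
center_var = mean_var[45:56].mean()   # test points in [0.45, 0.55]
print(f"mean variance at edges:  {edge_var:.4f}")
print(f"mean variance at center: {center_var:.4f}")
```

Averaged over training sets, the edge points see training data on one side only, so their variance stays higher than at the center even though the training density is uniform.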

Method

The "selection-profile diagnostic" quantifies acquisition bias by tracking the empirical distribution of argmax boundary distances over many independent training sets, reweighting by shell volume.
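
One way to read this description is sketched below for variance maximization on the unit cube. For each independent training set, it records the boundary distance of the acquisition argmax, bins those distances into shells, and divides each shell's selection frequency by its volume, so a ratio near 1 means selections proportional to volume. The function `selection_profile`, its parameters, the uniform candidate set, and the RBF settings are assumptions for illustration, not the authors' code.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale):
    """Squared-exponential kernel between point sets a (n, d) and b (m, d)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale**2)

def posterior_variance(x_train, x_cand, lengthscale, noise=1e-4):
    """GP posterior variance at the candidates (the VM acquisition value)."""
    k_tt = rbf_kernel(x_train, x_train, lengthscale) + noise * np.eye(len(x_train))
    k_ct = rbf_kernel(x_cand, x_train, lengthscale)
    return 1.0 - np.einsum("ij,ji->i", k_ct, np.linalg.solve(k_tt, k_ct.T))

def selection_profile(dim=2, n_train=20, n_cand=2000, n_reps=300,
                      n_bins=5, lengthscale=0.3, seed=0):
    """Volume-reweighted histogram of argmax boundary distances on [0, 1]^dim."""
    rng = np.random.default_rng(seed)
    edges = np.linspace(0.0, 0.5, n_bins + 1)  # boundary distance lies in [0, 0.5]
    counts = np.zeros(n_bins)
    for _ in range(n_reps):
        x_train = rng.uniform(size=(n_train, dim))
        x_cand = rng.uniform(size=(n_cand, dim))
        best = x_cand[np.argmax(posterior_variance(x_train, x_cand, lengthscale))]
        # Distance from the selected point to the nearest face of the cube.
        dist = np.minimum(best, 1.0 - best).min()
        idx = min(int(np.searchsorted(edges, dist, side="right")) - 1, n_bins - 1)
        counts[idx] += 1
    freq = counts / n_reps
    # Volume of the shell with boundary distance in [lo, hi).
    shell_vol = (1.0 - 2.0 * edges[:-1]) ** dim - (1.0 - 2.0 * edges[1:]) ** dim
    return edges, freq / shell_vol

edges, ratio = selection_profile(n_reps=100)
for lo, hi, r in zip(edges[:-1], edges[1:], ratio):
    print(f"boundary distance [{lo:.1f}, {hi:.1f}): selection ratio {r:.2f}")
```

Because the candidates are drawn uniformly, an acquisition with no geometric preference would give ratios close to 1 in every shell. Swapping `posterior_variance` for another acquisition, or `rbf_kernel` for another kernel, leaves the rest of the diagnostic unchanged.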

Best for: AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.LG updates on arXiv.org.