Occam's Razor is Only as Sharp as Your ELBO

Source: stat.ML updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Mathematics & Computational Sciences · Depth: Expert, long

Summary

A study by Ethan Harvey and Michael C. Hughes of Tufts University shows that the Evidence Lower Bound (ELBO) in variational inference, often regarded as a mathematical embodiment of Occam's razor, can overfit or underfit depending on the covariance structure assumed for a Gaussian approximate posterior. Prior work showed that mean-field approximations cause underfitting; this research presents a clear case of ELBO-based overfitting in an over-parameterized Bayesian linear regression model. Specifically, a rank-1 covariance matrix for the approximate posterior leads to overfitting by systematically underestimating the likelihood variance, especially when the number of parameters (R) exceeds the number of data points (N). A diagonal covariance, by contrast, leads to underfitting. Surprisingly, Bayesian model selection via the exact log-marginal likelihood (LML) sometimes prefers the overfit option over the underfit one, a preference the ELBO does not share.

Key takeaway

Research scientists developing scalable Bayesian models should carefully consider the implications of reduced-rank approximate posteriors. The choice of covariance structure (e.g., diagonal vs. rank-1) directly affects whether ELBO-based hyperparameter learning underfits or overfits, which can lead to suboptimal model selection. Be cautious: the exact marginal likelihood can sometimes favor an overfit model that the ELBO rejects.

Key insights

ELBO's model selection behavior depends critically on the approximate posterior's covariance structure.
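The dependence on covariance structure can be read off the standard ELBO decomposition, written here as a worked identity. The symbols are notation assumed for illustration and are not taken from the summary: w denotes the regression weights, y the observed targets, \eta the hyperparameters being learned (such as the likelihood variance), and q(w) the Gaussian approximate posterior.

```latex
\log p(y \mid \eta)
  = \underbrace{\mathbb{E}_{q(w)}\!\left[\log p(y \mid w, \eta)\right]
      - \mathrm{KL}\!\left(q(w) \,\|\, p(w \mid \eta)\right)}_{\mathrm{ELBO}(q,\, \eta)}
  + \mathrm{KL}\!\left(q(w) \,\|\, p(w \mid y, \eta)\right)
```

The last term is the gap between the LML and the ELBO. It vanishes only when q can match the exact posterior, which a full-rank Gaussian can do in Bayesian linear regression. Under a restricted covariance the gap is positive and itself varies with \eta, so the \eta that maximizes the ELBO need not be the one that maximizes the LML.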

Method

The study uses a Bayesian linear regression model with Gaussian approximate posteriors and varies the covariance structure (diagonal, rank-1, full-rank) to observe how the ELBO behaves.
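The setup described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' code: the zero-mean isotropic prior with variance `tau2`, the random Gaussian features `Phi`, the synthetic data, and all function names are hypothetical choices made here. Only the full-rank and diagonal cases are covered. A strictly rank-1 covariance is singular, which makes the log-determinant in the KL term undefined, and the summary does not say how the paper parameterizes that case, so it is left out.

```python
import numpy as np

def log_marginal_likelihood(Phi, y, sigma2, tau2):
    """Exact LML: y ~ N(0, tau2 * Phi Phi^T + sigma2 * I)."""
    N = Phi.shape[0]
    K = tau2 * Phi @ Phi.T + sigma2 * np.eye(N)
    _, logdet = np.linalg.slogdet(K)
    return -0.5 * (N * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(K, y))

def elbo(Phi, y, sigma2, tau2, m, S):
    """ELBO for q(w) = N(m, S) with prior w ~ N(0, tau2 * I)."""
    N, R = Phi.shape
    resid = y - Phi @ m
    # Expected Gaussian log-likelihood under q.
    exp_loglik = (-0.5 * N * np.log(2 * np.pi * sigma2)
                  - 0.5 / sigma2 * (resid @ resid + np.trace(Phi @ S @ Phi.T)))
    # KL(q || prior) between two Gaussians.
    _, logdet_S = np.linalg.slogdet(S)
    kl = 0.5 * ((np.trace(S) + m @ m) / tau2 - R + R * np.log(tau2) - logdet_S)
    return exp_loglik - kl

def optimal_q(Phi, y, sigma2, tau2, diagonal=False):
    """ELBO-optimal Gaussian q for fixed hyperparameters (closed form)."""
    R = Phi.shape[1]
    A = Phi.T @ Phi / sigma2 + np.eye(R) / tau2   # exact posterior precision
    m = np.linalg.solve(A, Phi.T @ y / sigma2)    # posterior mean, optimal in both cases
    if diagonal:
        S = np.diag(1.0 / np.diag(A))             # best diagonal (mean-field) covariance
    else:
        S = np.linalg.inv(A)                      # full-rank: exact posterior covariance
    return m, S

# Synthetic over-parameterized problem (R > N); all values are illustrative.
rng = np.random.default_rng(0)
N, R = 10, 30
Phi = rng.normal(size=(N, R)) / np.sqrt(R)
y = Phi @ rng.normal(size=R) + 0.1 * rng.normal(size=N)
sigma2, tau2 = 0.01, 1.0

lml = log_marginal_likelihood(Phi, y, sigma2, tau2)
elbo_full = elbo(Phi, y, sigma2, tau2, *optimal_q(Phi, y, sigma2, tau2))
elbo_diag = elbo(Phi, y, sigma2, tau2, *optimal_q(Phi, y, sigma2, tau2, diagonal=True))

# Compare which likelihood variance each objective selects on a grid.
grid = np.logspace(-4, 1, 30)
best_lml = grid[np.argmax([log_marginal_likelihood(Phi, y, s2, tau2) for s2 in grid])]
best_diag = grid[np.argmax([elbo(Phi, y, s2, tau2,
                                 *optimal_q(Phi, y, s2, tau2, diagonal=True))
                            for s2 in grid])]
print(lml, elbo_full, elbo_diag, best_lml, best_diag)
```

With the full-rank optimum the ELBO coincides with the exact LML, while the diagonal optimum can only match or fall below it. The grid scan at the end is one simple way to compare the likelihood variance each objective would select; the values it selects depend on the synthetic data and are not results from the paper.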

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.