Evidence Slopes and Effective Dimension in Singular Linear Models
Summary
This research investigates the limitations of two standard approximations for Bayesian model selection, Laplace's method and BIC, in singular linear models. Both assume the effective model dimension equals the number of parameters $d$, giving a complexity penalty of $(d/2)\log n$. Singular learning theory instead puts the real log canonical threshold (RLCT) $\lambda$ in place of $d/2$, and $\lambda$ can be smaller than $d/2$ in overparameterized or low-rank models. The study focuses on linear-Gaussian rank models, where both the RLCT and the exact marginal likelihood are analytically tractable. It demonstrates that the Laplace/BIC error grows as $(d/2-\lambda)\log n$, while an RLCT-aware correction recovers the correct slope. The findings also show that the RLCT and the evidence are invariant to overcomplete reparameterizations that span the same data subspace, unlike BIC, and the study proposes reading the slope of log evidence against $\log n$ as a practical RLCT estimator.
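As a rough sketch of the comparison behind the $(d/2-\lambda)\log n$ gap, the two expansions can be set side by side to leading order in $\log n$. The notation is assumed here rather than taken from the paper: $Z_n$ is the marginal likelihood on $n$ observations and $\hat\theta$ is the maximum-likelihood estimate.

$$
\log Z_n = \log p(y \mid \hat\theta) - \lambda \log n + o(\log n)
\qquad \text{(singular learning theory)}
$$

$$
\log Z_n \approx \log p(y \mid \hat\theta) - \frac{d}{2} \log n
\qquad \text{(Laplace/BIC)}
$$

$$
\text{gap} = \left(\frac{d}{2} - \lambda\right) \log n + o(\log n)
$$

For a regular model $\lambda = d/2$ and the gap vanishes at this order, so the discrepancy is specific to singular models.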
Key takeaway
Research scientists working with overparameterized or low-rank models should be aware that Laplace and BIC approximations can overstate model complexity, over-penalizing the log evidence by $(d/2-\lambda)\log n$. Consider RLCT-aware corrections for singular models whose intrinsic rank is below the ambient parameter dimension, so that Bayesian evidence calculations and model comparisons are not biased against them.
Key insights
Singular models require RLCT for accurate Bayesian evidence, as Laplace/BIC over-penalize by $(d/2-\lambda)\log n$.
Principles
- Effective dimension is set by the RLCT, not the parameter count.
- RLCT-based penalties are representation-invariant.
- Laplace/BIC errors grow as $(d/2-\lambda)\log n$ in singular models, which is $(d-r)\log n/2$ when the intrinsic rank is $r$ and $\lambda = r/2$.
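To give a sense of scale for the last principle, here is a worked instance with illustrative numbers that are not from the paper: $d = 6$ parameters, intrinsic rank $r = 2$, $n = 10^4$ observations, and the assumption $\lambda = r/2$.

$$
\left(\frac{d}{2} - \lambda\right)\log n
= \frac{d - r}{2}\,\log n
= \frac{6 - 2}{2}\,\ln 10^{4}
\approx 2 \times 9.21
= 18.42 \text{ nats}
$$

On the evidence scale this is a factor of $e^{2\ln n} = n^2 = 10^8$, so under these assumptions a BIC-based comparison would understate the singular model's evidence by about eight orders of magnitude.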
Method
Estimate the RLCT from the slope of log evidence against $\log n$ in linear-Gaussian models, particularly rank-deficient regression and linear dictionary models.
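The slope idea can be sketched in a few lines of NumPy. This is a minimal illustration under assumptions of our own, not the paper's setup: a regression $y = X\theta + \varepsilon$ with known noise variance, an isotropic Gaussian prior on $\theta$, and a design matrix $X = ZA$ of rank $r < d$. The function names, sizes, and seed are hypothetical. The exact log evidence is computed in closed form, the maximized log-likelihood is subtracted, and the remainder is regressed on $\log n$.

```python
import numpy as np

def log_evidence(X, y, sigma2=1.0, tau2=1.0):
    """Exact log marginal likelihood for y = X @ theta + noise,
    with noise ~ N(0, sigma2 * I) and prior theta ~ N(0, tau2 * I)."""
    n, d = X.shape
    G = X.T @ X
    b = X.T @ y
    # log det(sigma2*I_n + tau2*X X^T) = n*log(sigma2) + log det(I_d + (tau2/sigma2)*G)
    _, logdet = np.linalg.slogdet(np.eye(d) + (tau2 / sigma2) * G)
    # Woodbury identity keeps the quadratic form in d x d matrices
    quad = y @ y - b @ np.linalg.solve(G + (sigma2 / tau2) * np.eye(d), b)
    return -0.5 * n * np.log(2 * np.pi * sigma2) - 0.5 * logdet - 0.5 * quad / sigma2

def max_log_likelihood(X, y, sigma2=1.0):
    """Log-likelihood at a least-squares solution (valid for rank-deficient X)."""
    n = X.shape[0]
    theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ theta_hat) ** 2)
    return -0.5 * n * np.log(2 * np.pi * sigma2) - 0.5 * rss / sigma2

def estimate_rlct(d=6, r=2, n_max=100_000, seed=0):
    """Fit the slope of (log evidence - max log-likelihood) against log n."""
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal((n_max, r))   # r informative directions
    A = rng.standard_normal((r, d))       # lifts them into d parameters
    X = Z @ A                             # n_max x d design of rank r
    theta = rng.standard_normal(d)
    y = X @ theta + rng.standard_normal(n_max)
    # Nested subsamples: the first n rows for a log-spaced grid of n
    ns = np.unique(np.logspace(2, np.log10(n_max), 12).astype(int))
    gaps = [log_evidence(X[:n], y[:n]) - max_log_likelihood(X[:n], y[:n]) for n in ns]
    slope, _ = np.polyfit(np.log(ns), gaps, 1)
    return -slope  # the penalty is -lambda * log n, so lambda is minus the slope

lambda_hat = estimate_rlct()
print(f"estimated RLCT: {lambda_hat:.3f}  (d/2 = 3, r/2 = 1)")
```

In this toy setting the estimate should land near $r/2 = 1$ rather than the $d/2 = 3$ that BIC assumes. Subtracting the maximized log-likelihood first removes the term that is linear in $n$, which would otherwise swamp the $\log n$ slope.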
In practice
- Use RLCT-aware corrections for singular models.
- Avoid BIC for overcomplete representations.
- Analyze evidence slopes to infer effective dimension.
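The caution about overcomplete representations can be illustrated with a small numerical check. The sketch below rests on assumptions that are ours, not the paper's: the same rank-$r$ data subspace $Z$ is embedded into $d \in \{2, 4, 8\}$ parameters through a map $A$ with orthonormal rows, with a unit Gaussian prior and known noise variance. All names and sizes are hypothetical. Under these assumptions the exact evidence should not change with $d$, while the BIC score drops by $(d/2)\log n$.

```python
import numpy as np

def log_evidence(X, y, sigma2=1.0, tau2=1.0):
    """Exact log marginal likelihood with prior theta ~ N(0, tau2 * I)."""
    n, d = X.shape
    G = X.T @ X
    b = X.T @ y
    _, logdet = np.linalg.slogdet(np.eye(d) + (tau2 / sigma2) * G)
    quad = y @ y - b @ np.linalg.solve(G + (sigma2 / tau2) * np.eye(d), b)
    return -0.5 * n * np.log(2 * np.pi * sigma2) - 0.5 * logdet - 0.5 * quad / sigma2

def bic_log_evidence(X, y, sigma2=1.0):
    """BIC-style approximation: max log-likelihood minus (d/2) * log n."""
    n, d = X.shape
    theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ theta_hat) ** 2)
    loglik = -0.5 * n * np.log(2 * np.pi * sigma2) - 0.5 * rss / sigma2
    return loglik - 0.5 * d * np.log(n)

rng = np.random.default_rng(1)
n, r = 2000, 2
Z = rng.standard_normal((n, r))                        # shared data subspace
y = Z @ np.array([1.0, -0.5]) + rng.standard_normal(n)

results = {}
for d in (2, 4, 8):
    # Rows of A are orthonormal (A @ A.T = I_r), so the prior induced on
    # A @ theta is the same N(0, I_r) for every d.
    Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    A = Q[:, :r].T
    X = Z @ A                                          # n x d, rank r
    results[d] = (log_evidence(X, y), bic_log_evidence(X, y))

for d, (exact, bic) in results.items():
    print(f"d={d}: exact log evidence = {exact:.4f}, BIC approximation = {bic:.4f}")
```

The orthonormal-row choice is what makes the invariance exact here; with a general $A$ the induced prior changes and the evidence would shift by an $O(1)$ amount, although the $\log n$ slope would still be governed by $r$ rather than $d$.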
Topics
- Singular Learning Theory
- Real Log Canonical Threshold
- Bayesian Model Selection
- Overparameterized Models
- Evidence Slopes
Best for: Research Scientist, AI Researcher, AI Scientist, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.