A Hitchhiker's Guide to Poisson Gradient Estimation

· Source: stat.ML updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Computational Neuroscience · Depth: Expert, extended

Summary

This paper introduces a modified Exponential Arrival Time (Eat) method, termed Eat$_{\textsf{cubic}}$, for differentiating through discrete Poisson-distributed latent variables in models such as Variational Autoencoders (VAEs) and Partially Observable Generalized Linear Models (POGLMs). The original Eat method, Eat$_{\textsf{sigmoid}}$, and the Gumbel-SoftMax (Gsm) relaxation both struggle with hyperparameter sensitivity and distributional fidelity. Eat$_{\textsf{cubic}}$ replaces the sigmoid approximation with a cubic Hermite interpolant. Because the interpolant has compact support, the relaxation is theoretically guaranteed to have an unbiased first moment, and its second-moment bias is reduced. Empirical evaluations cover distributional fidelity (Wasserstein-1 distance), gradient quality (BiasEnergy, NoiseEnergy, Cosine Similarity), and performance on $\mathcal{P}$-VAE and POGLM tasks. Across these, Eat$_{\textsf{cubic}}$ consistently outperforms Eat$_{\textsf{sigmoid}}$ and Gsm, is more robust to the choice of temperature hyperparameter, and often matches exact gradients.

Key takeaway

Research scientists developing or applying Poisson latent variable models should consider adopting the Eat$_{\textsf{cubic}}$ relaxation. Its distributional fidelity and robustness to the temperature hyperparameter reduce the need for costly grid searches and make training more stable and reliable, which in turn can improve model performance and generalization in NeuroAI applications.

Key insights

Eat$_{\textsf{cubic}}$ offers a robust method for Poisson gradient estimation, with an unbiased first moment, by using a cubic smoothstep in place of the sigmoid.

Principles

Method

The Eat$_{\textsf{cubic}}$ method replaces the sigmoid function in the Exponential Arrival Time (Eat) relaxation with a cubic Hermite interpolant (smoothstep). The interpolant has compact support, which reduces bias in moment estimation and improves robustness to the temperature.
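
The effect of compact support on the first moment can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the window centering at t = 1, the half-width parameter `tau`, and the function names `sigmoid_window`, `cubic_window`, `relaxed_counts`, and `window_area` are all assumptions made for illustration. The sketch relies on a standard fact about Poisson processes: for arrival times t_k of a rate-λ process, the expected value of Σ w(t_k) equals λ times the integral of w over t ≥ 0. A soft window whose area is exactly 1 therefore gives a relaxed count with mean exactly λ.

```python
import numpy as np

def sigmoid_window(t, tau):
    """Sigmoid soft indicator of t <= 1 (infinite support).
    Written with tanh for numerical stability: sigmoid(x) = 0.5 * (1 + tanh(x / 2))."""
    return 0.5 * (1.0 + np.tanh((1.0 - t) / (2.0 * tau)))

def cubic_window(t, tau):
    """Cubic smoothstep soft indicator of t <= 1 (assumed parametrization).
    Exactly 1 for t <= 1 - tau and exactly 0 for t >= 1 + tau."""
    u = np.clip(0.5 + (1.0 - t) / (2.0 * tau), 0.0, 1.0)
    return u * u * (3.0 - 2.0 * u)

def relaxed_counts(rate, tau, window, n_samples=20000, max_events=64, seed=0):
    """Relaxed Poisson counts: sum a soft window over exponential arrival times."""
    rng = np.random.default_rng(seed)
    gaps = rng.exponential(1.0 / rate, size=(n_samples, max_events))
    arrivals = np.cumsum(gaps, axis=1)  # arrival times of a rate-`rate` process
    return window(arrivals, tau).sum(axis=1)

def window_area(window, tau, upper=20.0, n=200001):
    """Trapezoid-rule integral of the window over [0, upper]."""
    t = np.linspace(0.0, upper, n)
    w = window(t, tau)
    dt = t[1] - t[0]
    return float(np.sum(0.5 * (w[:-1] + w[1:])) * dt)

if __name__ == "__main__":
    rate, tau = 3.0, 0.5
    for name, window in [("sigmoid", sigmoid_window), ("cubic", cubic_window)]:
        area = window_area(window, tau)
        mean = relaxed_counts(rate, tau, window).mean()
        print(f"{name}: window area = {area:.4f}, relaxed mean = {mean:.3f} (target {rate})")
```

Under these assumptions the cubic window integrates to 1 whenever `tau` is at most 1, so the relaxed mean matches the Poisson rate at any such temperature. The sigmoid window's area is `tau * log(1 + exp(1 / tau))`, which exceeds 1 and grows with `tau`, so its relaxed mean overshoots the rate.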

Best for: Research Scientist, AI Researcher, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.