Jensen's Inequality - Why the Average of a Curve is Not the Curve of the Average

Source: DataMListic · Field: Science & Research — Mathematics & Computational Sciences, Artificial Intelligence & Machine Learning · Depth: Intermediate, short

Summary

Convexity, the property of a curve bending upward like a bowl, is introduced through the "chord test": a function is convex if the chord drawn between any two points on its graph lies on or above the curve. Examples include $x^2$, $a^x$ (for $a > 0$), and $|x|$. Concave functions, such as the logarithm, behave the opposite way: the chord lies on or below the curve. Jensen's inequality generalizes the chord test from two points to any number of points weighted by probabilities. It states that for a convex function $f$, $f(E[x]) \le E[f(x)]$: the function of the expectation is at most the expectation of the function. The inequality flips for concave functions. A concrete example shows that for $f(x) = x^2$, the Jensen gap $E[f(x)] - f(E[x])$ equals $E[x^2] - (E[x])^2$, which is exactly the variance of $x$, so for this function the gap measures the spread of $x$.
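The variance identity and the direction of the inequality can be checked numerically. The following is a minimal sketch, not code from the original article: the helper name `jensen_gap`, the sample sizes, and the choice of a Gaussian and a uniform distribution are all assumptions made for illustration. Expectations are taken under the empirical distribution of the samples, so the identity holds up to floating-point rounding.

```python
import math
import random
import statistics


def jensen_gap(f, samples):
    """Return E[f(X)] - f(E[X]) under the empirical distribution of samples."""
    mean_x = statistics.fmean(samples)
    mean_fx = statistics.fmean(f(s) for s in samples)
    return mean_fx - f(mean_x)


random.seed(0)  # fixed seed so repeated runs draw the same samples

# Convex case: f(x) = x^2, samples from an arbitrary (assumed) Gaussian.
xs = [random.gauss(2.0, 3.0) for _ in range(10_000)]
gap_square = jensen_gap(lambda v: v * v, xs)
variance = statistics.pvariance(xs)  # population variance of the same samples

# Concave case: f(x) = log(x), samples kept strictly positive.
ys = [random.uniform(0.5, 5.0) for _ in range(10_000)]
gap_log = jensen_gap(math.log, ys)

print("gap for x^2:", gap_square)
print("variance   :", variance)
print("gap for log:", gap_log)
```

The first two printed values should agree up to rounding, and the gap for the logarithm should be non-positive, reflecting the flipped inequality for concave functions.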

Key takeaway

For data scientists and machine learning engineers working with probabilistic models, Jensen's inequality is a basic tool for deriving bounds, such as the non-negativity of the KL divergence or the validity of the ELBO in variational inference. Recognizing whether a function is convex or concave tells you which way the inequality points when you swap a function and an expectation.
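The non-negativity of the KL divergence follows from Jensen's inequality in a single step. This is a sketch under stated assumptions: $p$ and $q$ are discrete distributions (symbols introduced here, not taken from the original summary), and the sums run over the points where $p(x) > 0$. Because the logarithm is concave, the expectation of the log is at most the log of the expectation:

$$
-D_{\mathrm{KL}}(p \,\|\, q) = \sum_{x} p(x) \log \frac{q(x)}{p(x)} \le \log \sum_{x} p(x) \frac{q(x)}{p(x)} = \log \sum_{x} q(x) \le \log 1 = 0
$$

Multiplying through by $-1$ gives $D_{\mathrm{KL}}(p \,\|\, q) \ge 0$.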

Key insights

Convexity and Jensen's inequality show how a function's curvature determines the relationship between the function of the expectation and the expectation of the function.

Best for: AI Scientist, Machine Learning Engineer, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by DataMListic.