The Geometry Hidden in Every Model

Source: Agus’s Substack · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Intermediate, long

Summary

The first post in The Learned Kernel series argues that every supervised model's prediction is a weighted average of training labels, where the weights define a "geometry" representing similarity. This concept is demonstrated using ridge regression, regression trees, and k-nearest neighbors on California housing data, which yield different predictions ($306k, $370k, and $321k) for a single block. Each model employs a distinct weighting function: ridge uses a global, signed elliptical influence; trees use local, axis-aligned boxes; and k-NN uses local, isotropic balls. The article posits that learning is the discovery of this geometry, which it terms the "kernel," and emphasizes that this kernel should be learned from data rather than pre-chosen.
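
For reference, the three weighting schemes named in the summary have standard textbook forms. They are written here under assumptions the summary does not confirm: ridge without an intercept and with penalty strength λ, a tree that predicts the mean of its leaf, and an unweighted k-NN average. The symbols X (training feature matrix), λ, L(x) (the leaf containing x), and N_k(x) (the k nearest training points to x) are notation introduced for this sketch, not necessarily the article's.

```latex
\begin{align*}
\text{ridge:} \quad & w_i(x) = x^\top \left( X^\top X + \lambda I \right)^{-1} x_i \\
\text{tree:}  \quad & w_i(x) = \frac{\mathbf{1}\left[ x_i \in L(x) \right]}{\lvert L(x) \rvert} \\
\text{k-NN:}  \quad & w_i(x) = \frac{\mathbf{1}\left[ x_i \in N_k(x) \right]}{k}
\end{align*}
```

Read this way, ridge weights are generally nonzero for every training point and can be negative (global and signed), while tree and k-NN weights are nonnegative, sum to one, and vanish outside a box or a ball around the query.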

Key takeaway

Machine learning engineers evaluating model choices should understand that different algorithms imply distinct "geometries" of similarity, not just competing predictions. Analyze a model's underlying weight function, or "kernel," to see how it defines relevance and how much influence it gives each training case. This view helps you move beyond arbitrary parameter tuning (like k in k-NN) towards learning the geometry directly from your data, which the article argues leads to more trustworthy and accurate models.

Key insights

Learning in supervised models is the discovery of a geometric kernel that defines similarity and weights training labels.

Method

Supervised models compute predictions as weighted sums of training labels, $\hat{y}(x) = \sum_{i=1}^{n} w_i(x)\, y_i$, where $w_i(x)$ is the influence of training case $i$ on query $x$; this weight function effectively defines the model's geometric kernel.
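
The weighted-sum form can be checked numerically. The following is a minimal sketch, not the article's code: it uses a made-up five-point training set rather than the California housing data, ridge without an intercept, and a one-split "stump" whose threshold is hard-coded instead of learned. All function names (`ridge_weights`, `knn_weights`, `stump_weights`, `predict`) are hypothetical.

```python
import math

# Toy training set (made-up numbers, not the California housing data).
X = [(-2.0, -1.0), (-1.0, 1.0), (0.0, 0.0), (1.0, -1.0), (2.0, 1.0)]
y = [1.0, 2.0, 2.5, 3.0, 5.0]


def ridge_weights(x, X, lam=1.0):
    """w_i(x) = x^T (X^T X + lam*I)^{-1} x_i for 2-D features, no intercept."""
    a = sum(p[0] * p[0] for p in X) + lam
    b = sum(p[0] * p[1] for p in X)
    d = sum(p[1] * p[1] for p in X) + lam
    det = a * d - b * b
    # u = (X^T X + lam*I)^{-1} x, using the closed-form 2x2 inverse.
    u0 = (d * x[0] - b * x[1]) / det
    u1 = (-b * x[0] + a * x[1]) / det
    return [u0 * p[0] + u1 * p[1] for p in X]


def knn_weights(x, X, k=2):
    """1/k on the k nearest training points (Euclidean ball), 0 elsewhere."""
    order = sorted(range(len(X)), key=lambda i: math.dist(x, X[i]))
    nearest = set(order[:k])
    return [1.0 / k if i in nearest else 0.0 for i in range(len(X))]


def stump_weights(x, X, feature=0, threshold=0.5):
    """Uniform weight on training points that share a leaf with x.

    The single split (feature, threshold) is hard-coded for illustration;
    a real regression tree would learn its splits from the data.
    """
    side = x[feature] <= threshold
    leaf = [i for i, p in enumerate(X) if (p[feature] <= threshold) == side]
    return [1.0 / len(leaf) if i in leaf else 0.0 for i in range(len(X))]


def predict(weights, y):
    """The shared prediction rule: a weighted sum of training labels."""
    return sum(w * yi for w, yi in zip(weights, y))


query = (1.2, 0.5)
for name, fn in [("ridge", ridge_weights), ("k-NN", knn_weights), ("stump", stump_weights)]:
    w = fn(query, X)
    print(f"{name:>5}: weights={[round(v, 3) for v in w]} prediction={predict(w, y):.3f}")
```

Printing the weight vectors side by side shows the contrast the summary describes: the ridge weights touch every training point and some are negative, while the k-NN and stump weights are zero except on a small local set.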

Best for: AI Scientist, Machine Learning Engineer, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Agus’s Substack.