Loss-Shift Transfer via Bayes Quotients

2026-06-11 · Source: Machine Learning · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

A new research paper identifies "loss shift," a failure mode orthogonal to traditional distribution shift: the data distribution stays fixed but the loss function changes. Because a loss determines which information in X is Bayes-relevant, different representations may be required even under the same joint law P(X,Y). The paper formalizes this with Bayes quotients, which allow losses to be ordered by refinement. A key finding is that a source-minimal representation for a coarser loss is insufficient for a strictly finer target loss. For finite-output log loss, the obstruction can be quantified: the excess risk is the conditional information about Y discarded by the representation. Experiments across controlled, learned, synthetic-image, and real-image settings confirm the predicted effect, showing that classification-equivalent representations can differ in optimal log-loss performance under a fixed data distribution.

Key takeaway

For AI Scientists optimizing models: if you change your loss function while the data distribution remains fixed, your current representations may become insufficient. This "loss shift" calls for re-evaluating or re-learning representations to avoid suboptimal performance, even when the representations are classification-equivalent. Account for the Bayes-relevant information dictated by the new loss, because a source-minimal representation for a coarser loss will not suffice for a strictly finer target loss.

Key insights

Loss shift is a failure mode distinct from distribution shift: the loss function changes while the data distribution stays fixed, and the new loss may require a different data representation.

Principles

Loss functions dictate Bayes-relevant information in X.
Source-minimal representations for a coarser loss are insufficient for strictly finer target losses.
For finite-output log loss, excess risk equals the conditional information about Y that the representation discards.
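
The log-loss principle can be written out as a short derivation. This is a sketch using standard information-theoretic identities, not notation taken from the paper: the representation is assumed to be a deterministic map Z = φ(X), and R*_log denotes the Bayes-optimal log-loss risk (both symbols are assumptions introduced here).

```latex
\begin{align*}
R^{*}_{\log}(X) &= H(Y \mid X), \qquad R^{*}_{\log}(Z) = H(Y \mid Z), \qquad Z = \varphi(X), \\
R^{*}_{\log}(Z) - R^{*}_{\log}(X)
  &= H(Y \mid Z) - H(Y \mid X, Z) \\
  &= I(Y; X \mid Z) \;\ge\; 0 .
\end{align*}
```

The second line uses H(Y | X) = H(Y | X, Z), which holds because Z is a function of X. The excess risk is zero only when X carries no further information about Y once Z is known.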

Method

The paper formalizes loss shift using Bayes quotients to order losses by refinement, identifying when source-minimal representations become insufficient for finer target losses.
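
One way to picture ordering losses by refinement is as a comparison of partitions of the input space. The sketch below is an illustrative reading, not the paper's definition: it assumes that a loss groups together inputs sharing the same Bayes-optimal action, and it uses hypothetical posteriors and helper names (`partition_by`, `refines`) introduced only for this example.

```python
from collections import defaultdict

# Hypothetical posteriors P(Y=1 | X=x) for four inputs (illustrative only).
posterior = {"a": 0.1, "b": 0.4, "c": 0.6, "d": 0.9}


def partition_by(action):
    """Group inputs that share the same Bayes-optimal action (assumed reading)."""
    cells = defaultdict(set)
    for x, p in posterior.items():
        cells[action(p)].add(x)
    return {frozenset(cell) for cell in cells.values()}


def refines(fine, coarse):
    """True if every cell of `fine` lies inside some cell of `coarse`."""
    return all(any(f <= c for c in coarse) for f in fine)


# 0-1 loss: the Bayes-optimal action is the most likely label.
zero_one_cells = partition_by(lambda p: int(p > 0.5))

# Log loss: the Bayes-optimal action is the posterior itself.
log_loss_cells = partition_by(lambda p: p)

log_refines_zero_one = refines(log_loss_cells, zero_one_cells)
zero_one_refines_log = refines(zero_one_cells, log_loss_cells)

print("log loss refines 0-1 loss:", log_refines_zero_one)
print("0-1 loss refines log loss:", zero_one_refines_log)
```

In this toy setup the log-loss partition splits every cell of the 0-1 partition, so a representation that only remembers the 0-1 cell cannot recover the finer log-loss cell.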

In practice

Expect classification-equivalent representations to differ in optimal log-loss performance.
Check whether a representation still suffices when the loss function changes, even if the data distribution is fixed.
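
A small numerical check can make these points concrete. The sketch below uses a hypothetical toy joint law (four equally likely inputs, binary label) that does not come from the paper. The representation keeps only the Bayes-optimal class label, so it loses nothing under 0-1 loss, yet it pays a positive excess risk under log loss that matches the conditional mutual information I(X;Y|Z).

```python
import numpy as np

# Hypothetical toy joint law: X uniform on 4 values, binary Y (illustrative only).
p_x = np.full(4, 0.25)
p_y1_given_x = np.array([0.1, 0.4, 0.6, 0.9])

# Representation Z = phi(X): keep only the Bayes-optimal class label.
z_of_x = (p_y1_given_x > 0.5).astype(int)


def binary_entropy(p):
    """Entropy in nats of a Bernoulli(p) variable."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))


# Marginal of Z and posterior P(Y=1 | Z=z) under the same joint law.
p_z = np.array([p_x[z_of_x == z].sum() for z in (0, 1)])
p_y1_given_z = np.array(
    [(p_x * p_y1_given_x)[z_of_x == z].sum() / p_z[z] for z in (0, 1)]
)

# Bayes risk under 0-1 loss, from X and from Z.
zero_one_from_x = float(np.sum(p_x * np.minimum(p_y1_given_x, 1 - p_y1_given_x)))
zero_one_from_z = float(np.sum(p_z * np.minimum(p_y1_given_z, 1 - p_y1_given_z)))

# Bayes risk under log loss is the conditional entropy H(Y|X) or H(Y|Z).
log_from_x = float(np.sum(p_x * binary_entropy(p_y1_given_x)))
log_from_z = float(np.sum(p_z * binary_entropy(p_y1_given_z)))
excess_log_risk = log_from_z - log_from_x

# Conditional mutual information I(X;Y|Z) = E_x KL(p(y|x) || p(y|z(x))).
p = p_y1_given_x
q = p_y1_given_z[z_of_x]
kl = p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))
cond_mi = float(np.sum(p_x * kl))

print("0-1 Bayes risk from X, Z:", zero_one_from_x, zero_one_from_z)
print("log-loss Bayes risk from X, Z:", log_from_x, log_from_z)
print("excess log-loss risk:", excess_log_risk, " I(X;Y|Z):", cond_mi)
```

The data distribution never changes in this example; only the loss used to score the representation does.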

Topics

Loss Shift
Bayes Quotients
Transfer Learning
Representation Learning
Log Loss
Machine Learning Theory

Best for: Research Scientist, AI Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.