Contribution Weights: A Geometrical Analysis of Self-Attention Transformers

Source: Machine Learning · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

"Contribution Weights" is a novel projection-based metric introduced for interpreting information flow within Self-Attention Transformers in Large Language Models (LLMs). This metric quantifies a token's influence by considering its attention weight, value magnitude, and directional alignment with the layer output, thereby addressing the limitations of traditional attention weights that overlook the geometric properties of value vectors. The research demonstrates that Contribution Weights provide a more faithful measure of token importance, consistently outperforming attention-based metrics in identifying semantically critical tokens across diverse decoder-only models, tasks, and datasets. Furthermore, this metric facilitates a new mechanistic analysis of "attention sinks," revealing they actively suppress information via a convex relationship between sink rate and output norm, stabilizing representations by counteracting the semantic drift of low-confidence tokens.

Key takeaway

For Machine Learning Engineers focused on LLM interpretability and debugging, traditional attention weights offer an incomplete view of token importance because they ignore value vector geometry. Consider adopting "Contribution Weights", which account for that geometry, to get a more faithful picture of information flow. This lets you identify semantically critical tokens more accurately and analyze the function of attention sinks mechanistically, which may in turn inform more stable and robust LLM designs.

Key insights

Contribution Weights offer a geometrically informed, more faithful measure of token importance and reveal an active role for attention sinks in LLMs.

Principles

Method

The paper introduces Contribution Weights, a projection-based metric, to quantify token influence by combining attention weight, value magnitude, and directional alignment with the layer output.
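
The combination described above can be illustrated with a minimal sketch. It assumes one plausible projection-based formulation, which may differ from the paper's exact definition: for a single query position and head, with attention weights a_i, value vectors v_i, and output o = sum_i a_i v_i, the contribution of token i is taken as a_i (v_i · o) / ||o||^2, which equals a_i ||v_i|| cos(theta_i) / ||o||. The function name `contribution_weights` and the toy inputs are hypothetical.

```python
import numpy as np

def contribution_weights(attn, values):
    """Assumed formulation, not necessarily the paper's exact definition.

    attn:   (n,) attention weights for one query position and one head.
    values: (n, d) value vectors for the n attended tokens.
    Returns (n,) signed contributions that sum to 1 when the output is nonzero.
    """
    attn = np.asarray(attn, dtype=float)
    values = np.asarray(values, dtype=float)
    output = attn @ values                    # layer output o = sum_i a_i v_i
    sq_norm = float(output @ output)          # ||o||^2
    if sq_norm == 0.0:
        return np.zeros_like(attn)            # no output direction to project onto
    # Project each weighted value vector a_i v_i onto o, normalized by ||o||^2.
    return attn * (values @ output) / sq_norm

# Toy example: token 0 has the highest attention but a tiny value vector.
attn = np.array([0.7, 0.2, 0.1])
values = np.array([[0.1, 0.0],
                   [2.0, 0.0],
                   [0.0, 1.0]])
cw = contribution_weights(attn, values)
print(cw, cw.sum())
```

In this toy case the token with the highest attention weight (index 0) is not the token with the largest contribution (index 1), because its value vector is small. A negative entry would indicate a value vector pointing against the output direction.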

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.