Trajectory Geometry of Transformer Representations Across Layers

2026-06-08 · Source: Machine Learning · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

A new study introduces "trajectory geometry" as a probe-free lens for mechanistic interpretability, analyzing how Transformer representations evolve across layers. By recasting the forward pass as a discrete population trajectory, the researchers apply five geometric metrics: length, curvature, semantic convergence, layerwise cosine similarity, and representational stability. Across GPT-2, TinyLlama, and Qwen2.5 models with five prompt families, four key findings emerged. First, semantically related prompts converged significantly in middle-to-late layers (peak semantic convergence index, CI, of 0.41-0.58; p<0.001). Second, reasoning tasks produced trajectories with greater curvature (0.71-0.83 rad) than lexical variations (0.27-0.31 rad). Third, ambiguous tokens exhibited trajectory bifurcation, with up to 5.6x representational separation by the final layer. Finally, layerwise cosine similarity revealed a universal three-phase structure: encoding, elaboration, and output preparation. The authors release a fully open-source, model-agnostic pipeline.

Key takeaway

For AI scientists working on mechanistic interpretability, this research offers a probe-free way to study how Transformers process information internally. Consider adding trajectory geometry analysis to your model development and debugging workflows, especially when diagnosing representational ambiguity or comparing how tasks of differing complexity (for example, reasoning versus lexical variation) are processed. The method lets you observe layer-by-layer processing directly and may reveal structure that traditional feature probing misses.

Key insights

Trajectory geometry offers a principled, probe-free lens to understand Transformer representation evolution across layers.

Principles

Transformer representations evolve as discrete population trajectories.
Trajectory geometry metrics reveal internal computational dynamics.
Semantic convergence and bifurcation indicate processing stages.
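
To make the convergence and bifurcation ideas concrete, here is a minimal NumPy sketch. The summary does not give the study's formulas, so the definitions below are assumptions chosen for illustration: `convergence_by_layer` is a hypothetical score based on how much the mean pairwise distance among prompts shrinks relative to the first layer, and `separation_ratio` is a hypothetical ratio of final-layer to first-layer distance between two trajectories. Neither should be read as the paper's semantic convergence index or its separation measure.

```python
import numpy as np


def mean_pairwise_distance(x):
    """Mean Euclidean distance over all distinct pairs of rows in x."""
    diffs = x[:, None, :] - x[None, :, :]
    dist = np.linalg.norm(diffs, axis=-1)
    upper = np.triu_indices(len(x), k=1)  # each unordered pair once
    return float(dist[upper].mean())


def convergence_by_layer(h):
    """Hypothetical convergence score per layer (not the paper's definition).

    h: array of shape (num_prompts, num_layers + 1, dim), one vector per
       prompt per layer (for example, a pooled hidden state).
    Returns 1 - spread_l / spread_0: 0 means the prompts are as spread out
    as at the first layer, values near 1 mean they have nearly merged.
    Assumes the first-layer spread is non-zero.
    """
    spread = np.array(
        [mean_pairwise_distance(h[:, layer, :]) for layer in range(h.shape[1])]
    )
    return 1.0 - spread / spread[0]


def separation_ratio(h_a, h_b):
    """Hypothetical bifurcation measure for two trajectories.

    h_a, h_b: arrays of shape (num_layers + 1, dim).
    Returns the final-layer distance divided by the first-layer distance.
    Assumes the two trajectories do not start at the same point.
    """
    first = np.linalg.norm(h_a[0] - h_b[0])
    last = np.linalg.norm(h_a[-1] - h_b[-1])
    return float(last / first)
```

Under these assumed definitions, a convergence score that rises in the middle-to-late layers would correspond to the convergence pattern described in the summary, and a separation ratio well above 1 would correspond to bifurcation.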

Method

The Transformer forward pass is recast as a discrete population trajectory, and five geometric metrics are computed directly in ambient space (the model's hidden-state space, with no learned probe or projection): length, curvature, semantic convergence index, layerwise cosine similarity, and representational stability.
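
As a rough illustration of three of these metrics, the sketch below treats a single trajectory as an array with one hidden-state vector per layer. The formulas are common geometric choices assumed here, not taken from the study: length as the sum of Euclidean step sizes, curvature as the mean turning angle between consecutive steps, and layerwise cosine similarity between adjacent layers. Reducing a population of token or prompt states to one vector per layer (for example, by mean pooling) is also an assumption.

```python
import numpy as np


def trajectory_length(h):
    """Sum of Euclidean step sizes between consecutive layers.

    h: array of shape (num_layers + 1, dim), one hidden state per layer.
    """
    steps = np.diff(h, axis=0)
    return float(np.linalg.norm(steps, axis=1).sum())


def mean_curvature(h, eps=1e-12):
    """Mean turning angle in radians between consecutive layer-to-layer steps.

    Needs at least three layers; eps guards against zero-length steps.
    """
    steps = np.diff(h, axis=0)
    a, b = steps[:-1], steps[1:]
    norms = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    cos = (a * b).sum(axis=1) / (norms + eps)
    return float(np.arccos(np.clip(cos, -1.0, 1.0)).mean())


def layerwise_cosine(h, eps=1e-12):
    """Cosine similarity between each layer's state and the next layer's."""
    a, b = h[:-1], h[1:]
    norms = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return (a * b).sum(axis=1) / (norms + eps)


if __name__ == "__main__":
    # Toy trajectory: one step along x, then one step along y (a right-angle turn).
    toy = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
    print(trajectory_length(toy), mean_curvature(toy), layerwise_cosine(toy))
```

With real models, `h` could be built from per-layer hidden states, but how the study pools or aligns those states is not described in this summary.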

In practice

Use trajectory geometry for mechanistic interpretability.
Apply the open-source pipeline to analyze models.
Investigate specific layer dynamics for task performance.

Topics

Transformer Representations
Mechanistic Interpretability
Trajectory Geometry
Neural Network Dynamics
Large Language Models
Model Analysis Pipeline

Best for: Research Scientist, AI Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.