H-Probes: Extracting Hierarchical Structures From Latent Representations of Language Models
Summary
Researchers developed H-probes, a set of linear probes designed to extract hierarchical structure, specifically tree depth and pairwise distance, from the latent representations of large language models (LLMs). In experiments on synthetic binary tree traversal tasks, H-probes robustly identified low-dimensional subspaces that contain hierarchical structure. These subspaces are causally important for high task performance, generalize both within-domain and out-of-domain (to deeper trees), and remain stable across training sets and model scales (Qwen reasoning models at 1.5B, 7B, and 14B parameters). Analogous, though weaker, hierarchical structure also appeared in real-world settings such as mathematical reasoning traces (GSM8K) and HiBench tasks. The findings suggest that LLMs represent hierarchy not only at the syntactic and conceptual levels but also at deeper levels of abstraction, including the reasoning process itself.
Key takeaway
For research scientists working on LLM interpretability, this work shows that hierarchical structure is geometrically encoded in low-dimensional latent subspaces and is causally linked to task performance. Consider using a probing framework such as H-probes to identify and analyze these structures, especially in models performing complex, multi-step tasks. Doing so can give insight into a model's internal computation, which may in turn inform alignment and control.
Key insights
LLMs geometrically represent hierarchical structures like tree depth and pairwise distance in low-dimensional latent subspaces.
Principles
- Hierarchical representations are causally important for task success.
- Identified hierarchical structures generalize across domains and model scales.
- Pairwise distance is more robustly represented than absolute depth.
Method
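H-probes first reduce the latent space with PCA, then train two linear probes on the reduced representations: one for tree distance (Euclidean distance in a projected subspace) and one for tree depth (ridge regression onto a linear direction). Causal ablation experiments then test whether the identified subspaces matter for task performance.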
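As a rough illustration of the depth probe only, the sketch below runs PCA and then ridge regression on synthetic data. It is not the authors' implementation: the helper names (`pca_fit`, `ridge_fit`), the dimensions, the regularization strength `alpha`, and the synthetic stand-in `H` for hidden states are all assumptions made for the example, and the distance probe is not covered.

```python
import numpy as np

def pca_fit(X, k):
    """Return the mean and the top-k principal directions (k x d) of X."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def ridge_fit(Z, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    z_mean, y_mean = Z.mean(axis=0), y.mean()
    Zc, yc = Z - z_mean, y - y_mean
    w = np.linalg.solve(Zc.T @ Zc + alpha * np.eye(Z.shape[1]), Zc.T @ yc)
    b = y_mean - z_mean @ w
    return w, b

# Hypothetical stand-in for hidden states: depth is encoded along one
# direction plus isotropic noise. Real model activations would replace H.
rng = np.random.default_rng(0)
n, d, k = 500, 64, 8
depth = rng.integers(0, 6, size=n).astype(float)
direction = rng.normal(size=d)
direction /= np.linalg.norm(direction)
H = 3.0 * np.outer(depth, direction) + rng.normal(scale=0.5, size=(n, d))

# Fit PCA and the probe on a training split, evaluate on held-out rows.
train, test = slice(0, 400), slice(400, n)
mean, components = pca_fit(H[train], k)
Z_train = (H[train] - mean) @ components.T
Z_test = (H[test] - mean) @ components.T

w, b = ridge_fit(Z_train, depth[train], alpha=1.0)
pred = Z_test @ w + b
residual = np.sum((depth[test] - pred) ** 2)
total = np.sum((depth[test] - depth[test].mean()) ** 2)
r2 = 1.0 - residual / total
print(f"held-out R^2 for the depth probe: {r2:.3f}")
```

The probe direction in the original latent space can be recovered as `components.T @ w`, which is convenient when the same direction is later ablated or compared across layers.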
In practice
- Use H-probes to analyze hierarchical reasoning in LLMs.
- Ablate hierarchical subspaces to test causal importance.
- Focus on middle-to-late layers, where hierarchical signals are strongest.
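One common way to ablate a subspace is to project activations onto its orthogonal complement. The summary does not say exactly how the authors ablate, so the sketch below is an assumption: `ablate_subspace` is a hypothetical helper, and `basis` stands in for the directions found by a probe.

```python
import numpy as np

def ablate_subspace(H, basis):
    """Remove from each row of H its component in the span of the rows of `basis`."""
    # Orthonormalize first so the projection is valid even when the
    # basis rows are not orthonormal.
    Q, _ = np.linalg.qr(basis.T)  # Q has shape (d, k), orthonormal columns
    return H - (H @ Q) @ Q.T

rng = np.random.default_rng(0)
H = rng.normal(size=(10, 16))      # hypothetical activations (n x d)
basis = rng.normal(size=(3, 16))   # hypothetical probe directions (k x d)

H_ablated = ablate_subspace(H, basis)

# Nothing should remain along the probed directions after ablation.
leftover = np.abs(H_ablated @ basis.T).max()
print(f"max residual along probed directions: {leftover:.2e}")
```

In a real experiment the ablated activations would be written back into the model's forward pass and task accuracy compared against an unablated run. Ablating a random subspace of the same dimension is a useful control, although whether the original work uses that control is not stated here.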
Topics
- H-probes
- Hierarchical Reasoning
- Latent Representations
- Language Models
- Subspace Ablation
Best for: Research Scientist, AI Scientist, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.