Emergent Ordinal Geometry in Transformers Trained on Local Comparisons

Source: Artificial Intelligence · Field: Technology & Digital, Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

Transformers trained only on adjacent comparisons from a hidden total order can acquire transitive inference, a capability long documented in humans and other animals. When small models were evaluated on distant pairs never seen during training, they generalized out of distribution. This emergence coincided with a geometric reorganization in which entity embeddings converged onto a one-dimensional manifold whose principal axis recovered the hidden rank order with near-perfect fidelity. The learned structure was also sensitive to optimization, producing grokking-like transient dynamics. Even at peak accuracy, both decision confidence and geometric separation increased monotonically with rank distance, replicating the symbolic distance effect documented in biological cognition for over 50 years. The work offers a mechanistic account of transitive inference that connects cognitive science with modern neural networks.
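One way to check the geometric claim in the summary is to project the entity embeddings onto their first principal axis and compare the resulting order with the hidden ranks. The sketch below is an illustration only, not the authors' analysis code: the function name `principal_axis_probe`, the synthetic embeddings, the noise level, and the choice of Spearman correlation as the fidelity measure are all assumptions.

```python
import numpy as np

def principal_axis_probe(embeddings, true_rank):
    """Project embeddings onto their first principal axis and compare with rank.

    Returns (explained, rho): the share of variance on the first axis and the
    absolute Spearman correlation between projections and true ranks.
    """
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are principal directions; s holds the singular values.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    projection = centered @ vt[0]
    explained = float(s[0] ** 2 / np.sum(s ** 2))
    # Spearman correlation is Pearson correlation on ranks (no ties assumed).
    proj_ranks = np.argsort(np.argsort(projection))
    true_ranks = np.argsort(np.argsort(true_rank))
    rho = float(np.corrcoef(proj_ranks, true_ranks)[0, 1])
    # The sign of a principal axis is arbitrary, so report |rho|.
    return explained, abs(rho)

# Synthetic stand-in for trained embeddings (NOT data from the paper):
# items placed along one random direction, plus small isotropic noise.
rng = np.random.default_rng(0)
n_items, dim = 9, 16
direction = rng.normal(size=dim)
direction /= np.linalg.norm(direction)
embeddings = np.outer(np.arange(n_items), direction)
embeddings += 0.05 * rng.normal(size=(n_items, dim))

explained, rho = principal_axis_probe(embeddings, np.arange(n_items))
print(f"variance on first axis: {explained:.3f}, |Spearman rho|: {rho:.3f}")
```

With real models, `embeddings` would be the learned entity embedding matrix. A high variance share on the first axis together with a high rank correlation would be consistent with the one-dimensional manifold described above.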

Key takeaway

For AI scientists studying cognitive capabilities in neural networks, this research suggests that complex inference such as transitive reasoning can emerge from simple local training. Consider designing experiments that probe the geometry of learned embeddings, since that geometry can reveal mechanisms that mirror human cognition. Understanding how abstract concepts are represented offers a path toward more robust and interpretable AI systems.

Key insights

Transformers trained on local comparisons develop a geometric representation mirroring human transitive inference and the symbolic distance effect.
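The symbolic distance effect mentioned here can be illustrated by grouping item pairs by rank distance and checking that their mean separation along the ordinal axis grows with that distance. This is a minimal sketch under stated assumptions: the helper `separation_by_rank_distance` and the hand-written coordinates are hypothetical, and the same grouping could be applied to model confidence instead of geometric separation.

```python
import numpy as np

def separation_by_rank_distance(positions):
    """Mean |p_i - p_j| for each rank distance k = |i - j|.

    positions[i] is the 1-D coordinate of the item with rank i.
    """
    n = len(positions)
    result = {}
    for k in range(1, n):
        # All pairs exactly k ranks apart: (0, k), (1, k + 1), ...
        result[k] = float(np.mean(np.abs(positions[k:] - positions[:-k])))
    return result

# Hypothetical coordinates for 7 items with uneven spacing (illustrative only).
positions = np.array([0.0, 0.9, 2.1, 3.0, 4.2, 4.9, 6.0])
sep = separation_by_rank_distance(positions)

# The effect predicts that separation increases strictly with rank distance.
is_monotonic = all(sep[k] < sep[k + 1] for k in range(1, len(positions) - 1))
print(sep, is_monotonic)
```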

Method

Small Transformers were trained on adjacent comparisons drawn from a hidden total order. They were then evaluated on unseen distant pairs, while the researchers tracked embedding geometry and decision confidence.
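The train/test split described in the method can be sketched as follows. The summary does not give the number of items, the input encoding, or the label convention, so the function name `make_comparison_splits`, the 7-item order, and the binary label are assumptions chosen for illustration.

```python
from itertools import permutations

def make_comparison_splits(n_items):
    """Split all ordered pairs (i, j), i != j, of a hidden order 0..n_items-1.

    Each example is (i, j, label) with label 1 if item i ranks above item j.
    Train holds only adjacent pairs (|i - j| == 1); test holds the rest.
    """
    train, test = [], []
    for i, j in permutations(range(n_items), 2):
        example = (i, j, int(i > j))
        if abs(i - j) == 1:
            train.append(example)   # local comparison, seen in training
        else:
            test.append(example)    # distant pair, held out for evaluation
    return train, test

train, test = make_comparison_splits(7)
print(len(train), "adjacent training pairs;", len(test), "held-out distant pairs")
```

Because every test pair has a rank distance of at least 2, correct answers on the test set cannot be memorized from training examples and must come from chaining local comparisons.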

Best for: AI Scientist, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.