What Is the Minimum Architecture for Prolepsis? Early Irrevocable Commitment Across Tasks in Small Transformers

Source: Computation and Language · Field: Technology & Digital, Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

The paper introduces "prolepsis": a transformer commits to a decision early, task-specific attention heads sustain that commitment, and no later layer corrects it. The study replicates a prior planning-site finding on two open models, Gemma 2 2B and Llama 3.2 1B. Planning is invisible to six residual-stream methods and is detected only with cross-layer transcoders (CLTs), and the planning-site spike shows identical geometry across models. Specific attention heads route the decision to the output, a mechanism that attribution graphs could not previously observe. Search requires fewer layers (≤16) than commitment. Factual recall shows the same proleptic motif at a different network depth, carried by heads distinct from those used for planning. All experiments ran on a single consumer GPU with 16 GB of VRAM.

Key takeaway

For research scientists investigating transformer decision-making, prolepsis indicates that early, irrevocable commitment is an architectural feature. Use cross-layer transcoders (CLTs) to detect planning sites, and analyze individual attention heads to trace how a decision is routed to the output; residual-stream methods and attribution graphs may miss these mechanisms. These findings can guide efforts to debug or influence model behavior.

Key insights

Transformers can make early, irrevocable decisions sustained by specific attention heads, a phenomenon termed prolepsis.

Method

The study replicates the planning-site finding and uses cross-layer transcoders (CLTs) to investigate early commitment and attention-head routing in two small transformers, Gemma 2 2B and Llama 3.2 1B.
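
As a rough illustration of what tracing attention-head routing can look like, the sketch below zero-ablates one head at a time and ranks heads by how much the logit difference drops. It is a toy under stated assumptions, not the paper's procedure: the per-head contributions are made-up numbers, the additive model is a simplification, and the names `logit_diff` and `rank_heads_by_ablation` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Head = Tuple[int, int]  # (layer index, head index)


def logit_diff(contribs: Dict[Head, float],
               ablated: FrozenSet[Head] = frozenset()) -> float:
    """Toy logit difference (committed token minus alternative).

    ASSUMPTION: each head adds a fixed amount to the logit difference.
    Real models are not additive like this; it only keeps the sketch small.
    """
    return sum(v for h, v in contribs.items() if h not in ablated)


def rank_heads_by_ablation(contribs: Dict[Head, float]) -> List[Tuple[Head, float]]:
    """Zero-ablate each head in turn and sort heads by the resulting drop."""
    base = logit_diff(contribs)
    drops = [(h, base - logit_diff(contribs, frozenset({h}))) for h in contribs]
    return sorted(drops, key=lambda item: item[1], reverse=True)


# Made-up contributions for four hypothetical heads (not data from the study).
toy_contribs = {(3, 1): 0.1, (12, 4): 2.0, (14, 0): 1.5, (15, 7): -0.2}
ranking = rank_heads_by_ablation(toy_contribs)

for head, drop in ranking:
    print(f"layer {head[0]:2d} head {head[1]}: drop {drop:+.2f}")
```

In a real model the forward pass would have to be rerun with each head's output replaced, since head effects interact across layers; the ranking step stays the same.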

Best for: Research Scientist, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.