A Three-Phase Factual Recall Circuit in Gemma-2B and Gemma-12B-IT

Source: Towards Data Science · Field: Technology & Digital / Artificial Intelligence & Machine Learning · Depth: Expert, medium

Summary

BizzaroWorld, a mechanistic interpretability study, localized a three-phase factual recall circuit in the Gemma-2B and Gemma-12B-IT large language models. Using activation patching across 60 prompt pairs in 20 knowledge categories, the study identified three distinct stages of factual knowledge processing. In Gemma-2B, Phase 1 (Storage) occurs in layers 0-14 at the entity token position, with the residual stream dominating. Phase 2 (Routing) is carried by distributed attention heads that move the signal to the final prediction position; no single head was solely responsible. Phase 3 (Readout) occurs in layers 15-17 at the final token position, where the answer is retrieved. The same circuit appeared in Gemma-12B-IT, with storage spanning layers 0-27 and readout in the final layers, which suggests the structure holds at larger scale. The study also highlighted tokenizer-induced dataset drift: three prompt pairs had to be excluded for Gemma-12B-IT because of tokenization differences, which affects cross-model comparisons.

Key takeaway

For research scientists investigating LLM internal mechanisms, the three-phase factual recall circuit in Gemma models is an important reference point. When comparing models, account for tokenizer-induced dataset drift by pre-running the prompt set through every target architecture before analysis. This keeps cross-model mechanistic comparisons valid and informs targeted interventions when factual recall fails. Consider path patching for more precise mapping of causal relationships.
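The pre-run check described above can be sketched as a simple filter. This is a minimal illustration, not the study's code: the two tokenizers are hypothetical stand-ins, and the exclusion rule (clean and corrupt prompts must have the same token count so that patch positions align) is an assumption, since the summary does not say which tokenization difference caused the three exclusions.

```python
from typing import Callable, Dict, List, Tuple

Tokenizer = Callable[[str], List[str]]
PromptPair = Tuple[str, str]  # (clean prompt, corrupt prompt)


def whitespace_tokenizer(text: str) -> List[str]:
    """Hypothetical stand-in tokenizer: split on whitespace."""
    return text.split()


def chunk_tokenizer(text: str, width: int = 6) -> List[str]:
    """Hypothetical stand-in tokenizer: cut each word into fixed-width pieces."""
    pieces: List[str] = []
    for word in text.split():
        pieces.extend(word[i:i + width] for i in range(0, len(word), width))
    return pieces


def shared_pairs(
    pairs: List[PromptPair], tokenizers: Dict[str, Tokenizer]
) -> Tuple[List[PromptPair], Dict[PromptPair, List[str]]]:
    """Keep only pairs whose clean and corrupt prompts align under every tokenizer.

    Assumed rule: a pair is usable for position-wise patching only if both
    prompts produce the same number of tokens.
    """
    kept: List[PromptPair] = []
    dropped: Dict[PromptPair, List[str]] = {}
    for pair in pairs:
        clean, corrupt = pair
        failing = [
            name
            for name, tokenize in tokenizers.items()
            if len(tokenize(clean)) != len(tokenize(corrupt))
        ]
        if failing:
            dropped[pair] = failing  # record which tokenizers broke alignment
        else:
            kept.append(pair)
    return kept, dropped


PAIRS = [
    ("The capital of France is", "The capital of Poland is"),
    ("The capital of France is", "The capital of Germany is"),
]
TOKENIZERS = {"whitespace": whitespace_tokenizer, "chunk": chunk_tokenizer}

kept, dropped = shared_pairs(PAIRS, TOKENIZERS)
print("kept:", kept)
print("dropped:", dropped)
```

In practice each callable would wrap the real tokenizer of a target model. Running every analysis on the `kept` set alone means all models are compared on identical prompts, rather than on model-specific subsets.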

Key insights

Gemma models process factual recall via a consistent three-phase circuit: storage, distributed routing, and final readout.

Method

The study used activation patching with logit differences between clean/corrupt prompt pairs, measured by a "TotalSwing" metric, to isolate components across layers and sublayers in Gemma-2B and Gemma-12B-IT.
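To make the patching procedure concrete, here is a toy sketch in pure Python. It does not reproduce the study's setup: the four-layer scalar "model", the `MOVE_LAYER` constant, and the normalized recovery score are all hypothetical, and the summary does not define how "TotalSwing" is computed. The sketch only shows the general recipe: cache activations from the clean run, overwrite one (layer, position) in the corrupt run, and measure how much of the clean logit difference comes back.

```python
from typing import Dict, List, Optional, Tuple

N_LAYERS = 4
MOVE_LAYER = 2  # hypothetical layer that copies the entity signal to the last position


def run(
    tokens: List[float], patch: Optional[Tuple[int, int, float]] = None
) -> Tuple[float, List[List[float]]]:
    """Run the toy model and return (logit_diff, cache).

    cache[layer][pos] is the residual stream entering `layer`.
    patch=(layer, pos, value) overwrites that entry before the layer runs.
    """
    resid = list(tokens)  # toy "embedding": one scalar per position
    cache: List[List[float]] = []
    for layer in range(N_LAYERS):
        if patch is not None and patch[0] == layer:
            resid[patch[1]] = patch[2]
        cache.append(list(resid))
        if layer == MOVE_LAYER:
            resid[-1] += resid[0]  # attention-like move: entity -> final position
        else:
            resid = [1.5 * x for x in resid]  # position-wise update
    # Toy readout: logit(correct) = resid[-1], logit(wrong) = -resid[-1].
    return 2.0 * resid[-1], cache


def patching_grid(clean: List[float], corrupt: List[float]) -> Dict[Tuple[int, int], float]:
    """Patch each clean activation into the corrupt run, one (layer, pos) at a time.

    Score = fraction of the clean-vs-corrupt logit-difference gap recovered
    (an assumed normalization, not necessarily the study's metric).
    """
    clean_ld, clean_cache = run(clean)
    corrupt_ld, _ = run(corrupt)
    grid: Dict[Tuple[int, int], float] = {}
    for layer in range(N_LAYERS):
        for pos in range(len(clean)):
            patched_ld, _ = run(corrupt, patch=(layer, pos, clean_cache[layer][pos]))
            grid[(layer, pos)] = (patched_ld - corrupt_ld) / (clean_ld - corrupt_ld)
    return grid


# Position 0 plays the entity token, position 2 the final prediction position.
CLEAN = [1.0, 0.0, 0.0]
CORRUPT = [-1.0, 0.0, 0.0]

grid = patching_grid(CLEAN, CORRUPT)
for (layer, pos), score in sorted(grid.items()):
    print(f"layer {layer}, position {pos}: recovery {score:.2f}")
```

In this toy, patching the entity position restores the answer only up to the move layer, and patching the final position restores it only afterwards. That is the kind of layer-by-position pattern a three-phase reading relies on, though real models give graded rather than all-or-nothing scores.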

Best for: AI Scientist, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Towards Data Science.