claim
active
claim:induction-heads-work-by-using-k-composition-with-a-previous-token-head-to-shift-keys-by-one-token-then-matching-the-current-destination-token-against-shifted-keys-to-predict-what-follows
Induction heads work by using K-composition with a previous token head to shift keys by one token, then matching the current destination token against shifted keys to predict what follows
The mechanistic explanation of how induction heads are implemented in two-layer models
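The claim above can be illustrated with a toy, hard-attention sketch. It is not the trained model's computation: real heads use learned weight matrices and softmax attention, whereas this version uses one-hot token vectors and exact matching. The function name `induction_predict` and the choice to copy from the most recent match are assumptions made for illustration.

```python
import numpy as np

def induction_predict(tokens, vocab_size):
    """Toy sketch of the two-step induction algorithm (hypothetical helper)."""
    tokens = np.asarray(tokens)
    n = len(tokens)
    one_hot = np.eye(vocab_size)[tokens]            # (n, vocab_size)

    # Step 1, previous token head: every position receives the identity
    # of the token immediately before it. Position 0 has no predecessor.
    prev = np.zeros_like(one_hot)
    prev[1:] = one_hot[:-1]

    # Step 2, induction head with K-composition: keys are built from the
    # shifted (previous-token) information, queries from the current token.
    # scores[dst, src] == 1 exactly when tokens[dst] == tokens[src - 1].
    scores = one_hot @ prev.T
    causal = np.tril(np.ones((n, n), dtype=bool))   # a head may only attend to src <= dst
    scores = np.where(causal, scores, 0.0)

    # Step 3, copying: predict the token stored at the attended position.
    # Assumption: when several positions match, use the most recent one.
    preds = []
    for dst in range(n):
        matches = np.flatnonzero(scores[dst])
        preds.append(int(tokens[matches[-1]]) if matches.size else None)
    return preds

# Token 1 was followed by 4 earlier, and token 3 was followed by 1 earlier.
print(induction_predict([3, 1, 4, 1, 5, 3], vocab_size=6))
# → [None, None, None, 4, None, 1]
```

Because the keys are shifted by one token, a match on the current token lands on the position after its earlier occurrence, which is exactly the token that followed it last time.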
Neighborhood — ranked by edge-count
Findings (2)
finding
- All induction heads in the two-layer model occupy an extreme corner of high positive QK and OV eigenvalue positivity space relative to non-induction heads · associated_with · supports · Quantitative verification of the mechanistic theory; both circuits required for the induction algorithm show the predicted copying/matching structure
- Strong test of the induction head hypothesis using uniformly sampled random tokens repeated three times
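The two findings above can be sketched in code. The positivity score below, the sum of eigenvalues divided by the sum of their magnitudes, is an assumed definition of "eigenvalue positivity"; this record does not state the exact formula. The helper names, the seed, and the sizes are also hypothetical, and the matrices a real analysis would pass in (the head's QK and OV circuits) are not constructed here.

```python
import numpy as np

def eigenvalue_positivity(matrix):
    """Assumed score in [-1, 1]: sum(eigenvalues) / sum(|eigenvalues|).

    Values near +1 mean the eigenvalues are mostly positive, which is the
    signature expected of a copying (OV) or matching (QK) circuit.
    """
    eig = np.linalg.eigvals(np.asarray(matrix, dtype=float))
    denom = np.abs(eig).sum()
    if denom == 0:
        return 0.0
    # For a real matrix the eigenvalue sum equals the trace, so it is real;
    # .real only discards floating-point residue from complex pairs.
    return float(eig.sum().real / denom)

def repeated_random_tokens(vocab_size, block_len, repeats=3, seed=0):
    """Uniformly sampled random tokens, repeated `repeats` times."""
    rng = np.random.default_rng(seed)
    block = rng.integers(0, vocab_size, size=block_len)
    return np.tile(block, repeats)

print(eigenvalue_positivity(np.eye(4)))    # all eigenvalues +1
print(eigenvalue_positivity(-np.eye(4)))   # all eigenvalues -1
seq = repeated_random_tokens(vocab_size=50, block_len=8)
print(seq.shape)
```

Random tokens carry no ordinary language statistics, so any correct prediction on the second and third repeats has to come from looking back at the earlier copy.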
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Forward-looking claim connecting toy model findings to large-scale language models
- GPT-2 implements at least one induction head using pointer arithmetic on positional embeddings rather than K-composition · hypothesis · 0.807 · Observation of an alternative induction head implementation algorithm in larger models with positional embeddings in the residual stream
- Central empirical claim of the paper; induction heads are shown to be the mechanism for powerful in-context learning
- A pair of query and key subcomponents distributed across attention heads performs previous-token behavior · finding · 0.778 · VPD recovers an attention algorithm for attending to the previous token, distributed across multiple heads.
- The Primer architecture's depthwise convolution change would allow induction heads to form without requiring K-composition · hypothesis · 0.774 · Architectural interpretation of how Primer's design change relates to the paper's mechanistic theory of induction heads
- Quantitative verification that the copying and matching structure predicted by the mechanistic theory is present in all observed induction heads
- A follow-up paper extending the framework and induction head concept to larger more realistic models
- Interesting special case of copying behavior related to tokenization artifacts; primitive precursor to induction heads