finding
active
All induction heads in the two-layer model occupy an extreme corner of high positive QK and OV eigenvalue positivity space relative to non-induction heads
Quantitative verification of the mechanistic theory; both circuits required for the induction algorithm show the predicted copying/matching structure
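The positivity summary behind this finding can be sketched as follows. For a square matrix with eigenvalues λ_i, the score is Σλ_i / Σ|λ_i|, which approaches +1 when the eigenvalues are mostly positive (copying or matching behavior) and -1 when they are mostly negative. This is a minimal sketch, assuming NumPy is available; the function name `eigenvalue_positivity` is hypothetical, and the matrices below are toy stand-ins rather than the model's expanded QK or OV circuits.

```python
import numpy as np


def eigenvalue_positivity(matrix):
    """Return sum(eigenvalues) / sum(|eigenvalues|) for a square matrix.

    The result lies in [-1, 1]. Complex eigenvalues of a real matrix come in
    conjugate pairs, so their imaginary parts cancel in the numerator and
    only the real part is kept.
    """
    eigenvalues = np.linalg.eigvals(np.asarray(matrix, dtype=float))
    total_magnitude = np.abs(eigenvalues).sum()
    if total_magnitude == 0:
        return 0.0  # all-zero spectrum: no positive or negative tendency
    return float(eigenvalues.sum().real / total_magnitude)


# Toy stand-ins (assumptions for illustration, not real head circuits).
copying_like = np.eye(4)                         # maps each direction to itself
anti_copying = -np.eye(4)                        # maps each direction to its negation
rotation = np.array([[0.0, -1.0], [1.0, 0.0]])   # purely imaginary eigenvalues

scores = {
    "copying_like": eigenvalue_positivity(copying_like),
    "anti_copying": eigenvalue_positivity(anti_copying),
    "rotation": eigenvalue_positivity(rotation),
}
```

Under this reading, a head would land in the "extreme corner" when the score is close to +1 for both its QK-side and OV-side matrices.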
Neighborhood — ranked by edge-count
Claims (2)
claim
- Induction heads work by using K-composition with a previous token head to shift keys by one token, then matching the current destination token against shifted keys to predict what follows · associated_with · supports · The mechanistic explanation of how induction heads are implemented in two-layer models
- Central empirical claim of the paper; induction heads are shown to be the mechanism for powerful in-context learning
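The mechanism described in the first claim can be illustrated with a toy, hard-matching version of the algorithm. This is only a sketch under simplifying assumptions: real heads use soft attention weights rather than exact equality, and the function name `induction_predict` is hypothetical.

```python
def induction_predict(tokens):
    """Predict the next token with a hard-matching induction rule.

    Step 1 (previous token head): every position j carries the token at
    j - 1 as its shifted key.
    Step 2 (QK matching): the current token is compared against those
    shifted keys, most recent position first.
    Step 3 (OV copying): the token at the matched position is copied out
    as the prediction.
    Returns None when no earlier occurrence of the current token exists.
    """
    if not tokens:
        return None
    current = tokens[-1]
    # Position 0 has no previous token, so its shifted key is empty.
    shifted_keys = [None] + list(tokens[:-1])
    for j in range(len(tokens) - 1, 0, -1):
        if shifted_keys[j] == current:
            return tokens[j]
    return None


# "A B C A": the earlier "A" was followed by "B", so "B" is predicted.
prediction = induction_predict(["A", "B", "C", "A"])
```

The shift by one token is what makes the match land on the position after the earlier occurrence, so the copied token is the one that followed it.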
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Quantitative verification that the copying and matching structure predicted by the mechanistic theory is present in all observed induction heads
- Quantitative result from eigenvalue analysis of expanded OV matrices; confirmed by qualitative inspection
- Structural finding about which attention heads control reflection behavior
- Result from term importance analysis breaking down loss contribution by layer
- Result from applying the Frobenius norm composition measurement to all attention head pairs in the two-layer model
- Strong test of the induction head hypothesis using uniformly sampled random tokens repeated three times
- Visual geometric evidence for the fundamental entanglement of true/false activations in harder tasks.
- Empirical observation from examining expanded OV/QK matrices; approximately 10 out of 12 heads show significant copying