Attention heads

Transformer attention heads that could be recruited to extract different kinds of information (text vs. thoughts).

Neighborhood — ranked by edge-count

paper

finding

Attention computations distribute across heads via parameter subcomponents with interpretable roles
cites
Mechanistic discovery about how attention mechanisms decompose into interpretable parameter components.

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

attention head localization analysis · method · 0.839
Analysis measuring whether each attention head's maximum attention increase points to the correct injected sentence
Talking Heads Attention · concept · 0.838
A transformer variant where OV and QK matrices of different attention heads can share components, enabling shared copying mechanisms
Virtual Attention Head · concept · 0.828
The composition of two attention heads via V-composition, forming a new entity with its own attention pattern A^h2 * A^h1 and OV matrix W_OV^h2 * W_OV^h1
Self-attention · concept · 0.809
A form of key-query attention within a single input sequence; core to Transformers.
attention computation · concept · 0.772
Process that uses queries (Q), keys (K), and values (V) to compute attention weights (a heat map) over the keys and a weighted sum of the values.
Attention Schema · concept · 0.765
A predictive model representing and controlling attention; central to attention schema theory.
attention mechanism · concept · 0.757
Core operation in transformers, computing weighted combinations of previous elements
Attention heads can be understood as independent operations each adding their output to the residual stream, equivalent to the concatenate-and-multiply formulation · claim · 0.757
Mathematical equivalence enabling independent analysis of each attention head
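
The equivalence in the last claim can be checked numerically. The following is a minimal sketch, not taken from any of the cited sources: it assumes a causal mask, random weights, toy dimensions, and hypothetical helper names (`matmul`, `head_result`, and so on). It computes the attention output once by concatenating head results and multiplying by a single output matrix, and once by letting each head multiply by its own row-block of that matrix and summing the contributions.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def softmax_rows_causal(scores):
    """Row-wise softmax where position i only attends to positions <= i (assumed mask)."""
    out = []
    for i, row in enumerate(scores):
        visible = row[: i + 1]
        m = max(visible)
        exps = [math.exp(s - m) for s in visible]
        total = sum(exps)
        out.append([e / total for e in exps] + [0.0] * (len(row) - i - 1))
    return out


def rand_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def head_result(x, w_q, w_k, w_v):
    """One head's attention-weighted values: softmax(Q K^T / sqrt(d_head)) V."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_head = len(w_q[0])
    scores = [[s / math.sqrt(d_head) for s in row] for row in matmul(q, transpose(k))]
    return matmul(softmax_rows_causal(scores), v)


# Toy sizes and random weights: illustrative assumptions only.
rng = random.Random(0)
n_tokens, d_model, n_heads, d_head = 4, 6, 2, 3
x = rand_matrix(n_tokens, d_model, rng)
heads = [tuple(rand_matrix(d_model, d_head, rng) for _ in range(3)) for _ in range(n_heads)]
# Output projection of shape (n_heads * d_head, d_model); rows are grouped by head.
w_o = rand_matrix(n_heads * d_head, d_model, rng)

results = [head_result(x, *h) for h in heads]

# Formulation 1: concatenate head results per token, then multiply by W_O.
concat = [sum((r[t] for r in results), []) for t in range(n_tokens)]
out_concat = matmul(concat, w_o)

# Formulation 2: each head multiplies by its own row-block of W_O,
# and the per-head contributions are summed.
out_sum = [[0.0] * d_model for _ in range(n_tokens)]
for h, r in enumerate(results):
    block = w_o[h * d_head : (h + 1) * d_head]
    contrib = matmul(r, block)
    out_sum = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(out_sum, contrib)]

# The two formulations should agree up to floating-point rounding.
max_diff = max(abs(a - b) for ra, rb in zip(out_concat, out_sum) for a, b in zip(ra, rb))
print(max_diff < 1e-9)
```

Because the summed form isolates each head's contribution (`contrib`), a single head can be inspected or ablated without touching the others, which is what makes per-head analysis possible.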