concept
active
concept:induction-heads

Induction Heads

Mechanistic circuits in transformers documented by Olsson et al. (2022), cited as evidence for the pattern-repository assumption.

Neighborhood — ranked by edge-count

Thinkers (1)

thinker
  • Chris Olah
    introduces
    Co-author; provided high-level research guidance, wrote introduction/discussion.

Concepts (6)

concept
  • in-context learning (ICL)
    associated_with · implements
    Test-time adaptation from prompt or retrieved context with no parameter updates.
  • Previous Token Head
    associated_with · implements
    An attention head that primarily attends to the immediately preceding token; a key building block for induction heads via K-composition.
  • A three-token pattern of the form [source]...[destination][out] that one-layer attention heads implement; the paper's key characterization of one-layer transformer behavior
  • K-Composition
    implements
    A form of attention head composition in which W_K reads from a subspace written to by a previous head; central to how induction heads are implemented.
  • Unlabeled statistical regularities stored during pretraining.
  • The primary model analyzed; uses attention head composition, especially K-composition, to create induction heads for powerful in-context learning
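
The entries above describe a two-step mechanism: a previous token head tags each position with the token before it, and an induction head matches the current token against those tags and copies what followed. The sketch below illustrates only that behavior on plain token lists, completing [A][B] ... [A] with [B]. It is not an attention implementation; the function names are hypothetical, and hard most-recent matching stands in for soft attention as a simplifying assumption.

```python
from typing import List, Optional, Sequence


def previous_token_features(tokens: Sequence[str]) -> List[Optional[str]]:
    """Step 1 (previous token head, simplified): for each position,
    record the token that immediately precedes it."""
    return [None] + list(tokens[:-1])


def induction_prediction(tokens: Sequence[str]) -> Optional[str]:
    """Step 2 (induction head, simplified): find the most recent position
    whose preceding token equals the current token, and copy the token
    found at that position as the prediction."""
    if not tokens:
        return None
    prev = previous_token_features(tokens)
    current = tokens[-1]
    # Scan from the most recent position backwards. Position 0 has no
    # predecessor, so it can never match.
    for pos in range(len(tokens) - 1, 0, -1):
        if prev[pos] == current:
            return tokens[pos]
    return None  # the current token has no earlier occurrence to copy from


if __name__ == "__main__":
    print(induction_prediction(["A", "B", "C", "A"]))
    print(induction_prediction(["A", "B", "C"]))
```

In this toy version the match on `prev[pos]` plays the role that K-composition plays in the entries above: the lookup key at each position is built from the output of the earlier previous-token step rather than from the token at that position.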

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.