claim
active
claim:attention-is-a-generalization-of-convolution-all-convolutions-can-be-expressed-as-tensor-products-of-fixed-relative-position-attention-patterns-and-weight-matrices

Attention is a generalization of convolution; all convolutions can be expressed as tensor products of fixed relative position attention patterns and weight matrices

A mathematical equivalence between attention mechanisms and convolutions: a convolution is the special case of attention in which every attention pattern is fixed and depends only on relative position.
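The claim can be checked numerically in the 1D case. The following is a minimal sketch, not code from the source paper: it assumes a zero-padded 1D convolution over a sequence of row vectors, and the helper names (`shift_pattern`, `conv1d_direct`, `conv1d_as_attention`) are hypothetical. Each kernel tap becomes one "head" whose attention pattern is a fixed shift matrix and whose weight matrix is that tap's slice of the kernel.

```python
import numpy as np

def shift_pattern(n, offset):
    """Fixed attention pattern: position i attends only to position i + offset."""
    A = np.zeros((n, n))
    for i in range(n):
        j = i + offset
        if 0 <= j < n:  # positions outside the sequence act as zero padding
            A[i, j] = 1.0
    return A

def conv1d_direct(x, kernel, offsets):
    """Zero-padded 1D convolution: y[i] = sum_k x[i + offsets[k]] @ kernel[k]."""
    n = x.shape[0]
    d_out = kernel.shape[2]
    y = np.zeros((n, d_out))
    for i in range(n):
        for k, o in enumerate(offsets):
            j = i + o
            if 0 <= j < n:
                y[i] += x[j] @ kernel[k]
    return y

def conv1d_as_attention(x, kernel, offsets):
    """Same map written as a sum of (fixed pattern) x (weight matrix) terms."""
    n = x.shape[0]
    return sum(shift_pattern(n, o) @ x @ kernel[k] for k, o in enumerate(offsets))

# Assumed toy sizes: 6 positions, 4 input channels, 5 output channels, 3 taps.
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))
kernel = rng.normal(size=(3, 4, 5))
offsets = [-1, 0, 1]

assert np.allclose(conv1d_direct(x, kernel, offsets),
                   conv1d_as_attention(x, kernel, offsets))
```

General attention relaxes both constraints at once: the pattern may be any row-stochastic matrix and may depend on the input, which is the sense in which it generalizes convolution.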

Source paper

extracted_from
A Mathematical Framework for Transformer Circuits (2021)

Neighborhood — ranked by edge-count

Thinkers (1)

thinker
  • Explored the equivalence between attention and convolution and empirically found that vision models often have many 2D relative position heads

Concepts (1)

concept
  • The composition of two attention heads via V-composition, forming a new entity with its own attention pattern A^{h_2} A^{h_1} and OV matrix W_{OV}^{h_2} W_{OV}^{h_1} (both matrix products)
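The composed pattern and OV matrix follow from the mixed-product property of the tensor (Kronecker) product. Below is a small numerical sketch under stated assumptions: both attention patterns are treated as fixed matrices, the OV matrices are square, and the names `random_pattern`, `A1`, `A2`, `W1`, `W2` are illustrative rather than taken from the source.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 4, 3  # assumed toy sizes: 4 positions, 3 residual dimensions

def random_pattern(n):
    """Causal, row-stochastic attention pattern (each row sums to 1)."""
    logits = rng.normal(size=(n, n))
    mask = np.tril(np.ones((n, n), dtype=bool))
    weights = np.where(mask, np.exp(logits), 0.0)
    return weights / weights.sum(axis=1, keepdims=True)

A1, A2 = random_pattern(n), random_pattern(n)                # patterns of heads h1, h2
W1, W2 = rng.normal(size=(d, d)), rng.normal(size=(d, d))    # OV matrices of h1, h2

x = rng.normal(size=(n, d))  # one row per position

# Applying head h1 and then head h2: each head maps x to A @ x @ W.T.
two_steps = A2 @ (A1 @ x @ W1.T) @ W2.T

# The composed head uses pattern A2 @ A1 and OV matrix W2 @ W1.
composed = (A2 @ A1) @ x @ (W2 @ W1).T
assert np.allclose(two_steps, composed)

# Same statement with explicit Kronecker products acting on the flattened input.
assert np.allclose(np.kron(A1, W1) @ x.reshape(-1), (A1 @ x @ W1.T).reshape(-1))
assert np.allclose(np.kron(A2, W2) @ np.kron(A1, W1),
                   np.kron(A2 @ A1, W2 @ W1))
```

The product of two causal row-stochastic patterns is again causal and row-stochastic, so the composed object behaves like a single head with its own attention pattern.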

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.