claim
active
claim:the-residual-stream-has-a-deeply-linear-structure-enabling-virtual-weights-and-path-expansion-analysis

The residual stream has a deeply linear structure, enabling virtual weights and path expansion analysis

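Architectural observation that underpins the paper's mathematical framework: every layer reads from the residual stream through a linear map and adds its output back through another linear map, so the stream is the sum of the token embedding and the linearly projected outputs of all earlier layers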

Source paper

extracted_from
A Mathematical Framework for Transformer Circuits (2021)

Neighborhood — ranked by edge-count

Concepts (2)

concept
  • The mathematical trick of expanding a product of layer terms into a sum of end-to-end path terms, enabling independent analysis of each term
  • Implicit weights directly connecting any pair of layers, computed by multiplying the output weights of the earlier layer by the input weights of the later layer across the residual stream
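
The two concepts above can be illustrated with a minimal NumPy sketch. It assumes a toy two-layer residual network in which each layer is purely linear. Real layers contain nonlinearities (for example softmax attention or MLP activations), which are omitted here so that the algebra holds exactly. All names and sizes (`W_in_1`, `W_out_1`, `d_model`, and so on) are hypothetical and chosen for illustration, not taken from the source paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_hidden = 8, 4  # hypothetical sizes, for illustration only

# Each toy layer reads from the residual stream with W_in and writes back with W_out.
W_in_1 = rng.normal(size=(d_hidden, d_model))
W_out_1 = rng.normal(size=(d_model, d_hidden))
W_in_2 = rng.normal(size=(d_hidden, d_model))
W_out_2 = rng.normal(size=(d_model, d_hidden))

# Net effect of each (linear) layer on the residual stream.
A1 = W_out_1 @ W_in_1
A2 = W_out_2 @ W_in_2

x0 = rng.normal(size=d_model)  # stand-in for a token embedding

# Ordinary forward pass: each layer adds its output to the stream.
h1 = W_in_1 @ x0          # layer 1 hidden activations
x1 = x0 + W_out_1 @ h1    # residual stream after layer 1
x2 = x1 + A2 @ x1         # residual stream after layer 2

# Virtual weights: layer 2 reads layer 1's output through W_in_2 @ W_out_1,
# without referring to the residual stream as an intermediate.
W_virtual = W_in_2 @ W_out_1
layer2_input = W_in_2 @ x1
layer2_input_via_virtual = W_in_2 @ x0 + W_virtual @ h1

# Path expansion: (I + A2)(I + A1) = I + A1 + A2 + A2 @ A1,
# one end-to-end term per path through the network.
paths = {
    "direct": np.eye(d_model),
    "layer1_only": A1,
    "layer2_only": A2,
    "layer1_then_layer2": A2 @ A1,
}
x2_from_paths = sum(P @ x0 for P in paths.values())

print(np.allclose(layer2_input, layer2_input_via_virtual))
print(np.allclose(x2, x2_from_paths))
```

Because every term in `paths` is a standalone linear map from input to output, each one can be inspected separately, for instance by comparing the norm of its contribution `P @ x0`.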

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.