concept
active
concept:ov-circuit

OV Circuit

The circuit formed by W_U W_OV^h W_E that describes how a given token affects output logits if attended to
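
As a minimal sketch of the definition above, the product W_U W_OV^h W_E can be formed directly for one head. The shapes, the random weights, and the column convention (W_E maps one-hot tokens to the residual stream, W_OV = W_O W_V) are assumptions chosen for illustration, not values from any real model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vocab, d_model, d_head = 50, 16, 4  # toy sizes (assumed)

# Hypothetical random weights standing in for a trained model.
W_E = rng.normal(size=(d_model, n_vocab))  # embedding
W_V = rng.normal(size=(d_head, d_model))   # value projection of head h
W_O = rng.normal(size=(d_model, d_head))   # output projection of head h
W_U = rng.normal(size=(n_vocab, d_model))  # unembedding

# Combined OV matrix of the head; its rank is at most d_head.
W_OV = W_O @ W_V

# Full OV circuit: entry [i, j] is the contribution to the logit of
# token i when source token j is attended to with weight 1.
ov_circuit = W_U @ W_OV @ W_E

src = 7  # arbitrary source token id
logit_effect = ov_circuit[:, src]
```

Reading a column of `ov_circuit` gives the logit contribution of one attended-to token; the attention pattern (QK circuit) decides how strongly that column is weighted at each position.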

Neighborhood — ranked by edge-count

Methods (1)

method
  • A conceptual technique of fixing attention patterns to make the transformer a purely linear function of tokens, enabling independent analysis of OV and QK circuits
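
The idea in the method above can be illustrated with a toy one-layer, single-head, attention-only model: once the attention pattern A is held fixed, the logits are a linear function of the token matrix. Everything here (sizes, random weights, the function name `logits_frozen`) is hypothetical and only meant as a sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vocab, d_model, d_head, seq = 20, 8, 2, 5  # toy sizes (assumed)

W_E = rng.normal(size=(d_model, n_vocab))
W_OV = rng.normal(size=(d_model, d_head)) @ rng.normal(size=(d_head, d_model))
W_U = rng.normal(size=(n_vocab, d_model))

# Frozen causal attention pattern: rows are destination positions,
# columns are source positions, each row sums to 1.
A = np.tril(rng.random((seq, seq)))
A /= A.sum(axis=1, keepdims=True)

def logits_frozen(T):
    """Logits for a (n_vocab, seq) token matrix T with A held fixed."""
    x0 = W_E @ T                # token embeddings, (d_model, seq)
    head_out = W_OV @ x0 @ A.T  # OV output mixed across positions by A
    return W_U @ (x0 + head_out)

# Linearity check: f(a*T1 + b*T2) == a*f(T1) + b*f(T2).
T1 = rng.random((n_vocab, seq))
T2 = rng.random((n_vocab, seq))
lhs = logits_frozen(2.0 * T1 + 3.0 * T2)
rhs = 2.0 * logits_frozen(T1) + 3.0 * logits_frozen(T2)
is_linear = np.allclose(lhs, rhs)
```

With A frozen, the only token-dependent path through the head is the OV circuit, which is why it can be studied separately from the QK circuit that would otherwise produce A.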

Concepts (1)

concept
  • Copying Matrix
    associated_with
    An OV circuit matrix that maps tokens to increasing the logit of those same tokens; detectable via positive eigenvalues
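
One way to make "detectable via positive eigenvalues" concrete is to summarize the spectrum with the ratio sum(λ_i) / sum(|λ_i|), which approaches 1 when eigenvalues are mostly positive. The sketch below uses synthetic matrices rather than real model weights, and the helper name `copying_score` is an assumption for illustration.

```python
import numpy as np

def copying_score(M):
    """Return sum(eig) / sum(|eig|), a value in [-1, 1].

    Near +1: eigenvalues are mostly positive (copying-like).
    Near -1: eigenvalues are mostly negative (anti-copying).
    """
    eig = np.linalg.eigvals(M)
    return float(np.real(eig.sum()) / np.abs(eig).sum())

rng = np.random.default_rng(0)
n_vocab, d_head = 30, 4  # toy sizes (assumed)

# Synthetic low-rank copying-like matrix: B @ B.T is positive
# semidefinite, so every token direction raises its own logit.
B = rng.normal(size=(n_vocab, d_head))
copy_like = B @ B.T

score_copy = copying_score(copy_like)   # close to +1
score_anti = copying_score(-copy_like)  # close to -1
score_rand = copying_score(rng.normal(size=(n_vocab, n_vocab)))
```

The score is a coarse summary: it does not show which tokens are copied, so inspecting the diagonal of the OV circuit matrix is a natural complement.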

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • A circuit exhibiting symmetry such that the weights rotate with the orientation of the feature they detect
  • Circuit Finding · method · 0.713
    Interpretability technique for identifying functional sub-circuits in neural networks, supported by pyvene
  • Circuits Thread · framework · 0.704
    An open scientific collaboration hosted on the Distill Slack, studying the inner workings of neural networks via zoomed-in mechanistic investigation
  • QK Circuit · concept · 0.703
    The bilinear circuit formed by W_E^T W_QK^h W_E that determines which source token a destination token attends to
  • The key novel property of DiffLogic CA — logic gate networks that are recurrent both spatially and temporally
  • Circuit Motif · concept · 0.696
    A recurring, abstract pattern found in circuits (e.g., equivariance, unioning over cases), inspired by circuit motifs in systems biology
  • Advantage of DiffLogic CA over NCA — learned rules are pure binary logic circuits that can be visualized and analyzed
  • Hypothesis that discretization finds minimum-size circuits equivalent to minimal algorithmic descriptions of patterns