concept
active
concept:ov-circuit
OV Circuit
The circuit formed by W_U W_OV^h W_E for attention head h, which describes how a given token affects the output logits if it is attended to.
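As a rough illustration of the definition above, the product W_U W_OV^h W_E can be formed directly from a head's weights. This is a minimal sketch with random weights, not taken from any trained model. The shapes, the column-vector convention (W_E maps a one-hot token to the residual stream), the factorization W_OV = W_O W_V, and all variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vocab, d_model, d_head = 50, 16, 4  # hypothetical sizes

# Hypothetical random weights (column-vector convention).
W_E = rng.normal(size=(d_model, n_vocab))  # token embedding
W_U = rng.normal(size=(n_vocab, d_model))  # unembedding
W_V = rng.normal(size=(d_head, d_model))   # value projection of head h
W_O = rng.normal(size=(d_model, d_head))   # output projection of head h

# Combined OV matrix of the head: at most rank d_head.
W_OV = W_O @ W_V

# Full OV circuit: entry [i, j] is the contribution to the logit of
# token i when the head attends to source token j with weight 1.
ov_circuit = W_U @ W_OV @ W_E  # shape (n_vocab, n_vocab)

src_token = 7
logit_effect = ov_circuit[:, src_token]  # effect of attending to token 7
print(ov_circuit.shape, logit_effect.shape)
```

Because W_OV factors through the head dimension, the n_vocab × n_vocab circuit has rank at most d_head, which is why it can be analyzed through much smaller matrices in practice.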
Neighborhood — ranked by edge-count
Papers (1)
paper
Methods (1)
method
- A conceptual technique of fixing attention patterns to make the transformer a purely linear function of tokens, enabling independent analysis of OV and QK circuits
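The linearity claim in the method above can be checked numerically. The sketch below assumes a single attention head in an attention-only setting, uses random weights, and holds one attention pattern A fixed. The function `head_logits` and all shapes are hypothetical names chosen for the example, and only the head's own term is modeled (the direct embed-to-unembed path is left out).

```python
import numpy as np

rng = np.random.default_rng(0)
n_vocab, d_model, seq_len = 20, 8, 5  # hypothetical sizes

W_E = rng.normal(size=(d_model, n_vocab))
W_U = rng.normal(size=(n_vocab, d_model))
W_OV = rng.normal(size=(d_model, d_model))

# A frozen causal attention pattern: rows are destination positions
# and each row sums to 1 over the allowed source positions.
scores = rng.normal(size=(seq_len, seq_len))
causal = np.tril(np.ones((seq_len, seq_len), dtype=bool))
scores = np.where(causal, scores, -np.inf)
A = np.exp(scores - scores.max(axis=1, keepdims=True))
A = A / A.sum(axis=1, keepdims=True)

def head_logits(x, A):
    """Logit contribution of the head for inputs x of shape
    (seq_len, n_vocab), with the attention pattern A held fixed."""
    return A @ x @ (W_U @ W_OV @ W_E).T

# Two one-hot token sequences.
x1 = np.eye(n_vocab)[rng.integers(0, n_vocab, size=seq_len)]
x2 = np.eye(n_vocab)[rng.integers(0, n_vocab, size=seq_len)]

# With A frozen, the map is linear in the token inputs.
lhs = head_logits(2.0 * x1 + 3.0 * x2, A)
rhs = 2.0 * head_logits(x1, A) + 3.0 * head_logits(x2, A)
print(np.allclose(lhs, rhs))
```

Once A is treated as a constant, the OV circuit alone determines what is written to the logits, while the choice of A is the separate concern of the QK circuit.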
Concepts (1)
concept
- Copying Matrix · associated_with · An OV circuit matrix that maps each token to an increased logit for that same token; detectable via positive eigenvalues
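The eigenvalue test mentioned above can be sketched as follows. The summary statistic used here, the sum of the eigenvalues divided by the sum of their magnitudes, is an assumption for this example rather than something stated in this entry. The weights are synthetic: a tied unembedding with a positive semidefinite W_OV is one simple way to build a matrix that is guaranteed to copy. The name `copying_score` is hypothetical.

```python
import numpy as np

def copying_score(circuit):
    """Sum of eigenvalues over sum of their magnitudes, in [-1, 1].
    Values near +1 indicate mostly positive eigenvalues (copying)."""
    eig = np.linalg.eigvals(circuit)
    return float(eig.sum().real / np.abs(eig).sum())

rng = np.random.default_rng(0)
n_vocab, d_model = 20, 8  # hypothetical sizes

W_E = rng.normal(size=(d_model, n_vocab))

# Synthetic copying head: W_U tied to W_E^T and a PSD W_OV, so that
# W_U W_OV W_E is symmetric positive semidefinite.
B = rng.normal(size=(d_model, d_model))
W_OV_copy = B @ B.T
copy_circuit = W_E.T @ W_OV_copy @ W_E

# Unstructured comparison: independent random unembedding and W_OV.
W_U_rand = rng.normal(size=(n_vocab, d_model))
W_OV_rand = rng.normal(size=(d_model, d_model))
rand_circuit = W_U_rand @ W_OV_rand @ W_E

print("copying:", copying_score(copy_circuit))       # close to 1
print("negated:", copying_score(-copy_circuit))      # close to -1
print("random: ", copying_score(rand_circuit))
```

A score near 1 is a cheap summary and does not by itself show that every token is copied, so it is best read alongside a direct look at the diagonal of the circuit.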
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one, making them candidates for new edges or unrecognized duplicates.
- A circuit exhibiting symmetry such that the weights rotate with the orientation of the feature they detect
- Interpretability technique for identifying functional sub-circuits in neural networks, supported by pyvene
- An open scientific collaboration hosted on Distill slack studying the inner workings of neural networks via zoomed-in mechanistic investigation
- The bilinear circuit formed by W_E^T W_QK^h W_E that determines which source token a destination token attends to
- The key novel property of DiffLogic CA — logic gate networks that are recurrent both spatially and temporally
- A recurring, abstract pattern found in circuits (e.g., equivariance, unioning over cases), inspired by circuit motifs in systems biology
- Advantage of DiffLogic CA over NCA — learned rules are pure binary logic circuits that can be visualized and analyzed
- Hypothesis that discretization finds minimum-size circuits equivalent to minimal algorithmic descriptions of patterns