framework
active
framework:circuits-thread

Circuits Thread

An open scientific collaboration, hosted on the Distill Slack, that studies the inner workings of neural networks through zoomed-in mechanistic investigation.

Neighborhood — ranked by edge-count

Papers (1)

paper

Thinkers (1)

thinker
  • Chris Olah
    studies
    Co-author; provided high-level research guidance, wrote introduction/discussion.

Frameworks (1)

framework
  • Prior mechanistic interpretability work reverse-engineering vision models (InceptionV1); the direct predecessor this paper extends to language models

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • Circuits Framework · framework · 0.837
    Mechanistic interpretability framework for understanding neural network computation as circuits of features
  • Circuit Finding · method · 0.801
    Interpretability technique for identifying functional sub-circuits in neural networks, supported by pyvene
  • Circuit Motif · concept · 0.778
    A recurring, abstract pattern found in circuits (e.g., equivariance, unioning over cases), inspired by circuit motifs in systems biology
  • Circuit Analysis · concept · 0.777
    A fine-grained approach to identifying the specific network components responsible for reflection; mentioned as a future direction.
  • Sparse circuits · concept · 0.748
    A goal in mechanistic interpretability to identify sparse computational subgraphs; VPD promotes sparse parameter circuits.
  • Interactive tool for visualizing and inspecting learned binary logic circuits using modified DigitalJS library
  • The key novel property of DiffLogic CA — logic gate networks that are recurrent both spatially and temporally
  • Reading a meaningful algorithm directly off of the weights linking neurons in a circuit