framework
active
framework:circuits-framework

Circuits Framework

Mechanistic interpretability framework for understanding neural network computation as circuits of features

Neighborhood — ranked by edge-count

Concepts (1)

concept
  • Collections of features that interact via the token stream: one feature increases the probability of tokens that activate the next feature, so the features together form a system resembling a finite-state automaton (FSA)
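
The token-stream interaction described in this concept can be illustrated with a toy sketch. Nothing here comes from a real model or library: the `Feature` class, the three-token vocabulary, and the boost values are all hypothetical, chosen only to show how features that never connect directly can still behave like states of a finite-state automaton when each one raises the probability of the token that activates the next.

```python
# Toy illustration only: all names, tokens, and boost values are hypothetical.
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    triggers: set   # tokens that activate this feature
    boosts: dict    # token -> amount added to that token's next-step logit


VOCAB = ["a", "b", "c"]

# Each feature boosts exactly the token that activates the next feature.
FEATURES = [
    Feature("A", {"a"}, {"b": 2.0}),
    Feature("B", {"b"}, {"c": 2.0}),
    Feature("C", {"c"}, {"a": 2.0}),
]


def active_features(token):
    """Features activated by the current token."""
    return [f for f in FEATURES if token in f.triggers]


def next_token(token):
    """Greedy next token: sum the boosts of active features, take the argmax."""
    logits = {t: 0.0 for t in VOCAB}
    for feature in active_features(token):
        for target, boost in feature.boosts.items():
            logits[target] += boost
    return max(VOCAB, key=lambda t: logits[t])


def run(start, steps):
    """Generate `steps` tokens after `start`, feeding each output back in."""
    tokens = [start]
    for _ in range(steps):
        tokens.append(next_token(tokens[-1]))
    return tokens


def state_trace(tokens):
    """The 'FSA state' at each position: the names of the active features."""
    return [tuple(f.name for f in active_features(t)) for t in tokens]


if __name__ == "__main__":
    tokens = run("a", 4)
    print(tokens)               # ['a', 'b', 'c', 'a', 'b']
    print(state_trace(tokens))  # [('A',), ('B',), ('C',), ('A',), ('B',)]
```

The features never read each other's activations; the only channel between them is the emitted token, which is what makes the active-feature set at each position look like an automaton state.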

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • Circuits Thread · framework · 0.837
    An open scientific collaboration hosted on the Distill Slack, studying the inner workings of neural networks through zoomed-in mechanistic investigation
  • Framework · concept · 0.819
    1984 Ashton-Tate integrated system with frames, FRED language, and overlapping windows; design reference for Playground's approach.
  • Prior Anthropic paper enabling circuit-level analysis of attention-only transformers; motivates current MLP decomposition
  • Foundational mechanistic interpretability paper on transformer circuit analysis
  • Circuit Finding · method · 0.786
    Interpretability technique for identifying functional sub-circuits in neural networks, supported by pyvene
  • Circuit Motif · concept · 0.781
    A recurring, abstract pattern found in circuits (e.g., equivariance, unioning over cases), inspired by circuit motifs in systems biology
  • Jackson's central framework for understanding software design through concepts as building blocks, their purposes, operational principles, and dependence graphs
  • Framework Template · framework · 0.760