Circuits Framework

Mechanistic interpretability framework for understanding neural network computation as circuits of features

Neighborhood — ranked by edge-count

concept

Finite State Automata Feature Assemblies
associated_with
Collections of features that interact via the token stream: one feature increases the probability of tokens that activate the next feature, so the assembly behaves like a finite state automaton (FSA)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Circuits Thread · framework · 0.837
An open scientific collaboration hosted on Distill Slack studying the inner workings of neural networks via zoomed-in mechanistic investigation
Framework · concept · 0.819
1984 Ashton-Tate integrated system with frames, FRED language, and overlapping windows; design reference for Playground's approach.
A Mathematical Framework for Transformer Circuits · framework · 0.813
Prior Anthropic paper enabling circuit-level analysis of attention-only transformers; motivates current MLP decomposition
A Mathematical Framework for Transformer Circuits (Elhage et al., 2021) · concept · 0.787
Foundational mechanistic interpretability paper on transformer circuit analysis
Circuit Finding · method · 0.786
Interpretability technique for identifying functional sub-circuits in neural networks, supported by pyvene
Circuit Motif · concept · 0.781
A recurring, abstract pattern found in circuits (e.g., equivariance, unioning over cases), inspired by circuit motifs in systems biology
Conceptual Design Framework · framework · 0.772
Jackson's central framework for understanding software design through concepts as building blocks, their purposes, operational principles, and dependence graphs
Framework Template · framework · 0.760
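
The listing above can be read as the result of a simple filter: keep entities whose embedding has cosine similarity of at least 0.65 with this entity and that share no typed edge with it, then sort by score. The following is a minimal sketch of that idea only. The function names, the `embeddings` and `typed_edges` structures, and the toy vectors are all hypothetical; the system that actually produced these scores is not described here.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def untyped_neighbors(target, embeddings, typed_edges, threshold=0.65):
    """Return (name, score) pairs with score >= threshold and no typed edge
    to `target`, sorted from highest to lowest score.

    embeddings:  dict mapping entity name -> vector (hypothetical layout)
    typed_edges: set of (entity_a, entity_b) pairs, treated as undirected
    """
    # Entities already connected to the target by any typed relation.
    linked = {b for a, b in typed_edges if a == target}
    linked |= {a for a, b in typed_edges if b == target}

    scored = []
    for name, vec in embeddings.items():
        if name == target or name in linked:
            continue
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data (illustrative only, not taken from the page above).
toy_embeddings = {
    "A": [1.0, 0.0],
    "B": [1.0, 0.1],   # nearly parallel to A
    "C": [0.0, 1.0],   # orthogonal to A
    "D": [1.0, 1.0],   # 45 degrees from A
}
toy_edges = {("A", "D")}  # D already has a typed edge to A

candidates = untyped_neighbors("A", toy_embeddings, toy_edges)
```

In this toy setup only "B" is returned: "C" falls below the threshold and "D" is excluded because it already has a typed edge. Entries that survive the filter are the ones worth reviewing as new edges or possible duplicates.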