claim
active
claim:features-are-connected-by-weights-forming-circuits-and-these-circuits-can-be-rigorously-studied-and-understood-as-meaningful-algorithms
Features are connected by weights forming circuits, and these circuits can be rigorously studied and understood as meaningful algorithms.
Second of three speculative claims; asserts that subgraphs of neural networks are tractable and meaningful objects of study
Source paper
extracted_from · (2020) · Chris Olah · Nick Cammarata · Ludwig Schubert · Gabriel Goh +2
Neighborhood — ranked by edge-count
Papers (1)
- Zoom In: An Introduction to Circuits · introduces
Findings (2)
- Demonstrates that meaningful algorithms can be read directly off floating-point weights in a neural network
- Evidence that neural networks learn sophisticated invariance mechanisms through structured circuits rather than loose feature aggregation
Claims (3)
- Third of three speculative claims; asserts that learned features are not model-specific but represent universal solutions to learning problems
- First of three speculative claims forming the foundation of the circuits research agenda
- Argument that circuits methodology meets natural-science standards of falsifiability
Questions (1)
- Identified gap linking polysemanticity challenge to disentangled representations literature
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Decoder cosine similarity maps onto concept similarity.
- General principle supported tangentially by covariance pooling work; relates to feature co-occurrence structure.
- Cited as enabling precise behavioral control through SAE features, extending the same methodological line
- Quantitative relationship between concept frequency and feature presence.
- Interpretive claim that circuits render raw weights interpretable as algorithmic structures
- Load-bearing claim about the tractability of circuit analysis; central thesis of Claim 2
- Vision of the emerging paradigm shift in society.
- Architectural requirement from machine learning.
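
The "related by similarity" filter above (cosine ≥ 0.65, no typed edge) can be illustrated with a minimal sketch. It assumes each entity has an embedding vector and that typed edges are stored as a mapping from an entity to its linked neighbors. The entity names, vectors, and the helper `similar_without_edge` are hypothetical and only illustrate the idea; this is not the page's actual implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def similar_without_edge(target, embeddings, typed_edges, threshold=0.65):
    """Return (entity, similarity) pairs at or above the threshold that have
    no typed edge to the target, most similar first."""
    linked = typed_edges.get(target, set())
    results = []
    for name, vec in embeddings.items():
        if name == target or name in linked:
            continue  # skip the entity itself and anything already linked
        sim = cosine(embeddings[target], vec)
        if sim >= threshold:
            results.append((name, sim))
    return sorted(results, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy data: names and vectors are assumptions for illustration.
embeddings = {
    "claim_2": [1.0, 0.0],
    "claim_1": [1.0, 0.1],   # similar, but already connected by a typed edge
    "entity_a": [0.8, 0.6],  # similar and unlinked: a candidate
    "entity_b": [0.0, 1.0],  # dissimilar: filtered out
}
typed_edges = {"claim_2": {"claim_1"}}

candidates = similar_without_edge("claim_2", embeddings, typed_edges)
```

In this toy data only `entity_a` survives the filter: `claim_1` is excluded because it already has a typed edge, and `entity_b` falls below the threshold.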