book

active

book:an-introduction-to-systems-biology-design-principles-of-biological-circuits

An Introduction to Systems Biology: Design Principles of Biological Circuits

Alon's systems biology text that provides the concept of circuit motifs adopted by the circuits agenda

Extracted from this book

Claims (14)

Analogous features and circuits form across models and tasks.
Third of three speculative claims asserting that learned features are not model-specific but represent universal solutions to learning problems
Circuit claims are falsifiable: if you understand a circuit, you should be able to predict what changes when you edit the weights.
Argument that circuits methodology meets natural-science standards of falsifiability
Circuits could act as an epistemic foundation for interpretability by breaking down model behavior into falsifiable statements about small subgraphs.
Normative vision for how the circuits agenda could resolve the pre-paradigmatic state of interpretability
Features are connected by weights forming circuits, and these circuits can be rigorously studied and understood as meaningful algorithms.
Second of three speculative claims asserting that subgraphs of neural networks are tractable and meaningful objects of study
Features are the fundamental unit of neural networks; they correspond to directions and can be rigorously studied and understood.
First of three speculative claims forming the foundation of the circuits research agenda
If the universality hypothesis is broadly true, it raises the exciting possibility that artificial neural networks could predict features previously unknown in biological neural networks.
Speculative extension of universality to neuroscience, with high-low frequency detectors as a candidate prediction
In the long run, studying circuit motifs may be more important than studying individual circuits for understanding neural networks.
Strategic claim about the relative importance of motif-level abstraction over circuit-level analysis
Individual floating-point number weights in neural networks become meaningful once you understand the features they connect.
Interpretive claim that circuits render raw weights interpretable as algorithmic structures
Interpretability today is a pre-paradigmatic field lacking consensus on objects of study, methods, and evaluative standards.
Diagnosis of the state of the interpretability field, drawing on Kuhn's framework
Polysemantic neurons are a major challenge for the circuits agenda, because a neuron with N meanings connected to a neuron with M meanings yields N×M effective connections that cannot be considered individually.
Precise characterization of why polysemanticity poses a combinatorial obstacle to circuit analysis
Qualitative research results can change the world: the discovery of cells was qualitative, just as interpretability research is today.
Historical argument defending qualitative interpretability research against dismissal as unscientific
Superposition exploits the geometry of high-dimensional spaces: an n-dimensional space admits exponentially many almost-orthogonal vectors but only n strictly orthogonal ones.
Mechanistic explanation for why superposition is geometrically feasible
Superposition is in some sense deliberate: the model converts pure neurons into polysemantic neurons to store more features in fewer neurons.
Interpretation of the cars-in-superposition circuit finding as an intentional representational strategy
The typical case is that neurons (or other directions in activation space) are understandable after thousands of hours of study, even when initially mysterious.
Author's interpretive assertion based on extensive empirical investigation, countering texture-only skepticism
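
The superposition claim above rests on a geometric fact that is easy to check numerically. The following is a minimal sketch, not taken from the source: it assumes that randomly drawn unit vectors are a reasonable stand-in for feature directions, and the function names (`random_unit_vector`, `max_abs_cosine`, `interference`) are hypothetical. It packs twice as many vectors as there are dimensions into spaces of growing size and reports the worst pairwise overlap. Random sampling only illustrates that near-orthogonality becomes cheap in high dimensions; it does not demonstrate the exponential count itself.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a random direction by normalizing a Gaussian sample."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def max_abs_cosine(vectors):
    """Largest |cosine similarity| over all distinct pairs of unit vectors."""
    worst = 0.0
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            dot = sum(a * b for a, b in zip(vectors[i], vectors[j]))
            worst = max(worst, abs(dot))
    return worst


def interference(dim, num_vectors, seed=0):
    """Worst-case overlap among num_vectors random directions in dim dimensions."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    vectors = [random_unit_vector(dim, rng) for _ in range(num_vectors)]
    return max_abs_cosine(vectors)


if __name__ == "__main__":
    # Illustrative assumption: store 2x more "features" than dimensions.
    for dim in (4, 32, 128):
        overlap = interference(dim, num_vectors=2 * dim)
        print(f"dim={dim:4d}  vectors={2 * dim:4d}  max |cos| = {overlap:.3f}")
```

With more vectors than dimensions, the overlap can never reach zero, but under these assumptions it should shrink as the dimension grows, which is the slack that superposition is claimed to exploit.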

Findings (7)

Curve-detecting neurons found in every non-trivial vision model carefully examined
Empirical basis for treating curve detectors as a canonical example of meaningful, understandable features
Curve detectors found across AlexNet, InceptionV1, VGG19, ResNetV2-50, and models trained on Places365
Anecdotal evidence for the universality of low-level visual features across different architectures and datasets
High-low frequency detectors found across AlexNet, InceptionV1, VGG19, and ResNetV2-50
Second low-level feature type demonstrating cross-architecture universality
InceptionV1 implements a four-layer circuit for pose-invariant dog head detection with mirrored left/right pathways that inhibit each other then unite, exhibiting XOR-like properties
Evidence that neural networks learn sophisticated invariance mechanisms through structured circuits rather than loose feature aggregation
InceptionV1 neuron 4e:55 responds to three unrelated stimuli: cat faces, fronts of cars, and cat legs
Concrete example of polysemantic neuron demonstrating the challenge to the circuits agenda
InceptionV1 spreads a car feature from a pure car detector in mixed4c across dog detector neurons in the next layer
Circuit-level evidence that polysemantic neurons arise deliberately through superposition rather than entangled computation
Weights between early and full curve detectors in InceptionV1 form a curve of positive weights at tangent positions, with opposing orientations inhibitory
Demonstrates that meaningful algorithms can be read directly off floating-point weights in a neural network

Hypotheses (2)

We hypothesize that high-low frequency detectors, if predicted by artificial neural network universality, might be found in biological neural networks.
Specific cross-domain prediction mentioned by neuroscientists in conversation with the authors
We hypothesize that polysemantic neurons may be resolvable by unfolding networks or training to avoid polysemanticity.
Forward-looking proposal for how the polysemanticity challenge to circuits research might be overcome

Neighborhood — ranked by edge-count

Concepts (1)

concept

Circuit Motif
cites
A recurring, abstract pattern found in circuits (e.g., equivariance, unioning over cases), inspired by circuit motifs in systems biology