finding

active

finding:inceptionv1-spreads-car-feature-from-a-pure-car-detector-in-mixed4c-across-dog-detector-neurons-in-the-next-layer

InceptionV1 spreads the car feature from a pure car detector in mixed4c across dog-detector neurons in the next layer

Circuit-level evidence that polysemantic neurons arise deliberately through superposition rather than through entangled computation

Source paper

extracted_from

Zoom In: An Introduction to Circuits

(2020) · Chris Olah · Nick Cammarata · Ludwig Schubert · Gabriel Goh +2

Neighborhood — ranked by edge-count

Claims (1)

claim

Superposition is in some sense deliberate: the model converts pure neurons into polysemantic neurons to store more features in fewer neurons.
supports
Interpretation of the cars-in-superposition circuit finding as an intentional representational strategy
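
As a rough illustration of what "storing more features in fewer neurons" can mean, the toy sketch below packs three sparse features into two neurons. Everything in it (the feature names, the 120-degree directions, the ReLU readout) is a hypothetical construction for illustration. None of it is taken from InceptionV1's weights or from the source paper.

```python
import math

# Hypothetical toy: 3 features share 2 neurons (not InceptionV1's actual weights).
FEATURES = ["dog_a", "dog_b", "car"]
# Assumed layout: unit directions 120 degrees apart in the 2-neuron activation space.
DIRECTIONS = {
    name: (math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3))
    for k, name in enumerate(FEATURES)
}


def encode(active):
    """Add up the directions of the active features to get 2 neuron activations."""
    n0 = sum(DIRECTIONS[f][0] for f in active)
    n1 = sum(DIRECTIONS[f][1] for f in active)
    return (n0, n1)


def decode(neurons):
    """Read each feature out with a dot product followed by a ReLU."""
    return {
        f: max(0.0, d[0] * neurons[0] + d[1] * neurons[1])
        for f, d in DIRECTIONS.items()
    }


only_car = decode(encode(["car"]))
car_and_dog = decode(encode(["car", "dog_a"]))
```

In this sketch, when only `car` is active the readout recovers it at about 1.0 and both dog features read 0. When `car` and `dog_a` are active together, each reads out at about 0.5. That interference is the cost of sharing neurons, and it stays small only as long as the features rarely co-occur.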

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

InceptionV1 implements a four-layer circuit for pose-invariant dog head detection with mirrored left/right pathways that inhibit each other then unite, exhibiting XOR-like properties · finding · 0.825
Evidence that neural networks learn sophisticated invariance mechanisms through structured circuits rather than loose feature aggregation
InceptionV1 neuron 4e:55 responds to cat faces, fronts of cars, and cat legs as unrelated stimuli · finding · 0.806
Concrete example of polysemantic neuron demonstrating the challenge to the circuits agenda
Weights between early and full curve detectors in InceptionV1 form a curve of positive weights at tangent positions, with opposing orientations inhibitory · finding · 0.783
Demonstrates that meaningful algorithms can be read directly off floating-point weights in a neural network
Curve detectors found across AlexNet, InceptionV1, VGG19, ResNetV2-50 and models trained on Places365 · finding · 0.766
Anecdotal evidence for the universality of low-level visual features across different architectures and datasets
High-low frequency detectors found across AlexNet, InceptionV1, VGG19, and ResNetV2-50 · finding · 0.739
Second low-level feature type demonstrating cross-architecture universality
Lindsey 2025: frontier models can detect and report changes in their own internal activations via concept injection experiments, demonstrating functional introspective awareness · finding · 0.734
Prior finding cited as convergent evidence for LLM self-awareness capacities
The likelihood of a dedicated feature for a concept (element, city, animal, food) follows a sigmoid in log-frequency of the concept in training data, with threshold frequency inversely proportional to number of alive features · finding · 0.726
Quantitative relationship between concept frequency and feature presence.
Target morphology shifts occur despite the fact that all of the individual cells have unaltered normal genomes, showing that competent subunits can be pushed to implement diverse organism-scale goals by physiological signals · claim · 0.721
Highlights the non-genetic control of large-scale anatomy.
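
The selection rule stated for this list (cosine ≥ 0.65, no typed edge) can be sketched as follows. This is a minimal illustration under assumptions: the function names, the toy two-dimensional embeddings, and the representation of typed edges as unordered pairs are all hypothetical, and the descending sort simply mirrors the order in which the scores appear above.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def related_by_similarity(target, embeddings, typed_edges, threshold=0.65):
    """List (entity, score) pairs at or above the threshold that have no typed edge to target."""
    candidates = []
    for name, vec in embeddings.items():
        # Skip the target itself and anything already linked by a typed edge.
        if name == target or frozenset((target, name)) in typed_edges:
            continue
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            candidates.append((name, score))
    # Highest similarity first.
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: "c" is similar enough but already has a typed edge.
embeddings = {"t": (1.0, 0.0), "a": (1.0, 0.5), "b": (0.0, 1.0), "c": (1.0, 1.0)}
typed_edges = {frozenset(("t", "c"))}
result = related_by_similarity("t", embeddings, typed_edges)
```

With this toy data only `a` survives: `b` falls below the threshold, and `c` is excluded because it already has a typed edge to the target.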