hypothesis

active

hypothesis:we-hypothesize-that-polysemantic-neurons-may-be-resolvable-by-unfolding-networks-or-training-to-avoid-polysemanticity

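We hypothesize that polysemantic neurons may be resolvable, either by "unfolding" a network so that its polysemantic neurons become pure features, or by training networks so that they do not exhibit polysemanticity in the first place.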

Forward-looking proposal for how the polysemanticity challenge to circuits research might be overcome

Source paper

extracted_from

Zoom In: An Introduction to Circuits

(2020) · Chris Olah · Nick Cammarata · Ludwig Schubert · Gabriel Goh +2

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Polysemantic neurons are a major challenge for the circuits agenda, because N meanings in one neuron times M in another creates N×M effective connections that cannot be considered individually. · claim · 0.829
Precise characterization of why polysemanticity poses a combinatorial obstacle to circuit analysis
No established method for resolving polysemantic neurons into pure features at scale · question · 0.827
Identified gap linking polysemanticity challenge to disentangled representations literature
Training models with sparse activations cannot fully prevent polysemanticity because cross-entropy loss creates incentives for polysemantic neurons even without superposition · claim · 0.807
Author's conclusion after extensive investigation of architectural approaches to monosemanticity
Polysemantic Neuron · concept · 0.793
A neuron that responds to multiple unrelated inputs, posing a major challenge for circuit-level interpretation
Models with 1-hot activation sparsity still have polysemantic neurons; a single neuron trained on 4 mutually exclusive features prefers a polysemantic representation with loss ~0.7 vs 0.8 · finding · 0.786
Counter-example disproving that architectural sparsity alone can prevent polysemanticity
Superposition is in some sense deliberate: the model converts pure neurons into polysemantic neurons to store more features in fewer neurons. · claim · 0.780
Interpretation of the cars-in-superposition circuit finding as an intentional representational strategy
Learning neural networks can enable 'chunking' and rescale problem-solving to higher organizational levels, a mechanism intrinsic to transitions in individuality. · hypothesis · 0.761
The remarkable ability of neurons to unify toward a centralized self is an evolutionary pivot of far earlier cell communication strategies that first solved problems in navigating anatomical morphospace. · hypothesis · 0.760
Proposes an evolutionary trajectory linking morphogenesis to neural cognition.
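
The "cosine ≥ 0.65 · no typed edge" filter behind the list above can be illustrated with a minimal sketch. This page does not say how entity embeddings are produced or stored, so the vectors, the entity names, and the helper functions `cosine` and `related_by_similarity` below are hypothetical; only the 0.65 threshold and the "no typed edge" condition come from the page itself.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def related_by_similarity(target_id, embeddings, typed_edges, threshold=0.65):
    """Return (entity_id, score) pairs at or above the threshold that have
    no typed edge to the target, sorted by descending score."""
    target = embeddings[target_id]
    hits = []
    for other_id, vec in embeddings.items():
        if other_id == target_id:
            continue
        # Skip entities already linked by a typed relation (e.g. extracted_from).
        if frozenset((target_id, other_id)) in typed_edges:
            continue
        score = cosine(target, vec)
        if score >= threshold:
            hits.append((other_id, score))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy 2-D embeddings (hypothetical; real embeddings would be much higher-dimensional).
embeddings = {
    "hypothesis": [1.0, 0.0],
    "claim": [0.8, 0.6],      # similar to the hypothesis
    "paper": [1.0, 0.0],      # identical direction, but already has a typed edge
    "unrelated": [0.0, 1.0],  # orthogonal, so below the threshold
}
typed_edges = {frozenset(("hypothesis", "paper"))}

result = related_by_similarity("hypothesis", embeddings, typed_edges)
print(result)
```

In this toy setup only "claim" is returned: "paper" is excluded because it already has a typed edge, and "unrelated" falls below the threshold.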