hypothesis
active
We hypothesize that polysemantic neurons may be resolvable by unfolding networks or training to avoid polysemanticity.
Forward-looking proposal for how the polysemanticity challenge to circuits research might be overcome
Source paper
extracted_from · (2020) · Chris Olah · Nick Cammarata · Ludwig Schubert · Gabriel Goh +2
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Precise characterization of why polysemanticity poses a combinatorial obstacle to circuit analysis
- Identified gap linking polysemanticity challenge to disentangled representations literature
- Author's conclusion after extensive investigation of architectural approaches to monosemanticity
- A neuron that responds to multiple unrelated inputs, posing a major challenge for circuit-level interpretation
- Counter-example disproving that architectural sparsity alone can prevent polysemanticity
- Interpretation of the cars-in-superposition circuit finding as an intentional representational strategy
- Proposes an evolutionary trajectory linking morphogenesis to neural cognition.