hypothesis
active
hypothesis:superposition-hypothesis-neural-networks-represent-more-features-than-dimensions-using-almost-orthogonal-directionsSuperposition hypothesis: neural networks represent more features than dimensions using almost-orthogonal directions.
Explanation for why dictionary learning can recover many more features than dimensions.
Source paper
extracted_fromRelated by similarity (8)
cosine ≥ 0.65 · no typed edgeEntities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.
- Linear representation hypothesis: neural networks represent meaningful concepts as directions in their activation spaces.hypothesis0.867Foundation for interpreting features as linear directions.
- Interpretation of the cars-in-superposition circuit finding as an intentional representational strategy
- Speculative extension of universality to neuroscience, with high-low frequency detectors as a candidate prediction
- Load-bearing theoretical claim providing the conceptual foundation for DAS.
- Mechanistic explanation for why superposition is geometrically feasible
- The paper's central thesis statement, presented prominently after the abstract
- Extends convergence argument to brain-machine alignment
- Theoretical model of how neural networks encode more features than dimensions, informing linear representation work.