hypothesis
active
hypothesis:linear-representation-hypothesis-neural-networks-represent-meaningful-concepts-as-directions-in-their-activation-spaces
Linear representation hypothesis: neural networks represent meaningful concepts as directions in their activation spaces.
Foundation for interpreting features as linear directions.
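To make "concept as a direction" concrete, the following is a minimal sketch on synthetic data, not a method taken from the source paper. It assumes a hypothetical 64-dimensional activation space in which a concept, when present, shifts activations along one fixed unit vector. It then estimates that direction with a difference of class means, which is one common heuristic among several. All names (`concept_dir`, `acts_pos`, `acts_neg`) and the shift size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64   # hypothetical activation width (assumption)
n = 500  # synthetic samples per class (assumption)

# Ground-truth concept direction, used only to generate the toy data.
concept_dir = rng.normal(size=d)
concept_dir /= np.linalg.norm(concept_dir)

# Synthetic activations: isotropic noise, plus a shift along
# concept_dir when the concept is present.
acts_pos = rng.normal(size=(n, d)) + 3.0 * concept_dir
acts_neg = rng.normal(size=(n, d))

# Difference-of-means estimate of the concept direction.
est_dir = acts_pos.mean(axis=0) - acts_neg.mean(axis=0)
est_dir /= np.linalg.norm(est_dir)

# Under the linear hypothesis, the concept is read off by projecting
# an activation onto the direction (a single dot product per sample).
proj_pos = acts_pos @ est_dir
proj_neg = acts_neg @ est_dir

cosine_to_truth = float(est_dir @ concept_dir)
print(f"cosine(estimated, true) = {cosine_to_truth:.3f}")
print(f"mean projection, concept present: {proj_pos.mean():.2f}")
print(f"mean projection, concept absent:  {proj_neg.mean():.2f}")
```

In this toy setting the estimated direction aligns closely with the generating one, and the projection separates the two groups. Real model activations need not behave this cleanly, which is exactly what the hypothesis asserts and what empirical work has to test.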
Source paper
extracted_from
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood that have no typed relation to this one. They are candidates for new edges or may be unrecognized duplicates.
- Superposition hypothesis: neural networks represent more features than dimensions using almost-orthogonal directions. · hypothesis · 0.867 · Explanation for why dictionary learning can recover many more features than dimensions.
- The paper's concluding summary statement asserting the deep interpretive significance of representation geometry.
- The paper's central thesis statement, presented prominently after the abstract.
- Load-bearing theoretical claim providing the conceptual foundation for DAS.
- Extends the convergence argument to brain-machine alignment.
- Interpretive claim about what linear DAS results actually tell us.
- Opening sentence framing the paper's core inquiry.
- Neural Representations of Location Composed of Spatially Periodic Bands (Krupic et al., 2012) · concept · 0.812 · Discovery of band cells; TEM-t also recapitulates these representations.