quote

active

Smolensky (1986) proposes that viewing a neural representation under a basis that is not aligned with individual neurons can reveal the interpretable distributed structure of the neural representations.

Load-bearing theoretical claim providing the conceptual foundation for DAS.

Source paper

extracted_from

Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations

(2023) · Atticus Geiger · Zhengxuan Wu · Christopher Potts · Thomas Icard +1

Neighborhood — ranked by edge-count

Concepts (1)

concept

Distributed Neural Representations
supports
Representations where individual neurons play multiple conceptual roles; patterns consisting of linear combinations of unit vectors.
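As a toy illustration of the definition above, the sketch below builds a hypothetical two-neuron layer in which a binary variable is written along a diagonal direction. Everything in it (the layer size, the direction, the noise model, the variable names) is an assumption made for illustration and does not come from the source paper. It shows only that a rotated basis can isolate a variable that no single neuron isolates.

```python
import numpy as np

# Hypothetical setup: 200 samples from a 2-neuron layer.
rng = np.random.default_rng(0)
variable = rng.integers(0, 2, size=200)   # interpretable binary variable
nuisance = rng.normal(size=200)           # unrelated variation

# The variable lives along a diagonal direction, so each neuron mixes
# the variable with the nuisance signal.
direction = np.array([1.0, 1.0]) / np.sqrt(2)
orthogonal = np.array([1.0, -1.0]) / np.sqrt(2)
activations = np.outer(variable, direction) + np.outer(nuisance, orthogonal)

# Change of basis: the rows of `rotation` are the new basis vectors.
rotation = np.stack([direction, orthogonal])
rotated = activations @ rotation.T

# Correlation of the variable with a single neuron vs. a rotated coordinate.
neuron_corr = abs(np.corrcoef(activations[:, 0], variable)[0, 1])
rotated_corr = abs(np.corrcoef(rotated[:, 0], variable)[0, 1])
print(f"neuron 0: {neuron_corr:.2f}, rotated coordinate 0: {rotated_corr:.2f}")
```

In this construction the first rotated coordinate equals the variable exactly, while each neuron is only partially correlated with it.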

Methods (1)

method

Distributed Alignment Search
supports
The core method introduced in this paper: finds alignments between high-level causal variables and distributed neural representations via gradient descent.
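The intervention that such an alignment is evaluated with can be sketched as follows. This is a minimal sketch, not the paper's implementation: the function name `rotated_interchange`, the two-dimensional vectors, and the hand-picked 45-degree rotation are all hypothetical. The gradient-descent step that would learn the rotation is omitted, so the rotation is simply fixed here.

```python
import numpy as np

def rotated_interchange(base, source, rotation, k):
    """Replace the first k rotated coordinates of `base` with those of `source`.

    `rotation` is assumed to be an orthogonal matrix, so its transpose
    maps rotated coordinates back to the neuron basis.
    """
    base_rot = rotation @ base
    source_rot = rotation @ source
    mixed = base_rot.copy()
    mixed[:k] = source_rot[:k]    # swap in the aligned subspace
    return rotation.T @ mixed     # return to the neuron basis

# Hand-picked rotation (a learned one would take its place).
theta = np.pi / 4
rotation = np.array([[np.cos(theta), np.sin(theta)],
                     [-np.sin(theta), np.cos(theta)]])

base = np.array([1.0, 0.0])     # hidden state on the base input
source = np.array([0.0, 2.0])   # hidden state on the source input
patched = rotated_interchange(base, source, rotation, k=1)
print(patched)
```

The patched vector agrees with the source on the first rotated coordinate and with the base on the second, even though both of its neuron-level entries differ from the base.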

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Linear representation hypothesis: neural networks represent meaningful concepts as directions in their activation spaces. · hypothesis · 0.817
Foundation for interpreting features as linear directions.
Superposition hypothesis: neural networks represent more features than dimensions using almost-orthogonal directions. · hypothesis · 0.816
Explanation for why dictionary learning can recover many more features than dimensions.
Neural Representations of Location Composed of Spatially Periodic Bands (Krupic et al., 2012) · concept · 0.804
Discovery of band cells; TEM-t also recapitulates these representations.
Neural networks show substantial alignment with biological representations in the brain, driven by shared task and data constraints · claim · 0.803
Extends convergence argument to brain-machine alignment
Neurons can correspond to interpretable functional roles but interpretations in terms of individual neurons are unlikely to be the most parsimonious · claim · 0.798
Claim from footnote 3, acknowledging neuron-level interpretability while arguing subcomponents are better.
Investigating the causal substructure of neural representations is necessary to avoid misidentifying data structures of simpler representations as abstract concepts · claim · 0.796
Motivated by the finding that lexical entailment decomposes into word identities.
Neuroscience and mechanistic interpretability have not yet made enough progress to identify neural correlates marking necessary and sufficient conditions of conscious experience in both brains and neural networks. · claim · 0.795
Paper explicitly identifies this as a current gap requiring alternative experimental approaches
Neural representations carry rich geometric structure; but does that structure causally shape behavior? · quote · 0.794
Opening sentence framing the paper's core inquiry.