claim
active
claim:das-overcomes-the-localist-limitation-of-prior-causal-abstraction-by-allowing-individual-neurons-to-play-multiple-roles-via-non-standard-bases
DAS overcomes the localist limitation of prior causal abstraction by allowing individual neurons to play multiple roles via non-standard bases
Central claim motivating DAS over prior methods.
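To make the claim concrete, here is a minimal sketch of why a non-standard basis matters. It is a toy illustration, not the paper's implementation: the two-neuron hidden state, the hand-picked 45-degree rotation, and all function names (`encode`, `decode`, `localist_swap`, `distributed_swap`) are assumptions made for this example. In DAS the rotation would be learned rather than fixed by hand.

```python
import math

# Assumption: a fixed 45-degree rotation stands in for the learned rotation.
THETA = math.pi / 4


def rotate(vec, theta):
    """Rotate a 2-D vector counter-clockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = vec
    return (c * x - s * y, s * x + c * y)


def encode(a, b):
    """Toy hidden state: variables (a, b) stored along rotated axes,
    so each neuron carries a mix of both variables."""
    return rotate((a, b), THETA)


def decode(hidden):
    """Read (a, b) back out by undoing the rotation."""
    return rotate(hidden, -THETA)


def localist_swap(base, source, neuron):
    """Interchange intervention in the standard (neuron) basis."""
    out = list(base)
    out[neuron] = source[neuron]
    return tuple(out)


def distributed_swap(base, source, dim):
    """Interchange intervention in the rotated basis:
    rotate, swap one coordinate, rotate back."""
    rotated_base = list(decode(base))
    rotated_source = decode(source)
    rotated_base[dim] = rotated_source[dim]
    return rotate(tuple(rotated_base), THETA)


base = encode(1.0, 2.0)    # base input:   a=1, b=2
source = encode(5.0, 7.0)  # source input: a=5, b=7

das_result = decode(distributed_swap(base, source, 0))
loc_result = decode(localist_swap(base, source, 0))

print("rotated-basis swap:", tuple(round(v, 6) for v in das_result))
print("single-neuron swap:", tuple(round(v, 6) for v in loc_result))
```

In this toy setup the rotated-basis swap replaces `a` with the source value and leaves `b` untouched (approximately `(5, 2)`), whereas swapping a single neuron disturbs both variables at once (approximately `(0.5, 2.5)`). That is the sense in which a localist alignment can miss structure that a distributed one recovers.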
Source paper
extracted_from (2023) · Atticus Geiger · Zhengxuan Wu · Christopher Potts · Thomas Icard +1
Neighborhood — ranked by edge-count
Papers (1)
paper
Findings (4)
finding
- DAS achieves 100% IIA on the hierarchical equality task with |N|=16, intervention size 8, Layer 1 (supports): DAS discovers a perfect alignment between the feed-forward network and the Both Equality Relations high-level model.
- Perfect abstraction relation between BERT and symbolic algorithm with negation and lexical entailment variables.
- Shows localist alignment fails to capture the distributed structure found by DAS.
- Localist methods fail entirely on MoNLI distributed representations.
Questions (1)
question
- Framing question for the paper's research program.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; these are candidates for new edges or unrecognized duplicates.
- Replication of Wu et al. 2023 finding; DAS expressivity concern validated in CausalGym setup
- Fundamental theoretical claim motivating DAS, attributed to Smolensky/Rumelhart/McClelland.
- Historical framing of how representation assumptions have evolved in causal interpretability
- Concluding claim about theoretical significance of the hierarchical equality finding.
- Load-bearing theoretical claim providing the conceptual foundation for DAS.
- Load-bearing formulation of the paper's central argument
- Explains why time and sequence are essential for generated complexity.
- Authors' interpretation connecting their proof to practical interpretability methodology