quote
active
causal abstraction implicitly relies on strong assumptions about how features are encoded in deep neural networks (DNNs), and becomes trivial without such assumptions
Load-bearing formulation of the paper's central argument
Source paper
extracted_from · (2025) · Sutter, Denis · Minder, Julian · Hofmann, Thomas · Pimentel, Tiago
Neighborhood — ranked by edge-count
Claims (1)
claim
- Central thesis of the paper
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Authors' interpretation connecting their proof to practical interpretability methodology
- Circular dependency problem raised in discussion
- Methodological claim about the scientific value of combining causal abstraction with representational geometry analysis
- Historical framing of how representation assumptions have evolved in causal interpretability
- What is the connection between information encoding assumptions and causal abstraction? · question · 0.804 · Identified as exciting future work direction
- Motivated by the finding that lexical entailment decomposes into word identities.
- Central claim motivating DAS over prior methods.
- Vision statement in the conclusion.