claim
active
claim:causal-abstraction-implicitly-relies-on-strong-assumptions-about-feature-encoding-in-dnns-and-becomes-trivial-without-such-assumptions
Causal abstraction implicitly relies on strong assumptions about feature encoding in DNNs, and becomes trivial without such assumptions
Authors' interpretation connecting their proof to practical interpretability methodology
Source paper
extracted_from · (2025) · Sutter, Denis · Minder, Julian · Hofmann, Thomas · Pimentel, Tiago
Neighborhood — ranked by edge-count
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Load-bearing formulation of the paper's central argument
- Circular dependency problem raised in discussion
- What is the connection between information encoding assumptions and causal abstraction? · question · 0.840 · Identified as exciting future work direction
- Central thesis of the paper
- Methodological claim about the scientific value of combining causal abstraction with representational geometry analysis
- Historical framing of how representation assumptions have evolved in causal interpretability
- Replication of Wu et al. 2023 finding; DAS expressivity concern validated in CausalGym setup
- Cited as enabling precise behavioral control through SAE features, extending the same methodological line