claim
active
claim:causal-abstraction-implicitly-relies-on-strong-assumptions-about-feature-encoding-in-dnns-and-becomes-trivial-without-such-assumptions

Causal abstraction implicitly relies on strong assumptions about feature encoding in DNNs, and becomes trivial without such assumptions

Authors' interpretation connecting their proof to practical interpretability methodology

Source paper

extracted_from
The Non-Linear Representation Dilemma: Is Causal Abstraction Enough for Mechanistic Interpretability?
(2025) · Sutter, Denis · Minder, Julian · Hofmann, Thomas · Pimentel, Tiago
