claim
active
claim:the-effect-of-alignment-map-complexity-on-iia-in-causal-abstraction-is-an-analogue-of-the-probing-complexity-accuracy-trade-off
The effect of alignment map ϕ complexity on IIA in causal abstraction is an analogue of the probing complexity–accuracy trade-off
The authors connect their finding to the earlier debate in the probing literature
Source paper
extracted_from · (2025) · Sutter, Denis · Minder, Julian · Hofmann, Thomas · Pimentel, Tiago
Neighborhood — ranked by edge-count
Concepts (1)
concept
- Longstanding debate from probing literature about whether complex probes reveal genuine encodings or just memorise; this paper revives it for causal abstraction
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Empirical support for vacuousness of unrestricted causal abstraction
- Demonstrates that high IIA can be obtained even when the model cannot solve the task
- Central thesis of the paper
- Key empirical result: non-linear maps overcome linear maps' failure in deeper layers
- Replicates Geiger et al. 2024b pattern of layer-dependent IIA degradation with linear maps
- Methodological claim about the scientific value of combining causal abstraction with representational geometry analysis
- Replication of Wu et al. 2023 finding; DAS expressivity concern validated in CausalGym setup
- Hypothesis raised in distributive law task analysis