finding
active
Over 80% IIA achieved using complex non-linear alignment maps on randomly initialised MLPs in hierarchical equality task
Demonstrates that high interchange intervention accuracy (IIA) can be obtained even when the model cannot solve the task
Source paper
extracted_from · (2025) · Sutter, Denis · Minder, Julian · Hofmann, Thomas · Pimentel, Tiago
Neighborhood — ranked by edge-count
Claims (1)
claim
- Empirical support for vacuousness of unrestricted causal abstraction
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Key empirical result: non-linear maps overcome linear maps' failure in deeper layers
- Replicates Geiger et al. 2024b pattern of layer-dependent IIA degradation with linear maps
- Hypothesis raised in distributive law task analysis
- Linear alignment map ϕ_lin IIA tracks DNN accuracy during Pythia-410m training progression on IOI task · finding · 0.811 · Suggests linear maps may be better correlated with genuine task implementation than non-linear maps
- Exception to the general trend; attributed to insufficient RevNet capacity rather than to the algorithm not being implemented
- Corroborating result on additional task confirming main paper findings
- Authors connect their finding to the prior probing literature debate
- Shows high IIA on random models depends on entity overlap; generalisation is essential for genuine interpretation