finding
active
finding:linear-alignment-map-lin-shows-substantial-iia-decrease-in-third-layer-for-both-equality-relations-and-left-equality-relation-algorithms-in-hierarchical-equality-task
Linear alignment map ϕ_lin shows a substantial IIA decrease in the third layer for the Both Equality Relations and Left Equality Relation algorithms in the hierarchical equality task
Replicates Geiger et al. 2024b pattern of layer-dependent IIA degradation with linear maps
Source paper
extracted_from (2025) · Sutter, Denis · Minder, Julian · Hofmann, Thomas · Pimentel, Tiago
Neighborhood — ranked by edge-count
Papers (1)
paper
Findings (1)
finding
- Key empirical result: non-linear maps overcome linear maps' failure in deeper layers
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Demonstrates that high IIA can be obtained even when model cannot solve the task
- Linear alignment map ϕ_lin IIA tracks DNN accuracy during Pythia-410m training progression on IOI task · finding · 0.813 · Suggests linear maps may be better correlated with genuine task implementation than non-linear maps
- Exception to the general trend; attributed to insufficient RevNet capacity rather than algorithm not being implemented
- Best localist alignment achieves IIA of 0.73 on hierarchical equality Both Equality Relations in Layer 1 · finding · 0.804 · Shows localist alignment fails to capture the distributed structure found by DAS.
- Alignment map ϕ(h)=W_orth*h using orthogonal matrix; assumes linear representation hypothesis
- Authors connect their finding to the prior probing literature debate
- Corroborating result on additional task confirming main paper findings
- Brute-force search achieves best IIA of 0.60 on hierarchical equality Both Equality Relations in Layer 1 · finding · 0.766 · DAS substantially outperforms brute-force search (1.00 vs 0.60 IIA) on the hierarchical equality task.
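
The orthogonal alignment map listed above, ϕ(h) = W_orth·h, can be illustrated with a minimal NumPy sketch. The function names, the hidden size, the random (untrained) matrix, and the choice to swap the first k rotated coordinates are all assumptions made for illustration; they are not details taken from the source paper.

```python
import numpy as np

def random_orthogonal(d, seed=0):
    # Hypothetical helper: QR decomposition of a Gaussian matrix
    # yields an orthogonal factor Q. A real alignment would learn W.
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    return q

def interchange(h_base, h_source, W, k):
    # Rotate both hidden states with phi(h) = W h, copy the first k
    # rotated coordinates from the source run into the base run,
    # then rotate back with W^T (the inverse of an orthogonal W).
    z_base = W @ h_base
    z_source = W @ h_source
    z_base[:k] = z_source[:k]
    return W.T @ z_base

d, k = 8, 3  # assumed hidden size and subspace size
W = random_orthogonal(d)
rng = np.random.default_rng(1)
h_base = rng.standard_normal(d)
h_source = rng.standard_normal(d)
h_new = interchange(h_base, h_source, W, k)
```

In this sketch, IIA would be the fraction of (base, source) pairs for which the model's output after patching `h_new` matches the output the high-level algorithm predicts under the same intervention; computing it requires the model and the algorithm, which are outside the sketch.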