finding
active
finding:localist-alignment-achieves-0-51-iia-on-monli-tasks-near-chance-performance
Localist alignment achieves ~0.51 IIA on MoNLI tasks, near chance performance
Localist alignment methods reach only ~0.51 IIA on MoNLI, near chance, and so fail to capture the distributed representations there.
Source paper
extracted_from (2023) · Atticus Geiger · Zhengxuan Wu · Christopher Potts · Thomas Icard +1
Neighborhood — ranked by edge-count
Claims (1)
claim
- Central claim motivating DAS over prior methods.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Best localist alignment achieves IIA of 0.73 on hierarchical equality Both Equality Relations in Layer 1 · finding · 0.833 · Shows localist alignment fails to capture the distributed structure found by DAS.
- Baseline that finds the axis-aligned orthogonal matrix closest to the learned distributed rotation, assuming disjoint neuron groups.
- Demonstrates that high IIA can be obtained even when model cannot solve the task
- DAS substantially outperforms brute-force search on MoNLI across all models.
- Empirical support for vacuousness of unrestricted causal abstraction
- Open question the authors leave unresolved about interpreting the magnitude of their alignment measurements
- Quantitative bound on observed alignment; raises the open question of whether this gap reflects noise or real misalignment
- Algorithm that extracts a localist (axis-aligned) approximation from any learned orthogonal rotation matrix for baseline comparison.
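
The localist baseline described in the entries above (an axis-aligned approximation of a learned rotation, with disjoint neuron groups) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the source paper's procedure: the function name `localist_groups`, the convention that rows of the rotation index neurons and columns index rotated dimensions, and the greedy "largest squared weight" assignment rule are all hypothetical choices made for this example.

```python
import numpy as np


def localist_groups(rotation, subspace_sizes):
    """Assign each neuron to at most one rotated subspace (hypothetical helper).

    Assumptions (not taken from the source): `rotation` is an n x n
    orthogonal matrix whose rows index neurons and whose columns index
    rotated dimensions; the first `subspace_sizes[0]` columns belong to
    subspace 0, the next `subspace_sizes[1]` to subspace 1, and so on.
    Remaining columns are treated as unaligned.
    """
    rotation = np.asarray(rotation, dtype=float)
    bounds = np.cumsum([0] + list(subspace_sizes))
    if bounds[-1] > rotation.shape[1]:
        raise ValueError("subspace sizes exceed the number of rotated dimensions")

    # One column block per aligned subspace, plus one block for leftovers.
    blocks = [rotation[:, bounds[k]:bounds[k + 1]] for k in range(len(subspace_sizes))]
    blocks.append(rotation[:, bounds[-1]:])

    # mass[i, k] = squared weight that neuron i places on block k.
    mass = np.stack([(block ** 2).sum(axis=1) for block in blocks], axis=1)

    # Greedy rule: each neuron goes to the block holding most of its mass,
    # so the resulting groups are disjoint by construction.
    groups = mass.argmax(axis=1)
    groups[groups == len(subspace_sizes)] = -1  # -1 marks unaligned neurons
    return groups


# A slightly rotated 2-neuron basis still maps neuron 0 -> subspace 0
# and neuron 1 -> subspace 1 under this rule.
theta = 0.1
small_rotation = np.array([[np.cos(theta), -np.sin(theta)],
                           [np.sin(theta), np.cos(theta)]])
example_groups = localist_groups(small_rotation, [1, 1])
```

When the learned rotation mixes neurons heavily, every row spreads its mass across several blocks, and a hard assignment of this kind discards most of the structure. That is one plausible way to read why a localist approximation can fall to near-chance IIA while the distributed alignment does not.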