finding
active
finding:mas-iia-is-low-for-gru-hidden-states-vs-transformer-hidden-states-on-multi-object-task-consistent-with-anti-markovian-transformer-solution
MAS IIA is low for GRU hidden states vs Transformer hidden states on Multi-Object task, consistent with anti-Markovian transformer solution
Validates MAS as a causal detector of representational differences invisible to correlative methods.
Neighborhood — ranked by edge-count
Claims (1)
claim
- Prior finding from Grant et al. 2025 used to interpret low MAS IIA for GRU-Transformer hidden state comparisons.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Demonstrates MAS's ability to bidirectionally transfer behavior where RSA shows low embedding correlation.
- Prior work shows transformers use anti-Markovian solutions; MAS correctly shows low IIA reflecting this, while RSA/CKA do not detect it.
- Shows MAS can compare specific numeric variables across tasks with different domains/codomains.
- Case study showing MAS can compare specific causal information types across models trained on different tasks.
- MAS reduces number of required alignment matrices for n-model comparison from n(n-1) or n^2 (stitching) to n · finding · 0.747 · Key computational efficiency advantage of MAS over traditional model stitching for multi-model comparisons.
- Evidence that model stitching can exploit the behavioral null space, making it less causally restrictive than MAS.
- Demonstrates the value of the CL auxiliary loss for recovering causal alignments when one model cannot be intervened upon.
- Qualifies the arithmetic alignment results; supports hypothesis that Arithmetic GRUs use different numeric representations than incremental counting.
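The alignment-matrix counts quoted in the list above (n(n-1) or n^2 for stitching versus n for MAS) can be made concrete with a small sketch. The function names and the `include_self` flag are hypothetical, introduced only for illustration; the sketch assumes stitching needs one matrix per ordered (source, target) model pair and MAS needs one matrix per model, as the entry states.

```python
def stitching_matrices(n_models: int, include_self: bool = False) -> int:
    """Matrices for pairwise stitching: one per ordered (source, target) pair.

    include_self=True also counts each model paired with itself, giving n^2.
    (Hypothetical helper; the counting rule is an assumption from the entry.)
    """
    if include_self:
        return n_models * n_models
    return n_models * (n_models - 1)


def mas_matrices(n_models: int) -> int:
    """Matrices for MAS: one alignment matrix per model (assumed from the entry)."""
    return n_models


# Compare how the two counts grow with the number of models.
for n in (2, 5, 10):
    print(n, stitching_matrices(n), stitching_matrices(n, include_self=True), mas_matrices(n))
```

Under these assumptions the stitching count grows quadratically with the number of models while the MAS count grows linearly, which is the efficiency advantage the entry describes.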