finding

active

finding:mas-iia-is-low-for-gru-hidden-states-vs-transformer-hidden-states-on-multi-object-task-consistent-with-anti-markovian-transformer-solution

MAS IIA is low for GRU hidden states vs Transformer hidden states on Multi-Object task, consistent with anti-Markovian transformer solution

Validates MAS as a causal detector of representational differences invisible to correlative methods.

Source paper

extracted_from

Model Alignment Search

(2025) · Satchel Grant

Neighborhood — ranked by edge-count

Claims (1)

claim

Transformers use an anti-Markovian solution that recomputes relevant numeric information at each step in the Multi-Object task
supports
Prior finding from Grant et al. 2025 used to interpret low MAS IIA for GRU-Transformer hidden state comparisons.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

MAS successfully aligns behavior between Multi-Object GRU models in both embedding and hidden state layers with high IIA · finding · 0.874
Demonstrates MAS's ability to bidirectionally transfer behavior where RSA shows low embedding correlation.
CKA and RSA show potentially unintuitive (over-estimated) hidden state similarity for GRU-Transformer comparisons on Multi-Object task · finding · 0.836
Prior work shows transformers use anti-Markovian solutions; MAS correctly shows low IIA reflecting this, while RSA/CKA do not detect it.
MAS successfully aligns the Count variable from Multi-Object GRUs with the Rem Ops variable from Arithmetic GRUs with moderate IIA · finding · 0.768
Shows MAS can compare specific numeric variables across tasks with different domains/codomains.
MAS reveals that numeric representations differ between GRUs trained on Multi-Object, Rounding, and Modulo tasks · finding · 0.763
Case study showing MAS can compare specific causal information types across models trained on different tasks.
MAS reduces the number of required alignment matrices for an n-model comparison from n(n-1) or n^2 (stitching) to n · finding · 0.747
Key computational efficiency advantage of MAS over traditional model stitching for multi-model comparisons.
Model stitching achieves nearly perfect IIA even for rank-2 transformation matrices on Multi-Object GRU models · finding · 0.746
Evidence that model stitching can exploit the behavioral null space, making it less causally restrictive than MAS.
CLMAS achieves the best IIA in the causally inaccessible (No Access) direction while matching MAS in the accessible direction · finding · 0.745
Demonstrates the value of the CL auxiliary loss for recovering causal alignments when one model cannot be intervened upon.
MAS IIA for Count vs Low CumuVal (values 1-10) is higher than Count vs full CumuVal, but still lower than Count vs Rem Ops · finding · 0.742
Qualifies the arithmetic alignment results; supports hypothesis that Arithmetic GRUs use different numeric representations than incremental counting.
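
The matrix-count finding listed above (n(n-1) or n^2 for stitching versus n for MAS) can be checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and reading n(n-1) as one matrix per ordered pair of distinct models and n^2 as also counting self-pairs is an assumption, not something this page states.

```python
def stitching_matrix_count(n: int, include_self: bool = False) -> int:
    """Count matrices for pairwise stitching across n models.

    Assumption: one matrix per ordered (source, target) pair, which gives
    n * (n - 1) for distinct models, or n * n if self-pairs are included.
    """
    return n * n if include_self else n * (n - 1)


def mas_matrix_count(n: int) -> int:
    """Count matrices for MAS across n models, per the finding: n in total."""
    return n


if __name__ == "__main__":
    # Print the three counts side by side for a few model-set sizes.
    for n in (2, 3, 5, 10):
        print(
            n,
            stitching_matrix_count(n),
            stitching_matrix_count(n, include_self=True),
            mas_matrix_count(n),
        )
```

Under these assumptions the stitching count grows quadratically with the number of models while the MAS count grows linearly, which is the efficiency advantage the finding describes.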