finding

active

finding:model-stitching-achieves-nearly-perfect-iia-even-for-rank-2-transformation-matrices-on-multi-object-gru-models

Model stitching achieves nearly perfect IIA even for rank-2 transformation matrices on Multi-Object GRU models

Evidence that model stitching can exploit the behavioral null space, making it less causally restrictive than MAS.

Source paper

extracted_from

Model Alignment Search

(2025) · Satchel Grant

Neighborhood — ranked by edge-count

Claims (1)

claim

Model stitching can use the behavioral null space of the source model when mapping to the target, making successful stitching insufficient evidence of representational similarity
associated_with · supports
Formal analysis showing the theoretical limitation of model stitching as a similarity measure.
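
The claim above can be illustrated with a small linear toy, which is only a sketch under stated assumptions and not the paper's experiment. It uses no GRUs and no IIA metric. The names `z`, `A`, `B`, `V_out`, `W_out`, `P_null` and `S` are hypothetical and introduced here for illustration. The source state carries the task variable twice: once in dimensions its readout uses, and once in dimensions its readout ignores (its behavioral null space). A rank-2 stitching map that reads only from the ignored dimensions still reproduces the source behavior through the target readout.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 6, 2, 500  # hidden size, task-variable size, number of samples (all assumed)

# Latent task variable that determines behavior.
z = rng.normal(size=(n, k))

# Source hidden state: first k dims are read by the source readout,
# the remaining d-k dims also encode z but are ignored by the readout.
A = np.array([[2.0, 1.0], [0.0, 1.0]])          # invertible code used for behavior
B = rng.normal(size=(d - k, k))                 # redundant code in the null space
h_src = np.concatenate([z @ A.T, z @ B.T], axis=1)

# Source readout only touches the first k dims, so y_src == z.
V_out = np.concatenate([np.linalg.inv(A), np.zeros((k, d - k))], axis=1)
y_src = h_src @ V_out.T

# Target readout: an arbitrary rank-k linear map from the target hidden state.
W_out = rng.normal(size=(k, d))

# Stitching map that reads ONLY the source's behavioral null space.
P_null = np.concatenate([np.zeros((d - k, k)), np.eye(d - k)], axis=1)
S = np.linalg.pinv(W_out) @ np.linalg.pinv(B) @ P_null

# Behavior of the target readout when fed stitched source states.
y_stitched = (h_src @ S.T) @ W_out.T

max_err = float(np.max(np.abs(y_stitched - y_src)))
rank_S = int(np.linalg.matrix_rank(S))
reads_behavioral = float(np.max(np.abs(S[:, :k])))

print(f"max behavioral error: {max_err:.2e}")
print(f"rank of stitching map: {rank_S}")
print(f"largest weight on behaviorally used source dims: {reads_behavioral}")
```

In this toy the stitched behavior matches even though `S` puts zero weight on the dimensions the source model actually uses, which is why a successful stitch alone does not show that the behaviorally relevant representations are similar.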

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

MAS successfully aligns behavior between Multi-Object GRU models in both embedding and hidden state layers with high IIA · finding · 0.785
Demonstrates MAS's ability to bidirectionally transfer behavior where RSA shows low embedding correlation.
MAS reduces number of required alignment matrices for n-model comparison from n(n-1) or n^2 (stitching) to n · finding · 0.759
Key computational efficiency advantage of MAS over traditional model stitching for multi-model comparisons.
MAS IIA is low for GRU hidden states vs Transformer hidden states on Multi-Object task, consistent with anti-Markovian transformer solution · finding · 0.746
Validates MAS as a causal detector of representational differences invisible to correlative methods.
Zero-shot model stitching without learning a stitching layer is feasible across different text models trained on different modalities · finding · 0.736
Moschella et al. result cited as evidence of representational convergence across models
Zero-shot model stitching without a learned stitching layer is feasible because different text models embed data in remarkably similar ways · claim · 0.736
Strong evidence for representational alignment across models
Near-perfect IIA can be achieved on randomly initialised models that cannot solve the task, suggesting causal alignment does not require task capability · claim · 0.732
Empirical support for vacuousness of unrestricted causal abstraction
MAS is a more causally focused choice than model stitching for addressing questions of how behaviorally relevant information is encoded in different neural systems · claim · 0.727
Core interpretive claim supported by the formal analysis showing MAS does not exploit the behavioral null space unlike stitching.
CKA and RSA show potentially unintuitive (over-estimated) hidden state similarity for GRU-Transformer comparisons on Multi-Object task · finding · 0.722
Prior work shows transformers use anti-Markovian solutions; MAS correctly shows low IIA reflecting this, while RSA/CKA do not detect it.