finding

active

finding:cka-and-rsa-show-potentially-unintuitive-over-estimated-hidden-state-similarity-for-gru-transformer-comparisons-on-multi-object-task

CKA and RSA show potentially unintuitive (over-estimated) hidden state similarity for GRU-Transformer comparisons on Multi-Object task

Prior work indicates that transformers use anti-Markovian solutions. MAS reflects this with a low interchange intervention accuracy (IIA), whereas RSA and CKA do not detect the difference.
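To make the CKA side of this finding concrete, here is a minimal sketch of linear CKA in its standard feature-space form, using only the Python standard library. It is not the source paper's implementation; the function names (center_columns, cross_frobenius_sq, linear_cka) are hypothetical, and inputs are assumed to be row-per-stimulus activation matrices given as lists of lists.

```python
import math


def center_columns(X):
    """Subtract each feature's mean so every column of X sums to zero."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[v - m for v, m in zip(row, means)] for row in X]


def cross_frobenius_sq(X, Y):
    """Squared Frobenius norm of X^T Y, accumulated column pair by column pair."""
    total = 0.0
    for x_col in zip(*X):
        for y_col in zip(*Y):
            dot = sum(a * b for a, b in zip(x_col, y_col))
            total += dot * dot
    return total


def linear_cka(X, Y):
    """Linear CKA between two activation matrices with the same number of rows.

    X and Y may have different numbers of features (columns).
    """
    Xc, Yc = center_columns(X), center_columns(Y)
    numerator = cross_frobenius_sq(Xc, Yc)
    denominator = math.sqrt(
        cross_frobenius_sq(Xc, Xc) * cross_frobenius_sq(Yc, Yc)
    )
    return numerator / denominator
```

Linear CKA is invariant to orthogonal transformations and isotropic scaling of either input, so a rotated and rescaled copy of a representation scores approximately 1.0. A high score therefore indicates shared geometry over the probed stimuli; by itself it does not indicate that activations can be causally swapped between models, which is what IIA measures.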

Source paper

extracted_from

Model Alignment Search

(2025) · Satchel Grant

Neighborhood — ranked by edge-count

Claims (1)

claim

Correlative methods like RSA and CKA are insufficient for determining functional similarity between neural systems; causal methods are necessary
associated_with · supports
Central motivating claim of the paper; supported by empirical comparisons showing RSA/CKA miss Markovian differences detectable by MAS.

Frameworks (1)

framework

Representational Similarity Analysis (RSA)
associated_with
A correlational similarity method compared against MAS; uses RDM correlations between model representations.
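The RDM-correlation idea can be sketched as follows, using only the Python standard library. This is an illustrative sketch, not the paper's code: the Euclidean distance choice and the function names (rdm_upper, average_ranks, pearson, rsa_score) are assumptions made for the example. Each model gets a representational dissimilarity matrix (RDM) over the same stimuli, and the two RDMs are compared with Spearman's rank correlation.

```python
import math
from itertools import combinations


def rdm_upper(acts):
    """Upper triangle of the RDM: Euclidean distance for each stimulus pair."""
    return [math.dist(a, b) for a, b in combinations(acts, 2)]


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rsa_score(acts_a, acts_b):
    """Spearman correlation between the two models' RDMs (same stimuli, same order)."""
    return pearson(
        average_ranks(rdm_upper(acts_a)),
        average_ranks(rdm_upper(acts_b)),
    )
```

Because only pairwise distances enter the score, the two models may have different hidden sizes, and a rotated or uniformly rescaled copy of a representation scores approximately 1.0. The score summarizes correlation between geometries and involves no intervention on either model.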

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

MAS IIA is low for GRU hidden states vs Transformer hidden states on Multi-Object task, consistent with anti-Markovian transformer solution · finding · 0.836
Validates MAS as a causal detector of representational differences invisible to correlative methods.
RSA shows low RDM correlation on embedding layers for GRU-GRU comparisons, despite high within-seed functional similarity · finding · 0.817
Demonstrates RSA's sensitivity issue in embedding layers; attributed partly to Spearman rank handling of RDMs with differing relative extrema.
MAS successfully aligns behavior between Multi-Object GRU models in both embedding and hidden state layers with high IIA · finding · 0.796
Demonstrates MAS's ability to bidirectionally transfer behavior where RSA shows low embedding correlation.
CKA shows a very weak trend of alignment between models even within modality, compared to mutual k-NN which shows stronger trends · finding · 0.781
Explains why mutual k-NN was chosen over CKA as primary metric.
Transformers are recurrent through autoregression because K/V stream provides horizontal information flow across positions. · claim · 0.749
Claim formalizing the Anima Labs idea that transformers are effectively recurrent due to K/V stream.
Larger hidden representations create more random structure that DAS can search through, allowing manipulation of counterfactual behavior even in randomly initialized networks · hypothesis · 0.749
Tested in Section 4.4 calibration experiment; confirmed by findings.
Spearman's rank correlation among different alignment metrics (CKA, SVCCA, Mutual k-NN, CKNNA) over 78 vision models is high across variants, with all p-values below 2.24×10^-105 · finding · 0.744
Validates robustness of alignment metric choice.
Different introspective tasks may preferentially use different path distributions in the transformer. · claim · 0.744
Interpretive claim connecting exponential path combinatorics to Lindsey's layer-dependent findings.