finding
active
finding:gru-behavior-can-be-compressed-to-as-few-as-4-dimensions-using-das-and-mas-with-comparable-iias
GRU behavior can be compressed to as few as 4 dimensions using DAS and MAS with comparable IIAs
Shows that behaviorally relevant information is low-dimensional. This contrasts with model stitching, which achieves near-perfect IIA at rank 2.
Neighborhood — ranked by edge-count
Claims (1)
claim
- Core interpretive claim supported by the formal analysis showing MAS does not exploit the behavioral null space unlike stitching.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Demonstrates MAS's ability to bidirectionally transfer behavior where RSA shows low embedding correlation.
- Case study showing MAS can compare specific causal information types across models trained on different tasks.
- Demonstrates RSA's sensitivity issue in embedding layers; attributed partly to Spearman rank handling of RDMs with differing relative extrema.
- Shows MAS can compare specific numeric variables across tasks with different domains/codomains.
- DAS behavioral loss produces EMD along feature dimensions of 0.032±0.003 on synthetic 10-class dataset · finding · 0.742 · Quantitative baseline for divergence using behavioral DAS loss on synthetic dataset
- Prior work shows transformers use anti-Markovian solutions; MAS correctly shows low IIA reflecting this, while RSA/CKA do not detect it.
- Validates MAS as a causal detector of representational differences invisible to correlative methods.
- GRUs trained on the Arithmetic task use different types of numeric representations than incremental counting models · hypothesis · 0.731 · Interpretive hypothesis supported by the lower IIA between Count and Cumu Val variables even in the restricted value range.