finding

active

finding:mas-reveals-that-numeric-representations-differ-between-grus-trained-on-multi-object-rounding-and-modulo-tasks

MAS reveals that numeric representations differ between GRUs trained on Multi-Object, Rounding, and Modulo tasks

Case study showing MAS can compare specific causal information types across models trained on different tasks.

Source paper

extracted_from

Model Alignment Search

(2025) · Satchel Grant

Neighborhood — ranked by edge-count

Hypotheses (1)

hypothesis

GRUs trained on the Arithmetic task use different types of numeric representations than incremental counting models
supports
Interpretive hypothesis supported by the lower IIA between Count and Cumu Val variables even in the restricted value range.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

MAS successfully aligns the Count variable from Multi-Object GRUs with the Rem Ops variable from Arithmetic GRUs with moderate IIA · finding · 0.835
Shows MAS can compare specific numeric variables across tasks with different domains/codomains.
MAS successfully aligns behavior between Multi-Object GRU models in both embedding and hidden state layers with high IIA · finding · 0.812
Demonstrates MAS's ability to bidirectionally transfer behavior where RSA shows low embedding correlation.
There are fewer representations competent for N tasks than M<N tasks, so training more general models should yield fewer possible solutions · hypothesis · 0.793
Selective pressure toward convergence via task generality
How do representations differ or converge between architectures, tasks, and modalities? · question · 0.787
Broader research question MAS is positioned to address, citing multiple recent works.
GRU behavior can be compressed to as few as 4 dimensions using DAS and MAS with comparable IIAs · finding · 0.782
Shows that behaviorally relevant information is low-dimensional; contrasted with model stitching achieving near-perfect IIA at rank 2.
We hypothesize that degraded generalization on benchmarks like MMLU may reflect the computational demands of the tasks. · hypothesis · 0.771
Connecting the paper's task-difficulty findings to prior observations of weak generalization on complex QA benchmarks.
There is a bidirectional relationship between the geometry of representation and behavior across tasks and modalities. · claim · 0.768
Author’s interpretive claim that the shared geometry is general and robust.
MAS reduces number of required alignment matrices for n-model comparison from n(n-1) or n^2 (stitching) to n · finding · 0.768
Key computational efficiency advantage of MAS over traditional model stitching for multi-model comparisons.
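
The matrix counts in the last entry can be made concrete with a short sketch. The function names below are hypothetical and are not taken from the paper. The reading that n^2 corresponds to also counting each model paired with itself is an assumption. The code only evaluates the three formulas quoted above.

```python
def stitching_matrices(n: int, include_self: bool = False) -> int:
    """Alignment matrices for stitching across n models.

    n * (n - 1) counts every ordered pair of distinct models.
    n * n additionally counts self-pairs (assumed reading of the n^2 figure).
    """
    return n * n if include_self else n * (n - 1)


def mas_matrices(n: int) -> int:
    """Alignment matrices for MAS across n models: n, per the entry above."""
    return n


if __name__ == "__main__":
    # Print the three counts side by side for a few model-set sizes.
    for n in (2, 5, 10):
        print(n, stitching_matrices(n), stitching_matrices(n, True), mas_matrices(n))
```

For 10 models this gives 90 or 100 matrices under stitching against 10 under MAS, so the gap grows linearly with the number of models compared.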