claim

active

claim:mas-is-a-more-causally-focused-choice-than-model-stitching-for-addressing-questions-of-how-behaviorally-relevant-information-is-encoded-in-different-neural-systems

MAS is a more causally focused choice than model stitching for addressing questions of how behaviorally relevant information is encoded in different neural systems

Core interpretive claim, supported by the formal analysis showing that MAS, unlike stitching, does not exploit the behavioral null space.

Source paper

extracted_from

Model Alignment Search

(2025) · Satchel Grant

Neighborhood — ranked by edge-count

Findings (2)

finding

GRU behavior can be compressed to as few as 4 dimensions using DAS and MAS with comparable IIAs
supports
Shows that behaviorally relevant information is low-dimensional; contrasts with model stitching, which achieves near-perfect IIA at rank 2.
MAS reduces the number of alignment matrices required for an n-model comparison from n(n-1) or n^2 (stitching) to n
supports
Key computational efficiency advantage of MAS over traditional model stitching for multi-model comparisons.
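
The matrix counts in the finding above can be checked with a few lines of arithmetic. This is a minimal sketch, not code from the paper: the function names `stitching_maps` and `mas_maps` are hypothetical, and it assumes that stitching learns one map per ordered pair of models (n(n-1)), or n^2 if same-model maps are also counted, while MAS learns one matrix per model.

```python
def stitching_maps(n_models: int, include_self: bool = False) -> int:
    """Maps needed if every ordered pair of models gets its own stitching matrix.

    Assumption: include_self=True also counts same-model pairs, giving n^2.
    """
    if include_self:
        return n_models * n_models
    return n_models * (n_models - 1)


def mas_maps(n_models: int) -> int:
    """Matrices needed if each model gets a single alignment matrix."""
    return n_models


# Compare the growth of the two schemes for a few model counts.
for n in (2, 3, 5, 10):
    print(n, stitching_maps(n), stitching_maps(n, include_self=True), mas_maps(n))
```

Under these assumptions, comparing 10 models would take 90 (or 100) pairwise stitching maps against 10 per-model matrices.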

Claims (1)

claim

Model stitching can use the behavioral null space of the source model when mapping to the target, making successful stitching insufficient evidence of representational similarity
supports
Formal analysis showing the theoretical limitation of model stitching as a similarity measure.
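
A toy illustration of the null-space point above, under strong simplifying assumptions: a two-dimensional source state, a linear source readout, and a one-row linear stitching map. All names and values here are hypothetical and chosen only for illustration; this is not the paper's formal analysis.

```python
def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical source readout: behavior depends only on the first coordinate,
# so the second coordinate spans the source's behavioral null space.
source_readout = [1.0, 0.0]

h = [0.7, -0.3]
h_shifted = [0.7, 5.0]  # same first coordinate, different null-space coordinate

# Moving along the null space leaves the source's behavior untouched.
source_behavior_unchanged = dot(source_readout, h) == dot(source_readout, h_shifted)

# Hypothetical stitching map (a single row) that reads only the null-space coordinate.
stitch_row = [0.0, 1.0]
stitched_output_changed = dot(stitch_row, h) != dot(stitch_row, h_shifted)

print(source_behavior_unchanged, stitched_output_changed)
```

In this sketch the stitched input to the target varies with information the source never uses behaviorally, which is why a successful stitch need not imply that the two models share behaviorally relevant representations.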

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

MAS-like methods could potentially be used to directly constrain model internals to be non-toxic · claim · 0.787
Speculative forward-looking claim about practical applications of MAS for model alignment.
Using more than two models in a MAS comparison could harm alignment due to conflicting loss gradients, or could assist in isolating causal subspaces · hypothesis · 0.785
Open question raised in the paper about scaling MAS beyond two models.
Including within-model interventions (i=j) in MAS training adds a soft constraint encouraging separation of causal from extraneous subspaces · claim · 0.777
Methodological claim about why within-model interchange interventions are essential to the MAS training procedure.
MAS successfully aligns behavior between Multi-Object GRU models in both embedding and hidden state layers with high IIA · finding · 0.774
Demonstrates MAS's ability to bidirectionally transfer behavior where RSA shows low embedding correlation.
Model Alignment Search (MAS) · framework · 0.763
The primary contribution of the paper: a bidirectional causal method that learns rotation matrices for each model to uncover and compare causally relevant latent subspaces across neural networks.
Zero-shot model stitching without a learned stitching layer is feasible because different text models embed data in remarkably similar ways · claim · 0.757
Strong evidence for representational alignment across models.
Zero-shot model stitching without learning a stitching layer is feasible across different text models trained on different modalities · finding · 0.754
Moschella et al. result cited as evidence of representational convergence across models.
When a model discovers that its outputs produce effects, it accelerates learning through in-context learning, analogous to lucid dreaming. · claim · 0.753
Describes scaffolding method and the model's meta-learning loop.