Model Stitching

A technique for measuring representational compatibility by feeding the intermediate representations of one model into another.
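
The idea can be illustrated with a minimal sketch, assuming two toy models that each split into a "front" (input to hidden) and a "back" (hidden to output). Every name here (`front_a`, `front_b`, `back_b`, `fit_stitch`) is hypothetical, and the stitching layer is fitted by least squares on activations, which is a simplification: stitching layers are often trained on the downstream task loss instead. Model B's hidden space is constructed as a rotated copy of model A's, so a linear stitch should recover it almost exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions and weights (assumptions for illustration only).
d_in, d_hid, d_out = 8, 16, 4
W_a = rng.normal(size=(d_in, d_hid))
R = np.linalg.qr(rng.normal(size=(d_hid, d_hid)))[0]  # random orthogonal matrix
V_b = rng.normal(size=(d_hid, d_out))


def front_a(x):
    """Model A up to the stitching point."""
    return np.tanh(x @ W_a)


def front_b(x):
    """Model B up to the stitching point: a rotated copy of A's hidden space."""
    return np.tanh(x @ W_a) @ R


def back_b(h):
    """Model B after the stitching point."""
    return h @ V_b


def fit_stitch(h_src, h_tgt):
    """Least-squares linear map S minimizing ||h_src @ S - h_tgt||."""
    S, *_ = np.linalg.lstsq(h_src, h_tgt, rcond=None)
    return S


x_train = rng.normal(size=(512, d_in))
x_test = rng.normal(size=(128, d_in))

# Fit the stitching layer on training activations.
S = fit_stitch(front_a(x_train), front_b(x_train))

# Stitched model: A's front, the stitching layer, then B's back.
stitched_out = back_b(front_a(x_test) @ S)
native_out = back_b(front_b(x_test))

# A small gap suggests the two hidden spaces are linearly compatible.
stitch_error = float(np.mean((stitched_out - native_out) ** 2))
print(f"stitching error: {stitch_error:.3e}")
```

With real networks the error would generally not vanish; how much performance the stitched model loses relative to the native one is the quantity of interest.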

Neighborhood — ranked by edge-count

paper

framework

Model Alignment Search (MAS)
extends
The primary contribution of the paper: a bidirectional causal method that learns rotation matrices for each model to uncover and compare causally relevant latent subspaces across neural networks.

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Model Surgery · method · 0.814
Edits MLP weights for all layers to modify model behavior; used by Abdelnabi & Salem to decrease verbalized evaluation awareness.
Zero-Shot Model Stitching · concept · 0.814
Model stitching without learning a stitching layer, demonstrating strong alignment across different model training regimes.
model · concept · 0.803
A representation that captures relevant aspects of a system; according to the theorem, the regulator must embody this.
Model Editing · concept · 0.797
Technique for modifying model knowledge or behavior via targeted interventions, e.g., ROME by Meng et al.
Toy Models · concept · 0.772
Latent Stitch · method · 0.770
Baseline method using a single orthogonal matrix trained to map source latents to target latents via CL auxiliary loss without behavioral objective.
model selection · concept · 0.769
Comparing models using log-evidence approximated by free energy.
Model Steering · concept · 0.762
Using interventions to guide model generation behavior, e.g., adding sentiment vectors at inference time.