Zero-Shot Model Stitching

Model stitching without learning a stitching layer, demonstrating strong alignment across different model training regimes
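
This entry does not say how the stitching layer is avoided. One known route, assumed here, is the relative-representation idea from the Moschella et al. line of work: each latent is re-expressed as its cosine similarities to a shared set of anchor samples, so a decoder trained on one encoder's relative space can read another's. The sketch below is a toy illustration under that assumption. The helper name `relative_representation`, the synthetic latents, and the least-squares readout standing in for a decoder are all hypothetical. Encoder B is simulated as a rotated and rescaled copy of encoder A, which is the idealized case where the two relative spaces coincide.

```python
import numpy as np


def relative_representation(latents, anchors):
    """Re-express each latent as cosine similarities to the anchor latents."""
    latents = latents / np.linalg.norm(latents, axis=1, keepdims=True)
    anchors = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return latents @ anchors.T


rng = np.random.default_rng(0)

# Hypothetical latents from "encoder A" for 200 samples in 16 dimensions.
latents_a = rng.normal(size=(200, 16))

# Assumption: "encoder B" differs from A only by a rotation and a rescaling.
rotation, _ = np.linalg.qr(rng.normal(size=(16, 16)))
latents_b = 3.0 * latents_a @ rotation

# The same 16 samples serve as anchors in both latent spaces.
anchor_idx = np.arange(16)
rel_a = relative_representation(latents_a, latents_a[anchor_idx])
rel_b = relative_representation(latents_b, latents_b[anchor_idx])

# Stand-in "decoder": a linear readout fit only on encoder A's relative space.
labels = (latents_a[:, 0] > 0).astype(float)
weights, *_ = np.linalg.lstsq(rel_a, labels, rcond=None)

# Zero-shot stitch: reuse the same readout on encoder B, with no extra training.
pred_a = rel_a @ weights > 0.5
pred_b = rel_b @ weights > 0.5

agreement = float(np.mean(pred_a == pred_b))
max_gap = float(np.max(np.abs(rel_a - rel_b)))
print(f"prediction agreement: {agreement:.3f}, max relative-space gap: {max_gap:.2e}")
```

Real encoders are not exact rotations of one another, so in practice the two relative spaces only approximately match, and the quality of the stitch depends on the choice of anchors.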

Neighborhood — ranked by edge-count

claim

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Zero-shot model stitching without learning a stitching layer is feasible across different text models trained on different modalities · finding · 0.874
Moschella et al. result cited as evidence of representational convergence across models
Model Stitching · method · 0.814
Technique to measure representational compatibility by integrating intermediate representations of one model into another
Zero-Shot Generalization · concept · 0.756
Ability to predict correctly for stimulus-action pairs never previously experienced by inferring structural rules; key measure for TEM-t performance.
Model stitching can use the behavioral null space of the source model when mapping to the target, making successful stitching insufficient evidence of representational similarity · claim · 0.753
Formal analysis showing the theoretical limitation of model stitching as a similarity measure.
zero-shot prediction · concept · 0.740
Prediction without task-specific training; Evee achieves 0.991 AUROC on indels in zero-shot mode.
Stitch (baseline) · method · 0.738
Baseline model stitching trained in a single behavioral direction without CL auxiliary loss, used for comparison with CLMAS.
Latent Stitch · method · 0.727
Baseline method using a single orthogonal matrix trained to map source latents to target latents via CL auxiliary loss without behavioral objective.
Zero-Shot Control Condition · concept · 0.727
Control omitting any induction and presenting only the final experiential query