concept
active
concept:model-weight-merging

Model Weight Merging

The phenomenon in which separately trained models of the same architecture converge to the same basin of the loss landscape and can be merged.
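As a rough illustration of what "merged" can mean here, the sketch below linearly interpolates two sets of parameters that share names and shapes. This is plain weight averaging, chosen for illustration only; the entry does not say which merging method is meant. The names `merge_weights`, `model_a`, `model_b`, and `alpha` are hypothetical, and parameters are represented as flat lists of floats to keep the example dependency-free.

```python
def merge_weights(weights_a, weights_b, alpha=0.5):
    """Linearly interpolate two parameter dicts with identical keys and lengths.

    alpha=0.0 returns the values of weights_a, alpha=1.0 those of weights_b.
    Assumption: both dicts map parameter names to flat lists of floats.
    """
    if weights_a.keys() != weights_b.keys():
        raise ValueError("models must share the same parameter names")
    merged = {}
    for name, a in weights_a.items():
        b = weights_b[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise convex combination of the two parameter vectors.
        merged[name] = [(1.0 - alpha) * x + alpha * y for x, y in zip(a, b)]
    return merged


# Toy parameters standing in for two separately trained models.
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}
merged = merge_weights(model_a, model_b)
```

Whether the interpolated parameters perform well depends on the two models actually lying in the same basin, which is the claim this concept describes; the code itself does not check that.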

Neighborhood — ranked by edge-count

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • Model Misalignment · concept · 0.733
    The phenomenon of model internals deviating from desired behavior; MAS is demonstrated to detect this via comparison of toxic vs nontoxic LLMs.
  • Model Evidence · concept · 0.719
    Probability of data under the model, penalizing complexity and rewarding accuracy.
  • model · concept · 0.718
    A representation that captures relevant aspects of a system; according to the theorem, the regulator must embody this.
  • Model Deception · concept · 0.716
    LLM behavior of generating falsehoods; the multi-dimensional truth subspace raises new risks of subtle manipulation.
  • Equal Weighting · framework · 0.714
    Baseline multi-task learning (MTL) approach that minimizes the sum of task losses with equal weights; suffers from task-balancing problems.
  • The core idea of decomposing weight matrices into components for interpretability.
  • Core claim distinguishing this paper's contribution from looser representational similarity arguments.
  • Big Two Model · framework · 0.702
    Meta-trait model that groups the OCEAN traits into stability (C, A, reversed N) and plasticity (E, O); used to evaluate covariance patterns from injections.