concept
active
concept:model-editing

Model Editing

A technique for modifying a model's knowledge or behavior through targeted interventions, e.g., ROME (Rank-One Model Editing) by Meng et al.
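To make "targeted intervention" concrete, here is a minimal sketch of a rank-one weight edit on a single linear map. It is a simplified illustration, not ROME itself: ROME additionally weights the update by key covariance statistics and optimizes the target value, both of which are omitted here. The names `rank_one_edit` and `matvec` are hypothetical helpers introduced only for this example.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def rank_one_edit(W, k, v):
    """Return W' = W + (v - W k) k^T / (k^T k), so that W' k == v.

    Assumption: plain least-change rank-one update with identity
    covariance; directions orthogonal to k are left untouched.
    """
    k_norm_sq = sum(x * x for x in k)
    if k_norm_sq == 0:
        raise ValueError("key vector must be non-zero")
    # Residual between the desired output and the current output for key k.
    residual = [v_i - wk_i for v_i, wk_i in zip(v, matvec(W, k))]
    # Add the outer product residual * k^T / ||k||^2 to W.
    return [
        [w_ij + r_i * k_j / k_norm_sq for w_ij, k_j in zip(row, k)]
        for row, r_i in zip(W, residual)
    ]


if __name__ == "__main__":
    W = [[1.0, 0.0], [0.0, 1.0]]
    edited = rank_one_edit(W, k=[1.0, 0.0], v=[3.0, 4.0])
    print(matvec(edited, [1.0, 0.0]))  # the edited key now maps to v
    print(matvec(edited, [0.0, 1.0]))  # an orthogonal input is unchanged
```

The design point this illustrates is locality: the update changes the output for one chosen key while leaving inputs orthogonal to that key unaffected.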

Neighborhood — ranked by edge-count

Concepts (2)

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • model · concept · 0.835
    A representation that captures relevant aspects of a system; according to the theorem, the regulator must embody this.
  • Language Models · concept · 0.800
    Primary substrate for manifold steering experiments; demonstrates method on reasoning and in-context tasks.
  • Model Stitching · method · 0.797
    Technique to measure representational compatibility by integrating intermediate representations of one model into another.
  • Model Surgery · method · 0.788
    Edits MLP weights for all layers to modify model behavior; used by Abdelnabi & Salem to decrease verbalized evaluation awareness.
  • model selection · concept · 0.784
    Comparing models using log-evidence approximated by free energy.
  • Language Model · concept · 0.784
    Primary test domain for manifold steering, including reasoning and ICL tasks.
  • Technique to alter model behavior by directly editing a parameter subcomponent without training, demonstrated by changing an emoticon eye subcomponent.
  • Actors Model · framework · 0.774
    A message-passing concurrency model where processes (actors) communicate via messages (talks) and generate new processes; related to concurrent objects.
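The selection rule behind the similarity list above (cosine ≥ 0.65, no typed edge) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the system's actual implementation: the function names, the dictionary of embeddings, and the representation of typed edges as a set of name pairs are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_candidates(target, embeddings, typed_edges, threshold=0.65):
    """List entities similar to `target` that have no typed edge to it.

    embeddings:  dict mapping entity name -> embedding vector (assumed)
    typed_edges: set of (name, name) pairs, treated as undirected (assumed)
    Returns (name, score) pairs sorted by descending cosine similarity.
    """
    # Entities already connected to the target by a typed relation.
    linked = {b for a, b in typed_edges if a == target}
    linked |= {a for a, b in typed_edges if b == target}

    scored = [
        (name, cosine(embeddings[target], vec))
        for name, vec in embeddings.items()
        if name != target and name not in linked
    ]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy vectors, invented purely for illustration.
    embeddings = {
        "a": [1.0, 0.0],
        "b": [1.0, 0.1],
        "c": [0.0, 1.0],
        "d": [1.0, 0.5],
    }
    typed_edges = {("a", "d")}
    for name, score in similarity_candidates("a", embeddings, typed_edges):
        print(name, round(score, 3))
```

Entities returned by such a filter are candidates only: a high score may indicate a missing relation, a duplicate entry, or merely a shared word such as "model".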