concept
active
concept:intervenable-model

Intervenable Model

A pyvene class (IntervenableModel) that wraps a PyTorch model with hooks so that activations can be collected and overwritten during forward passes.

Neighborhood — ranked by edge-count

Concepts (2)

  • Dict-based configuration format in pyvene that specifies which model components will be intervened upon
  • Two types of hooks implemented by IntervenableModel to save and set activations during forward passes
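
The two ideas above (a dict that names the component to intervene on, and separate hooks for saving and setting activations) can be illustrated with a dependency-free toy. This is a minimal sketch under stated assumptions, not pyvene or PyTorch code: ToyLayer, ToyModel, attach_hook, and the "component" / "mode" config keys are all hypothetical names invented for illustration.

```python
from typing import Callable, Dict, List, Optional

Activation = List[float]
# A hook sees a layer's output; returning a list replaces that output.
Hook = Callable[[Activation], Optional[Activation]]


class ToyLayer:
    """Hypothetical stand-in for one model component: scales each value."""

    def __init__(self, scale: float) -> None:
        self.scale = scale
        self.hooks: List[Hook] = []

    def __call__(self, x: Activation) -> Activation:
        out = [v * self.scale for v in x]
        for hook in self.hooks:
            replacement = hook(out)
            if replacement is not None:  # a setter hook supplied a new activation
                out = replacement
        return out


class ToyModel:
    """Hypothetical two-layer model; layers run in insertion order."""

    def __init__(self) -> None:
        self.layers: Dict[str, ToyLayer] = {
            "layer0": ToyLayer(2.0),
            "layer1": ToyLayer(10.0),
        }

    def __call__(self, x: Activation) -> Activation:
        for layer in self.layers.values():
            x = layer(x)
        return x


def attach_hook(model: ToyModel, config: Dict[str, str],
                cache: Dict[str, Activation]) -> None:
    """Attach a getter or setter hook as described by a (hypothetical) config dict."""
    name = config["component"]
    layer = model.layers[name]
    if config["mode"] == "collect":
        def getter(out: Activation) -> None:
            cache[name] = list(out)  # save the activation, leave output unchanged
        layer.hooks.append(getter)
    elif config["mode"] == "overwrite":
        def setter(out: Activation) -> Activation:
            return list(cache[name])  # replace the activation with the saved one
        layer.hooks.append(setter)
    else:
        raise ValueError(f"unknown mode: {config['mode']!r}")


model = ToyModel()
cache: Dict[str, Activation] = {}

# Baseline: run the base input with no hooks attached.
clean = model([5.0, 5.0])

# Source run: collect the layer0 activation for a different input.
attach_hook(model, {"component": "layer0", "mode": "collect"}, cache)
model([1.0, 2.0])
model.layers["layer0"].hooks.clear()

# Base run again, now overwriting layer0 with the activation from the source run.
attach_hook(model, {"component": "layer0", "mode": "overwrite"}, cache)
patched = model([5.0, 5.0])
```

In this toy, the patched run produces the output the source input would have produced, because everything downstream of layer0 sees the swapped-in activation. A real wrapper would register such hooks on the underlying torch modules rather than on a hand-rolled layer class.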

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • model · concept · 0.744
    A representation that captures relevant aspects of a system; according to the theorem, the regulator must embody this.
  • Formal representation of algorithms as directed acyclic graphs computing functions f_A
  • Language Model · concept · 0.723
    Primary test domain for manifold steering, including reasoning and ICL tasks
  • Model Evidence · concept · 0.720
    Probability of data under the model, penalizing complexity and rewarding accuracy.
  • model selection · concept · 0.719
    Comparing models using log-evidence approximated by free energy.
  • Toy Models · concept · 0.718
  • Formal Model · method · 0.716
  • Reasoning Models · concept · 0.716
    Class of large language models designed to produce extended chain-of-thought before answering, studied in this paper