Intervenable Model

A pyvene class that wraps a PyTorch model with hooks so that activations can be collected and overwritten.

Neighborhood — ranked by edge-count

Intervenable Configuration (uses)
Dict-based configuration format in pyvene that outlines which model components will be intervened upon.

Getter and Setter Hooks (implements)
Two types of hooks implemented by IntervenableModel to save and set activations during forward passes.
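
The getter/setter idea described above can be illustrated with plain PyTorch forward hooks. This is a minimal sketch, not pyvene's API: it uses only `torch.nn.Module.register_forward_hook`, and the toy model, the `saved` dict, and the names `getter_hook` and `make_setter_hook` are hypothetical, chosen for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy model (hypothetical); the first Linear layer is the intervention site.
model = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))
target = model[0]

saved = {}

def getter_hook(module, inputs, output):
    # Getter: save a copy of the activation. Returning None leaves the
    # module's output unchanged.
    saved["act"] = output.detach().clone()

def make_setter_hook(replacement):
    def setter_hook(module, inputs, output):
        # Setter: a tensor returned from a forward hook replaces the
        # module's output for the rest of the forward pass.
        return replacement
    return setter_hook

source = torch.randn(1, 4)
base = torch.randn(1, 4)

# 1) Collect the activation on the source input.
handle = target.register_forward_hook(getter_hook)
with torch.no_grad():
    source_out = model(source)
handle.remove()

# 2) Overwrite the activation while running the base input.
handle = target.register_forward_hook(make_setter_hook(saved["act"]))
with torch.no_grad():
    patched_out = model(base)
handle.remove()
```

Because the whole output of the first layer is replaced here, the patched run on `base` reproduces the output of the `source` run. Removing each hook handle after use keeps the underlying model unchanged between runs.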

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

model · concept · 0.744
A representation that captures relevant aspects of a system; according to the theorem, the regulator must embody this.

Deterministic Causal Model · concept · 0.727
Formal representation of algorithms as directed acyclic graphs computing functions f_A

Language Model · concept · 0.723
Primary test domain for manifold steering, including reasoning and ICL tasks

Model Evidence · concept · 0.720
Probability of data under the model, penalizing complexity and rewarding accuracy.

model selection · concept · 0.719
Comparing models using log-evidence approximated by free energy.

Toy Models · concept · 0.718

Formal Model · method · 0.716

Reasoning Models · concept · 0.716
Class of large language models designed to produce extended chain-of-thought before answering, studied in this paper
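
The selection rule behind this list (cosine similarity of at least 0.65 and no typed edge to this entity) can be sketched as below. This is only an illustration: how the page actually computes embeddings and scores is not stated, the function name `untyped_neighbors` is hypothetical, and the three-dimensional vectors are toy values, not real embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def untyped_neighbors(target, embeddings, typed_edges, threshold=0.65):
    """Return (name, score) pairs at or above the threshold, excluding the
    target itself and any entity that already has a typed edge to it."""
    hits = []
    for name, vec in embeddings.items():
        if name == target or name in typed_edges:
            continue
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            hits.append((name, score))
    # Highest similarity first, matching the ordering of the list above.
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

# Toy vectors, assumed for illustration only.
embeddings = {
    "Intervenable Model": [1.0, 0.0, 0.0],
    "Intervenable Configuration": [0.9, 0.1, 0.0],  # has a typed edge ("uses")
    "Language Model": [0.8, 0.6, 0.0],
    "Unrelated Entity": [0.0, 1.0, 0.0],
}
typed_edges = {"Intervenable Configuration", "Getter and Setter Hooks"}

candidates = untyped_neighbors("Intervenable Model", embeddings, typed_edges)
```

In this toy data only "Language Model" survives: "Intervenable Configuration" is similar but already linked by a typed edge, and "Unrelated Entity" falls below the threshold.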