Intervention Propagation

Property that additive modifications to activations affect all downstream computations, enabling tractable behavioral control.
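The property can be illustrated with a minimal sketch, assuming a toy residual stream in which each layer adds `tanh(W x)` to a running state. The names `make_layers`, `layer_update`, and `run`, the random weights, and the steering vector are all hypothetical and chosen for illustration only; they do not come from any real model or library.

```python
import math
import random


def make_layers(n_layers, dim, seed=0):
    """Build random weight matrices for a toy model (illustrative only)."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-0.3, 0.3) for _ in range(dim)] for _ in range(dim)]
        for _ in range(n_layers)
    ]


def layer_update(weights, x):
    """Nonlinear update that a layer writes back into the stream."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def run(layers, x, steer_at=None, steer_vec=None):
    """Return the residual stream state after each layer.

    If steer_at is set, steer_vec is added to the stream just before
    that layer reads it (the additive intervention).
    """
    states = []
    for i, weights in enumerate(layers):
        if i == steer_at:
            x = [xi + vi for xi, vi in zip(x, steer_vec)]
        # Residual connection: the layer output is added to the stream.
        x = [xi + ui for xi, ui in zip(x, layer_update(weights, x))]
        states.append(x)
    return states


layers = make_layers(n_layers=4, dim=3)
x0 = [1.0, -0.5, 0.25]
steer = [0.5, 0.0, 0.0]  # hypothetical steering direction

base = run(layers, x0)
steered = run(layers, x0, steer_at=2, steer_vec=steer)

# Which layers see a different stream state after the intervention?
changed = [b != s for b, s in zip(base, steered)]
print(changed)
```

In this sketch the states after layers 0 and 1 are untouched, while every state from layer 2 onward differs from the baseline: a single additive edit is carried forward by the residual connections and read by each later layer.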

Neighborhood — ranked by edge-count

concept

Residual Stream
associated_with
Proposed pathway flowing through layers at each position; calculates K/V values that feed horizontal information flow.

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Causal Intervention on Representations · concept · 0.803
The use of interventions (rather than correlations) to establish a causal link between representation geometry and behavioral geometry.
Belief Propagation · method · 0.796
Inference mechanism underlying active inference; updates posterior beliefs via gradient descent on free energy.
Serializable Intervention · concept · 0.795
pyvene's approach of storing interventions as shareable serialized objects rather than runtime code.
Subspace Intervention · concept · 0.794
Intervention targeting specific dimensional subsets of activation vectors rather than full representations.
Interchange Intervention · method · 0.791
Fundamental operation for causal abstraction analysis; forces neurons to take values from source inputs to create counterfactuals.
Serial Intervention · concept · 0.786
Intervention mode where interventions are applied sequentially, each building on the previous one.
Causal Intervention via Activation Shifting · method · 0.783
Method of shifting hidden state activations along probe directions to cause the model to treat false statements as true and vice versa; evaluated on out-of-distribution (OOD) inputs.
Back-propagation · method · 0.777
Standard learning algorithm for deep neural networks that propagates error signals to adjust weights; lacks convergence guarantee for non-linearly separable functions.