concept
active
concept:intervention-propagation

Intervention Propagation

Property that additive modifications to activations affect all downstream computations, enabling tractable behavioral control
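
A minimal sketch of the property on a toy residual network, not taken from any source linked to this concept. The network, the function name `run_toy_residual_net`, the layer sizes, and the steering vector are all hypothetical choices made for illustration. It adds a fixed vector to the residual stream after one layer and measures how far each layer's state moves from the unmodified run: layers before the intervention are unchanged, and every later layer inherits the shift.

```python
import numpy as np


def run_toy_residual_net(x, weights, steer_layer=None, steer_vec=None):
    """Run a toy residual stack and return the stream state after each layer.

    If steer_layer is given, steer_vec is added to the stream right after
    that layer's update (an additive intervention).
    """
    h = x.copy()
    states = []
    for i, W in enumerate(weights):
        h = h + np.tanh(h @ W)  # residual update: each layer adds to the stream
        if i == steer_layer:
            h = h + steer_vec  # additive modification at one layer
        states.append(h.copy())
    return states


rng = np.random.default_rng(0)
d_model, n_layers = 8, 6  # illustrative sizes only
weights = [rng.normal(scale=0.5, size=(d_model, d_model)) for _ in range(n_layers)]
x = rng.normal(size=d_model)

steer_vec = np.zeros(d_model)
steer_vec[0] = 1.0  # unit-norm shift along one hypothetical direction

clean = run_toy_residual_net(x, weights)
steered = run_toy_residual_net(x, weights, steer_layer=2, steer_vec=steer_vec)

# Distance between steered and clean stream after each layer.
deltas = [float(np.linalg.norm(s - c)) for c, s in zip(clean, steered)]
for i, d in enumerate(deltas):
    print(f"layer {i}: |delta| = {d:.4f}")
```

In this sketch the distance is zero for layers 0 and 1, equals the norm of the steering vector at layer 2, and stays nonzero for layers 3 to 5, because each later update reads the already-shifted stream.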

Neighborhood — ranked by edge-count

Concepts (1)

concept
  • Residual Stream
    associated_with
    Proposed pathway flowing through layers at each position; calculates K/V values that feed horizontal information flow.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • The use of interventions (rather than correlations) to establish a causal link between representation geometry and behavioral geometry.
  • Inference mechanism underlying active inference; updates posterior beliefs via gradient descent on free energy.
  • pyvene's approach of storing interventions as shareable serialized objects rather than runtime code
  • Intervention targeting specific dimensional subsets of activation vectors rather than full representations
  • Fundamental operation for causal abstraction analysis; forces neurons to take values from source inputs to create counterfactuals.
  • Intervention mode where interventions are applied sequentially, each building on the previous one
  • Method of shifting hidden state activations along probe directions to cause the model to treat false statements as true and vice versa; evaluated on OOD inputs
  • Back-propagation · method · 0.777
    Standard learning algorithm for deep neural networks that propagates error signals to adjust weights; lacks convergence guarantee for non-linearly separable functions