linear intervention

Manipulation of activations along a straight line; shown to fail when it crosses voids, in contrast to manifold-following interventions.

Neighborhood — ranked by edge-count

claim

Linear interventions across voids in activation space produce incoherent output, while following the manifold curve produces smooth control.
cites
General principle derived from the Mountain Car experiment: curved manifold-following yields coherent manipulation, linear shortcuts fail.

finding

In the Mountain Car case study, car position is a 1D manifold; linear interventions cross voids causing incoherence; following the 1D curve produces smooth control.
cites
Empirical demonstration that a semantically meaningful variable is encoded as a curved manifold, and that respecting its geometry is critical for effective intervention.

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Parallel Interventionconcept0.784
Intervention mode where multiple interventions are applied simultaneously to the same base computation graph
linear directionconcept0.783
A straight vector in activation space, traditionally used for concept manipulation; claimed to be insufficient when true concept geometry is curved.
linear steeringmethod0.779
Typical approach that adds a scaled steering vector to representations; the paper argues this is mismatched with actual representation geometry.
linearityconcept0.770
The sequential, continuous order of text, often challenged by diagrammatic branching.
Serial Interventionconcept0.765
Intervention mode where interventions are applied sequentially, each building on the previous one
Neural Network Interventionconcept0.763
The fundamental operation of making in-place changes to model activations, placing the model in a counterfactual state
Serializable Interventionconcept0.763
pyvene's approach of storing interventions as shareable serialized objects rather than runtime code
Intervention Propagationconcept0.761
Property that additive modifications to activations affect all downstream computations, enabling tractable behavioral control