Residual-Stream Injection

Core activation intervention: adds a scaled vector to the residual stream at layer l during completion
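
A minimal sketch of the intervention described above, assuming a toy model represented as a list of layer functions over plain Python lists. The names `inject`, `forward_with_injection`, `alpha`, and `layer_index` are hypothetical and used only for illustration; the source does not specify an implementation or where exactly within layer l the addition happens.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def inject(hidden: Vector, direction: Vector, alpha: float) -> Vector:
    """Return hidden + alpha * direction, computed elementwise."""
    if len(hidden) != len(direction):
        raise ValueError("hidden and direction must have the same dimensionality")
    return [h + alpha * d for h, d in zip(hidden, direction)]


def forward_with_injection(
    layers: Sequence[Callable[[Vector], Vector]],
    x: Vector,
    direction: Vector,
    alpha: float,
    layer_index: int,
) -> Vector:
    """Run a toy layer stack, adding alpha * direction to the output of one layer.

    Assumption: the injection is applied to the output of layer `layer_index`;
    all later layers then see the perturbed state.
    """
    h = x
    for i, layer in enumerate(layers):
        h = layer(h)
        if i == layer_index:
            h = inject(h, direction, alpha)
    return h
```

In a real transformer the same addition would typically be applied through a hook on the chosen layer's output; the scale `alpha` controls how strongly the vector perturbs the stream.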

Neighborhood — ranked by edge-count

method

L1LI Injection
implements
Probe-based injection using L1-regularized logistic regressor with learned intercept on h_b activations
L2LI Injection
implements
Probe-based injection using L2-regularized logistic regressor with learned intercept on h_b activations
MDS Injection
implements
Mean-difference vectors derived from self-statement activations (h_s); best-performing injection method in open-ended generation
MDB Injection
implements
Mean-difference vectors derived from Yes/No binary-prefill activations (h_b)
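
The two mean-difference variants above (MDS and MDB) differ in which activations they use (h_s versus h_b), but the vector construction can be sketched the same way. The sketch below assumes the activations are split into two contrasting groups, here called `positive` and `negative`; that split and the function names are assumptions for illustration, not details given in the source.

```python
from typing import List

Vector = List[float]


def mean_vector(rows: List[Vector]) -> Vector:
    """Average a non-empty list of equal-length activation vectors."""
    if not rows:
        raise ValueError("need at least one activation vector")
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def mean_difference(positive: List[Vector], negative: List[Vector]) -> Vector:
    """Return mean(positive) - mean(negative) as a candidate injection vector."""
    mean_pos = mean_vector(positive)
    mean_neg = mean_vector(negative)
    return [p - q for p, q in zip(mean_pos, mean_neg)]
```

The resulting vector would then be scaled and added to the residual stream at the chosen layer.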

concept

Residual Stream
related_to
Proposed pathway flowing through layers at each position; calculates K/V values that feed horizontal information flow.

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Residual Stream Activation · concept · 0.837
The intermediate representations in transformer layers whose activations are patched and probed for truth information
Residual Stream Patching · method · 0.821
Technique to localize causally implicated hidden states by swapping residual stream activations between a true and false input and measuring downstream log-probability changes
Residual Stream Bandwidth · concept · 0.810
The finite dimensional capacity of the residual stream for storing and communicating information between layers; conceptualized as being under high demand
Residual Stream Activation Patching · method · 0.803
Used to localize causally implicated hidden states by swapping activations between true and false inputs
Superposition in Residual Stream · concept · 0.794
The phenomenon where the residual stream communicates many more features than its dimensionality by encoding information across overlapping subspaces
layer 40 residual-stream activations · concept · 0.778
The specific neural network layer from which activations are extracted for probe construction and SAE training in the target models
residual stream recovery dynamics · concept · 0.777
The network's tendency to actively attenuate injected perturbations over subsequent layers, erasing the signal before output
residual stream recovery tracking · method · 0.773
Tracks cosine similarity, norm ratio, and injection direction projection across layers to measure recovery from perturbation
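
A rough sketch of how the three recovery-tracking quantities might be computed per layer. The exact definitions are not given in the source, so the following are assumptions: cosine similarity and norm ratio compare the perturbed activation with the clean one at the same layer, and the projection measures how much of the perturbed-minus-clean difference still lies along the unit injection direction. The function name `recovery_metrics` is hypothetical.

```python
import math
from typing import Dict, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def norm(a: Vector) -> float:
    return math.sqrt(dot(a, a))


def recovery_metrics(
    clean: List[Vector], perturbed: List[Vector], direction: Vector
) -> List[Dict[str, float]]:
    """Compare clean and perturbed activations layer by layer.

    Assumes non-zero activations and a non-zero injection direction.
    """
    scale = norm(direction)
    unit = [d / scale for d in direction]
    metrics = []
    for layer, (c, p) in enumerate(zip(clean, perturbed)):
        delta = [pi - ci for pi, ci in zip(p, c)]
        metrics.append({
            "layer": layer,
            "cosine": dot(c, p) / (norm(c) * norm(p)),  # 1.0 means same direction
            "norm_ratio": norm(p) / norm(c),            # 1.0 means same magnitude
            "projection": dot(delta, unit),             # 0.0 means signal erased
        })
    return metrics
```

Under these definitions, a network that attenuates the perturbation would show cosine and norm ratio drifting back toward 1.0 and the projection shrinking toward 0.0 in later layers.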