Injection Stride

Parameter controlling how often an injection is applied during completion. With s=1 the injection is applied on every activation, which gives the strongest steering.
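
To make the definition concrete, here is a minimal sketch of how a stride could select which completion positions receive an injection. The function name `injection_positions`, the 0-based indexing, and the choice to always inject at the first completion position are assumptions made for illustration; the source only states that s=1 injects on every activation.

```python
def injection_positions(num_completion_tokens: int, stride: int) -> list[int]:
    """Return the completion positions that receive an injection.

    Assumption (not from the source): positions are 0-indexed and the
    first completion position is always injected, then every stride-th
    position after it.
    """
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    return list(range(0, num_completion_tokens, stride))


# With s=1 every completion position is injected.
print(injection_positions(6, 1))  # [0, 1, 2, 3, 4, 5]

# With s=3 only every third position is injected.
print(injection_positions(6, 3))  # [0, 3]
```

Under this reading, a larger stride means fewer injected positions over the same completion, which matches the description of s=1 as the strongest setting.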

Neighborhood — ranked by edge-count

paper

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Injection stride s=1 produces the highest mean SJT scores across all LLMs; more frequent injection yields stronger steering · finding · 0.741
Empirical finding about the injection stride parameter; injecting into every completion activation maximizes steering strength.
Concept Injection · concept · 0.711
Technique of injecting activation patterns associated with specific concepts into a model's internal states to test whether self-reports reflect ground truth.
MDS Injection · method · 0.705
Mean-difference vectors derived from self-statement activations (h_s); best-performing injection method in open-ended generation
Injected thoughts task · method · 0.698
Experimental paradigm where the model is told about the possibility of thought injection and asked to report detection and identification.
L2ZI Injection · method · 0.692
Probe-based injection using L2-regularized logistic regressor with zero intercept on h_b activations
L1ZI Injection · method · 0.691
Probe-based injection using L1-regularized logistic regressor with zero intercept on h_b activations
Residual-Stream Injection · concept · 0.690
Core activation intervention: adds a scaled vector to the residual stream at layer l during completion.
L2LI Injection · method · 0.689
Probe-based injection using L2-regularized logistic regressor with learned intercept on h_b activations
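
The MDS Injection and Residual-Stream Injection entries above can be illustrated together with a small sketch: build a mean-difference vector from two groups of activations, then add a scaled copy of it to the hidden states at one layer, honoring the stride. Everything here is hypothetical scaffolding: plain Python lists stand in for activations, and the names `mean_difference`, `inject`, and `alpha`, as well as the choice of which two groups are contrasted, are assumptions rather than details confirmed by the source.

```python
def mean_difference(pos: list[list[float]], neg: list[list[float]]) -> list[float]:
    """Per-dimension mean of `pos` minus per-dimension mean of `neg`.

    Assumption: `pos` and `neg` are two groups of activation vectors of
    equal dimension; the source does not specify how the groups are formed.
    """
    dim = len(pos[0])
    mean_pos = [sum(v[i] for v in pos) / len(pos) for i in range(dim)]
    mean_neg = [sum(v[i] for v in neg) / len(neg) for i in range(dim)]
    return [p - n for p, n in zip(mean_pos, mean_neg)]


def inject(hidden: list[list[float]], vector: list[float],
           alpha: float, stride: int = 1) -> list[list[float]]:
    """Add alpha * vector to every stride-th hidden state.

    `hidden` holds one vector per completion position at a single layer.
    Positions that are skipped by the stride are copied unchanged.
    """
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    return [
        [h + alpha * v for h, v in zip(state, vector)]
        if t % stride == 0 else list(state)
        for t, state in enumerate(hidden)
    ]


# Toy activations (made up for illustration only).
pos = [[1.0, 0.0], [3.0, 2.0]]
neg = [[0.0, 0.0], [2.0, 0.0]]
steer = mean_difference(pos, neg)
print(steer)  # [1.0, 1.0]

hidden = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
print(inject(hidden, steer, alpha=2.0, stride=2))
# [[2.0, 2.0], [0.0, 0.0], [2.0, 2.0]]
```

In a real model the same addition would typically be done inside a forward hook on the chosen layer; the probe-based variants listed above would differ only in how the steering vector is obtained.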