finding
active
finding:injection-stride-s-1-produces-the-highest-mean-sjt-scores-across-all-llms-more-frequent-injection-yields-stronger-steering
Injection stride s=1 produces the highest mean SJT scores across all LLMs; more frequent injection yields stronger steering
Empirical finding about the injection stride parameter: injecting into every completion activation (s=1) maximizes steering strength.
Source paper
extracted_from(2026) · Leonardo Blas · Robin Jia · Emilio Ferrara
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- MDS achieves global win proportion of 89.5% on SJTs across 14 LLMs and four injection strides · finding · 0.789 · MDS dominates in open-ended generation by global win proportion metric (Table 2)
- MDS injection steering efficiency peaks at mid-layers across LLMs, injection strides, and OCEAN traits · finding · 0.781 · Consistent empirical pattern supporting the connection between mid-layer representations and emotion/behavioral content
- Per-model steerability comparison from Table 4
- Random vectors at injection strength 8 elicit introspective awareness in 9 out of 100 trials · finding · 0.754 · Random vectors are less effective, and even then produce introspection at lower rates.
- Demonstrates alignment with Linear Representation Hypothesis: target trait steers approximately linearly with alpha
- Identified exception to overall MDS effectiveness; reason remains unexplained as a limitation
- Parameter controlling how often an injection is applied during completion; s=1 injects on every activation, achieving strongest steering
- Theoretical alignment claim backed by OLS R² analysis showing 96.15% of trends have R² ≥ 0.75
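
The injection stride described above can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `inject_with_stride`, the scaling factor `alpha`, and the convention that stride s means "inject at every s-th completion position, starting from the first" are all assumptions made for illustration. The only detail taken from the entries above is that s=1 injects on every activation.

```python
import numpy as np


def inject_with_stride(activations, steering_vector, stride=1, alpha=1.0):
    """Add alpha * steering_vector to every `stride`-th row of `activations`.

    Hypothetical helper: `activations` has shape (num_positions, hidden_dim),
    one row per completion position. The stride convention is an assumption.
    """
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    # np.array copies its input, so the caller's array is left untouched.
    steered = np.array(activations, dtype=float)
    # Basic slicing with a step selects positions 0, stride, 2*stride, ...
    steered[::stride] += alpha * np.asarray(steering_vector, dtype=float)
    return steered


if __name__ == "__main__":
    acts = np.zeros((6, 4))  # toy activations: 6 positions, hidden size 4
    vec = np.ones(4)         # toy steering vector
    for s in (1, 2, 3):
        out = inject_with_stride(acts, vec, stride=s)
        injected = int((out.sum(axis=1) != 0).sum())
        print(f"stride={s}: injected at {injected} of {len(acts)} positions")
```

Under this convention a smaller stride modifies more positions, which is consistent with the finding that more frequent injection yields stronger steering; the sketch says nothing about which layer is targeted or how the steering vector is obtained.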