question
active
question:why-do-mds-injections-outperform-other-methods-on-the-inventory-multiple-choice-task
Why do MDS injections outperform other methods on the inventory (multiple-choice) task?
Identified as an unexplained result and an open question in the paper's limitations section
Source paper
extracted_from · (2026) · Leonardo Blas · Robin Jia · Emilio Ferrara
Neighborhood — ranked by edge-count
Papers (1)
paper
- Psychological Steering of Large Language Models · associated_with
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- MDS injections show no salient patterns in MPI-120 inventory responses beyond occasional co-occurring peaks · finding · 0.801 · Contrasts with SJT results; leads authors to narrow analyses to SJT responses
- Unexplained exception identified as a limitation and open question
- MDS injections outperform P2 in open-ended generation in 11 of 14 LLMs with Phi gains of 3.61% to 16.44% · finding · 0.752 · Primary quantitative result overturning prior reports that prompting outperforms representation engineering
- Qualitative finding demonstrating unique capability of activation-level interventions unavailable to prompting methods including PM
- Do the findings about MDS injection effectiveness generalize to base (non-instruction-tuned) language models? · question · 0.726 · Acknowledged limitation: only instruction-tuned models were studied
- Theoretical alignment claim backed by OLS R² analysis showing 96.15% of trends have R² ≥ 0.75
- Mean-difference vectors derived from self-statement activations (h_s); best-performing injection method in open-ended generation
- Identified exception to overall MDS effectiveness; reason remains unexplained as a limitation