finding
active
finding:mds-achieves-global-win-proportion-of-47-3-on-mpi-120-inventory-across-14-llms
MDS achieves global win proportion of 47.3% on MPI-120 inventory across 14 LLMs
MDS is also the top method on the inventory task, but by a much smaller margin than on SJTs (Table 2).
Source paper
extracted_from · (2026) · Leonardo Blas · Robin Jia · Emilio Ferrara
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- MDS achieves global win proportion of 89.5% on SJTs across 14 LLMs and four injection strides · finding · 0.883 · MDS dominates in open-ended generation by the global win proportion metric (Table 2)
- MDS injections outperform P2 in open-ended generation in 11 of 14 LLMs with Phi gains of 3.61% to 16.44% · finding · 0.789 · Primary quantitative result overturning prior reports that prompting outperforms representation engineering
- Key finding showing that combining prompting and injection is the strongest approach
- MDS injections show no salient patterns in MPI-120 inventory responses beyond occasional co-occurring peaks · finding · 0.748 · Contrasts with SJT results; leads authors to narrow analyses to SJT responses
- On Qwen3-1.7B, MDS achieves ϕ_{1,C,↑} = 5.0 (SJTs) vs P2 at 4.7, and ϕ_{1,C,↓} = 1.4 (SJTs) vs P2 at 3.6 · finding · 0.713 · Specific conscientiousness sweep result for Qwen3-1.7B from Table 6 demonstrating strong bidirectional steering
- Poor performance against code agents.
- MDS injection steering efficiency peaks at mid-layers across LLMs, injection strides, and OCEAN traits · finding · 0.711 · Consistent empirical pattern supporting the connection between mid-layer representations and emotion/behavioral content
- DAS achieves overall odds-ratio of 10.24 on pythia-410m averaged across all CausalGym tasks · finding · 0.709 · Numerical result for pythia-410m
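
The similarity list above is described only as "cosine ≥ 0.65 · no typed edge". A minimal sketch of how such a filter could work is shown here, assuming each entity has a non-zero embedding vector. The function names, the dictionary layout, and the `typed_edges` set are hypothetical and not taken from the source system.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def related_without_edge(target, candidates, typed_edges, threshold=0.65):
    """Return (name, score) pairs at or above the threshold, best first.

    candidates:  dict mapping entity name -> embedding vector (assumed layout)
    typed_edges: set of names that already have a typed relation to the target
    """
    hits = []
    for name, vec in candidates.items():
        if name in typed_edges:
            continue  # already linked by a typed edge, so not listed here
        score = cosine(target, vec)
        if score >= threshold:
            hits.append((name, round(score, 3)))
    return sorted(hits, key=lambda h: h[1], reverse=True)
```

For example, with a target of `[1.0, 0.0]`, a candidate `[1.0, 1.0]` scores about 0.707 and is listed, while an orthogonal candidate `[0.0, 1.0]` scores 0 and is dropped.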