finding

active

finding:on-qwen3-1-7b-mds-achieves-1-c-5-0-sjts-vs-p2-at-4-7-and-1-c-1-4-sjts-vs-p2-at-3-6

On Qwen3-1.7B, MDS achieves ϕ1,C,↑ = 5.0 (SJTs) vs P2 at 4.7, and ϕ1,C,↓ = 1.4 (SJTs) vs P2 at 3.6

Conscientiousness sweep result for Qwen3-1.7B from Table 6, demonstrating strong bidirectional steering

Source paper

extracted_from

Psychological Steering of Large Language Models

(2026) · Leonardo Blas · Robin Jia · Emilio Ferrara

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

PM achieves overall SJT steerability Phi=9.6 on gemma-3-12b-it vs MDS=8.7 and P2=8.3 · finding · 0.793
Per-model steerability comparison from Table 4
MDS achieves global win proportion of 89.5% on SJTs across 14 LLMs and four injection strides · finding · 0.765
MDS dominates in open-ended generation by global win proportion metric (Table 2)
On SWE-bench, harness-benefit peaks at Qwen3-235B (19.3 pp), while weaker Qwen3-32B gains only 4.4 pp and stronger Opus 4.6 gains only 2.6 pp · finding · 0.764
Core finding demonstrating non-monotonic relationship between base capability and harness-benefit
Qwen-2.5-3B ASR drops from 98.6% at dim 1 to 45.1% at dim 2, recovering partially then declining to 65.3% at dim 5 · finding · 0.764
Smaller models show non-monotonic and diminished ASR with increasing cone dimensionality
DB-MTL achieves Δp = +1.15±0.16 on NYUv2, outperforming all baselines including state-of-the-art · finding · 0.761
Primary empirical validation on scene understanding task
Qwen3-32B achieves a skill-load rate of 0.251, while Opus 4.6, Sonnet 4.6, and Qwen3-235B achieve SLR of 0.957–0.961 · finding · 0.760
Quantifies harness activation failure for weak-tier models vs. strong-tier models
Opus 4.6 achieves HFR of 0.757 while Qwen3-32B achieves HFR of only 0.142 on SkillsBench · finding · 0.758
Quantifies harness adherence failure gap between strong and weak tier models
DB-MTL with SegNet backbone achieves Δp = +8.91 on NYUv2, best among all methods · finding · 0.757
Performance with a different backbone network.