finding

active

finding:qwq-32b-accuracy-on-mmlu-formal-logic-stays-between-95-5-and-96-3-across-all-intervention-strengths-while-tokens-reduced-from-1716-6-to-1481-4-at-0-96

QwQ-32B accuracy on MMLU Formal Logic stays between 95.5% and 96.3% across all intervention strengths, while tokens fall from 1716.6 to 1481.4 at intervention strength -0.96

Demonstrates reflection redundancy in larger models on non-mathematical reasoning
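
To make the trade-off concrete, the figures in this finding can be turned into a relative token saving and an accuracy spread. The sketch below is only illustrative arithmetic: the helper name `relative_reduction` is hypothetical, and it assumes that 1716.6 is the token count with no intervention, which the finding implies but does not state outright.

```python
def relative_reduction(baseline: float, reduced: float) -> float:
    """Return the fractional drop from `baseline` to `reduced`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - reduced) / baseline


# Token counts reported in the finding for QwQ-32B on MMLU Formal Logic.
baseline_tokens = 1716.6  # assumption: count with no intervention applied
steered_tokens = 1481.4   # count at intervention strength -0.96

token_saving = relative_reduction(baseline_tokens, steered_tokens)

# Accuracy range reported across all intervention strengths, in percent.
accuracy_spread = 96.3 - 95.5  # percentage points

print(f"token saving: {token_saving:.1%}")
print(f"accuracy spread: {accuracy_spread:.1f} pp")
```

Under that assumption the saving comes to roughly 13.7% of tokens, against an accuracy band less than one percentage point wide.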

Source paper

extracted_from

ReflCtrl: Controlling LLM Reflection via Representation Engineering

(2025) · Ge Yan · Chung-En Sun · Tsui-Wei Weng

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

QwQ-32B on MATH-500: 21.0% reasoning token reduction at intervention strength -0.96 with only 0.34% accuracy loss · finding · 0.837
Demonstrates reflection redundancy in stronger model on harder math benchmark
QwQ-32B accuracy on GSM8k remains between 96.36% and 96.50% across all intervention strengths (-0.96 to +0.48) · finding · 0.831
Demonstrates that stronger models are largely insensitive to reflection manipulation
DeepSeek-R1 Llama 8b accuracy on MMLU Professional Accounting drops from 56.5% at baseline to 50.1% at intervention -0.96 · finding · 0.810
Shows smaller models are more sensitive to reflection reduction on non-math tasks
Up to 33.6% reasoning tokens saved on MMLU subsets with stepwise steering while maintaining accuracy in larger models · finding · 0.788
Maximum token savings achieved by ReflCtrl on non-mathematical general reasoning tasks
Suppression of deception features produces higher TruthfulQA accuracy (M=0.44) than amplification (M=0.20), t(816)=6.76, p=1.5×10⁻¹⁰ across 29 categories · finding · 0.765
Out-of-domain generalization showing deception features track general representational honesty
MM probe trained on likely dataset achieves NIE of 0.70 (false→true) on LLaMA-2-13B, surprisingly strong but weaker than truth probes · finding · 0.763
Likely-trained MM probe is a surprisingly effective causal baseline due to correlation between truth and probability on sp_en_trans
DeepSeek-R1 Llama 8b gains 0.16% accuracy on GSM8k with positive intervention (more reflections) at cost of ~2000 additional tokens · finding · 0.758
Only model showing marginal benefit from increased reflection, at substantial token cost
LAT achieves 89% accuracy in detecting strategic deception in QwQ-32B activations · finding · 0.754
Core detection result showing LAT-based steering vectors can identify deceptive states with high accuracy
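
The related list above is selected by two conditions: cosine similarity of at least 0.65 and no typed edge to this entity. A minimal sketch of such a filter follows. The function names, the dictionary-of-vectors layout, and the toy embeddings are all assumptions made for illustration, not the system's actual implementation or data.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def untyped_neighbors(target, embeddings, typed_edges, threshold=0.65):
    """Entities similar to `target` that share no typed edge with it."""
    results = []
    for name, vec in embeddings.items():
        if name == target:
            continue
        # Skip entities already linked by a typed edge, in either direction.
        if (target, name) in typed_edges or (name, target) in typed_edges:
            continue
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            results.append((name, score))
    # Highest similarity first, matching the ordering of the list above.
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Toy, made-up embeddings: "d" is similar but already has a typed edge.
embeddings = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [1.0, 0.2]}
typed_edges = {("a", "d")}

candidates = untyped_neighbors("a", embeddings, typed_edges)
print([name for name, _ in candidates])
```

In the toy data only "b" survives: "c" falls below the threshold and "d" is excluded because it already has a typed edge.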