finding
active
finding:deepseek-r1-llama-8b-accuracy-on-mmlu-professional-accounting-drops-from-56-5-at-baseline-to-50-1-at-intervention-0-96

DeepSeek-R1 Llama 8B accuracy on MMLU Professional Accounting drops from 56.5% at baseline to 50.1% at intervention -0.96

This 6.4-point drop shows that smaller models are more sensitive to reflection reduction on non-math tasks.
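As a quick check of the magnitude, the two reported accuracies can be turned into an absolute and a relative drop. This is a minimal sketch that uses only the two figures stated in this finding. The variable names are illustrative and do not come from the source paper.

```python
# Accuracies (in percent) as stated in this finding.
baseline_acc = 56.5      # baseline, no intervention
intervened_acc = 50.1    # at intervention -0.96

# Absolute drop in percentage points.
absolute_drop = baseline_acc - intervened_acc

# Relative drop, expressed as a percentage of the baseline accuracy.
relative_drop = absolute_drop / baseline_acc * 100

print(f"absolute drop: {absolute_drop:.1f} points")
print(f"relative drop: {relative_drop:.1f}% of baseline")
```

Under these numbers the model loses 6.4 percentage points, which is roughly 11.3% of its baseline accuracy.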

Source paper

extracted_from
ReflCtrl: Controlling LLM Reflection via Representation Engineering
(2025) · Ge Yan · Chung-En Sun · Tsui-Wei Weng
