finding
active
QwQ-32B on MATH-500: 21.0% reasoning token reduction at intervention strength -0.96 with only 0.34% accuracy loss
Demonstrates reflection redundancy in stronger model on harder math benchmark
Source paper
extracted_from: (2025) · Ge Yan · Chung-En Sun · Tsui-Wei Weng
Neighborhood — ranked by edge-count
Claims (1)
claim
- Reflections are redundant in many cases, especially in stronger models (associated_with, supports): key interpretive finding that stronger models can have reflections reduced with minimal accuracy cost
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Demonstrates reflection redundancy in larger models on non-mathematical reasoning
- Demonstrates that stronger models are largely insensitive to reflection manipulation
- Proposed explanation for why single-turn reformulation improves performance: models' training distribution is concentrated on single-turn reasoning.
- Out-of-domain generalization showing deception features track general representational honesty
- Critical finding showing steering vectors can produce unfaithful CoT where harmful choices are obscured in reasoning
- Experiment 4 result ruling out semantic priming as explanation for the experimental effect
- Distinguishes strategic threat-based deception from instructed deception in representational structure
- Table 2, row 3, showing equivalence when prior preferences match rewards.
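The similarity rule above (cosine ≥ 0.65, no typed edge) can be sketched as follows. This is a minimal illustration, not the implementation behind this page: the names `cosine`, `untyped_neighbors`, `embeddings`, and `typed_edges` are hypothetical, and the embedding vectors are assumed to be plain lists of floats keyed by entity id.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def untyped_neighbors(target_id, embeddings, typed_edges, threshold=0.65):
    """Return (entity_id, score) pairs similar to target_id but not linked to it.

    embeddings:  dict mapping entity id -> vector (hypothetical layout)
    typed_edges: iterable of (source_id, target_id) pairs that already carry
                 a typed relation (hypothetical layout)
    """
    target = embeddings[target_id]
    # Collect every entity that already shares a typed edge with the target.
    linked = set()
    for a, b in typed_edges:
        if a == target_id:
            linked.add(b)
        elif b == target_id:
            linked.add(a)

    scored = []
    for other_id, vec in embeddings.items():
        if other_id == target_id or other_id in linked:
            continue
        score = cosine(target, vec)
        if score >= threshold:
            scored.append((other_id, score))
    # Highest similarity first, matching a ranked "related" list.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Entities returned by such a filter would be the ones to review by hand, either to add a typed edge or to merge as duplicates.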