finding
active
finding:triggered-reflection-with-alternatively-achieves-accuracy-684-on-gsm8k-adv-for-gemma3-4b-it
Triggered Reflection with 'Alternatively' achieves accuracy 0.684 on gsm8k_adv for Gemma3-4B-IT
Highest single-instruction accuracy result in the paper.
Source paper
extracted_from · (2025) · Chang, Fu-Chieh · Lee, Yu-Ting · Wu, Pei-Yuan
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Baseline accuracy when reflection is suppressed.
- Empirical interpretation of which reference baseline yields more useful steering vectors.
- Supports claim that uncertainty is encoded in reflection direction
- Core empirical result validating the three-level reflection framework on code reasoning.
- High cosine similarity for Gemma3 steering vectors suggests strong linear reflection structure.
- SOO fine-tuning did not collapse Gemma-2-27B self-other distinction needed for perspective-taking
- Demonstrates that stronger models are largely insensitive to reflection manipulation
- Easy questions (acc > 80%) have average reflection rate of 25.8% for DeepSeek-R1 Llama 8b on GSM8k · finding · 0.764 · Baseline reflection rate for easy questions confirming difficulty-reflection correlation