method
active
method:clamping-cot-probabilities-to-40-60
Clamping CoT probabilities to 40-60%
A technique for avoiding overconfident preference labels when using chain-of-thought (CoT) prompting: the label probabilities are clamped to the 40-60% range.
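As a minimal sketch of the clamping step, the snippet below restricts a probability to the 40-60% range before it is used as a soft preference label. The function names (`clamp_probability`, `clamp_preference_pair`) and the two-way (A vs. B) label layout are assumptions made for illustration, not details given by the source.

```python
def clamp_probability(p: float, low: float = 0.4, high: float = 0.6) -> float:
    """Clamp a probability into [low, high] (hypothetical helper).

    The defaults mirror the 40-60% range described above.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"expected a probability in [0, 1], got {p}")
    if low > high:
        raise ValueError("low must not exceed high")
    return min(max(p, low), high)


def clamp_preference_pair(p_a: float, low: float = 0.4, high: float = 0.6):
    """Return clamped soft labels (P(A preferred), P(B preferred)).

    Assumes a binary comparison and a range symmetric around 0.5
    (low + high == 1), so the complement also stays within [low, high].
    """
    clamped_a = clamp_probability(p_a, low, high)
    return clamped_a, 1.0 - clamped_a


if __name__ == "__main__":
    # A very confident CoT-derived probability gets pulled back to the cap.
    print(clamp_preference_pair(0.97))
    # A probability already inside the range is left unchanged.
    print(clamp_preference_pair(0.55))
```

Passing `low=0.2, high=0.8` gives the wider 20-80% variant for comparison.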
Neighborhood — ranked by edge-count
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Section 4.3 reports that clamping at 40-60% led to better behavior than clamping at 20-80%.
- State-of-the-art result on ScienceQA; represents +3.91% improvement over prior best published result of 86.54%.
- Empirical evidence that naive one-stage CoT fails in language-only setting; two-stage + vision achieves state-of-the-art.
- Interpretation of scope generalization results
- Validates robustness of universal lift finding
- CoT increases dr for OOD operands.
- Calibration finding for choosing the activation cap threshold
- Section 4.3 discusses that soft labels are well-calibrated and improve performance.