finding
active
finding:clamping-cot-probabilities-to-40-60-range-for-rl-cai-with-cot-improves-robustness-and-reduces-extreme-responses

Clamping the CoT probabilities to the 40-60% range in RL-CAI with CoT improves robustness and reduces extreme responses.

Section 4.3 reports that clamping at 40-60 led to better behavior than clamping at 20-80.
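
To make the clamping operation concrete, here is a minimal sketch in Python. It assumes the CoT-derived preference label is a single probability for response A in a binary comparison. The source does not say how the clamp is implemented, so the function names and the defaults below are hypothetical and only illustrate the idea.

```python
def clamp_probability(p, low=0.40, high=0.60):
    """Clip a probability into [low, high] (hypothetical helper)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"expected a probability in [0, 1], got {p}")
    return min(max(p, low), high)


def clamp_preference_pair(p_a, low=0.40, high=0.60):
    """Clamp a binary preference target and return (p_a, p_b).

    Assumption: the two targets sum to 1. With symmetric bounds
    (low + high == 1), the complement also stays inside [low, high].
    """
    p_a_clamped = clamp_probability(p_a, low, high)
    return p_a_clamped, 1.0 - p_a_clamped


# An overconfident CoT label is pulled back to the upper bound,
# while a label already inside the range is left unchanged.
confident = clamp_preference_pair(0.98)
moderate = clamp_preference_pair(0.55)
```

Passing `low=0.20, high=0.80` to the same functions gives the wider range that the finding compares against.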
