claim
active
claim:reasoning-models-generate-performative-cot-tokens-after-achieving-strong-confidence-in-their-final-answer-without-revealing-this-belief-in-text

Reasoning models generate performative CoT tokens after achieving strong confidence in their final answer without revealing this belief in text

The paper's central empirical claim, supported by activation-probing evidence.

Source paper

extracted_from
Reasoning Theater: Disentangling Model Beliefs from Chain-of-Thought
(2026) · Siddharth Boppana · Annabel Ma · Max Loeffler · Raphaël Sarfati +4

Neighborhood — ranked by edge-count

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood that have no typed relation to this one. They are candidates for new edges, or they may be unrecognized duplicates.