method
active
method:cot-monitor
CoT Monitor
Named method that monitors chain-of-thought (CoT) text to detect the point at which the model signals its answer; the paper compares it against activation probes.
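As a rough illustration of text-level monitoring, the sketch below scans CoT steps in order and reports the first step that contains an answer-signaling phrase. The cue phrases, the `ANSWER_CUES` pattern, and the `first_answer_signal` helper are assumptions made for this example, not the paper's monitor, which may use a different detector entirely.

```python
import re
from typing import Optional, Sequence

# Hypothetical answer-signaling cues, chosen for illustration only.
ANSWER_CUES = re.compile(
    r"\b(?:the answer is|so the answer|final answer)\b",
    re.IGNORECASE,
)


def first_answer_signal(steps: Sequence[str]) -> Optional[int]:
    """Return the index of the first CoT step containing an answer cue, else None."""
    for i, step in enumerate(steps):
        if ANSWER_CUES.search(step):
            return i
    return None


# Toy chain of thought, split into steps (made-up example text).
steps = [
    "Let me compute 12 * 7 first.",
    "12 * 7 = 84, then subtract 4.",
    "So the answer is 80.",
]
signal_step = first_answer_signal(steps)  # index of the step that signals the answer
```

A step index like this is what one could compare against the step at which an activation probe first decodes the answer, under the assumption that both are measured on the same step segmentation.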
Neighborhood — ranked by edge-count
Papers (1)
paper
Frameworks (1)
framework
- The conceptual framework introduced by the paper distinguishing performative CoT from genuine reasoning using activation probing
Findings (1)
finding
- Comparative finding establishing activation probing as superior to text-level monitoring for early belief detection
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Synchronization construct encapsulating shared data and protected access routines.
- A prompting technique that elicits intermediate reasoning steps before final answer inference in language models.
- A two-stage framework that separates rationale generation and answer inference by incorporating vision and language modalities.
- Fine-tuning with chain-of-thought rationales aiming to reduce dr via procedural alignment.
- The ACC's detection of response conflict, shown to be constitutively emotional rather than purely cognitive
- Authors' assertion of novelty and priority; appears in contributions and Table 1.
- Moments of behavioral change in CoT (e.g., backtracking, 'aha' moments) that the paper finds correlate with genuine belief shifts
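A listing like the one above could be produced by filtering on embedding similarity. The following is a minimal sketch under assumed inputs: the `untyped_neighbors` helper, the toy vectors, and the entity ids are hypothetical, and only the 0.65 threshold comes from this page.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def untyped_neighbors(
    target: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    typed_ids: Set[str],
    threshold: float = 0.65,
) -> List[Tuple[str, float]]:
    """Entities at or above the threshold that have no typed edge to the target."""
    scored = [
        (cid, cosine(target, vec))
        for cid, vec in candidates.items()
        if cid not in typed_ids  # skip entities already linked by a typed edge
    ]
    kept = [(cid, score) for cid, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy 2-d embeddings (hypothetical ids and values).
candidates = {"a": [1.0, 0.0], "b": [1.0, 1.0], "c": [0.0, 1.0], "d": [2.0, 0.1]}
neighbors = untyped_neighbors([1.0, 0.0], candidates, typed_ids={"d"})
```

Because the filter relies on similarity alone, it can surface entities that merely share vocabulary with this one, which is why the results are treated as candidates rather than confirmed relations.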