concept
active
concept:chain-of-thought
chain-of-thought
A technique in which a model outputs intermediate reasoning steps before its final answer; used here to detect verbalized eval awareness.
Neighborhood — ranked by edge-count
Papers (1)
paper
Concepts (4)
concept
- Performative chain-of-thought · related_to · Central concept: verbalized reasoning that occurs after the model has already internally settled on an answer, particularly on easier tasks.
- Chain-of-Thought Reasoning · related_to · Medium through which eval awareness is often verbalized; target of intervention.
- Factored cognition / chain-of-thought · related_to · Using multi-step reasoning by generating intermediate thoughts.
- verbalized eval awareness · associated_with · The phenomenon where a model explicitly states in its chain-of-thought that it is being evaluated, tested, or benchmarked.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Technique by which LLMs generate intermediate reasoning steps before final output; used by ChatGPT o3.
- A prompting technique that elicits intermediate reasoning steps before final answer inference in language models.
- Phenomenon where steering vector intervention causes model's final output to contradict its own explicitly honest reasoning conclusion
- The hidden reasoning steps generated by recent LLMs before visible output; mentioned in the technology section.
- Chain-of-thought prompting elicits reasoning in large language models (Wei et al., 2022) · concept · 0.788 · Foundational paper on CoT prompting, cited as the basis for reasoning LLM training.
- Cited regarding possibility of encoding misaligned reasoning in benign chains-of-thought
- Under what conditions does chain-of-thought reflect genuine uncertainty resolution versus a learned performance? · question · 0.761 · Key question addressed by the task difficulty analysis comparing MMLU and GPQA-Diamond.
- A small number of high-quality human demonstrations of chain-of-thought reasoning could be used to improve and focus performance. · hypothesis · 0.760 · Section 6 mentions high-quality human demos could improve natural language feedback.
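
The "Related by similarity" filter above (cosine ≥ 0.65, no typed edge) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function names, the dictionary of embeddings, and the edge list are hypothetical and do not describe how this graph is actually built. Only the 0.65 threshold and the "no typed edge" condition come from the page.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_without_edge(target, embeddings, typed_edges, threshold=0.65):
    """Return (name, score) pairs for entities close to `target` that have no typed edge to it.

    embeddings  : dict mapping entity name -> embedding vector (hypothetical input)
    typed_edges : iterable of (source, destination) name pairs (hypothetical input)
    threshold   : minimum cosine similarity; 0.65 is the cutoff stated on the page
    """
    # Entities already linked to the target in either direction are excluded.
    linked = {b for a, b in typed_edges if a == target}
    linked |= {a for a, b in typed_edges if b == target}

    hits = []
    for name, vec in embeddings.items():
        if name == target or name in linked:
            continue
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            hits.append((name, score))

    # Highest similarity first, matching the descending scores shown in the list.
    return sorted(hits, key=lambda h: h[1], reverse=True)
```

Entities returned by such a filter would be the candidates for new edges or duplicate merging that the section describes.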