hypothesis
active
A small number of high-quality human demonstrations of chain-of-thought reasoning could be used to improve and focus performance.
Section 6 mentions that high-quality human demonstrations could improve natural language feedback.
Source paper
extracted_from · (2022) · Bai, Yuntao · Saurav Kadavath · Sandipan Kundu · Amanda Askell +47
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Figure 4 shows CoT improves over zero-shot, and ensembled CoT further boosts accuracy.
- Medium through which eval awareness is often verbalized; target of intervention.
- Under what conditions does chain-of-thought reflect genuine uncertainty resolution versus a learned performance? · question · 0.810 · Key question addressed by the task difficulty analysis comparing MMLU and GPQA-Diamond
- Chain-of-thought prompting elicits reasoning in large language models (Wei et al., 2022) · concept · 0.808 · Foundational paper on CoT prompting cited as basis for reasoning LLM training
- CoT improves accuracy on HHH evals and makes the decision process legible.
- Central research question motivating the paper
- Technique by which LLMs generate intermediate reasoning steps before final output; used by ChatGPT o3.
- Key mechanistic claim supported by scratchpad modification experiments and conditioning analysis