finding

active

finding:chain-of-thought-reasoning-improves-large-model-accuracy-on-hhh-binary-comparisons-reaching-78-for-52b-model-competitive-with-human-feedback-pm

Chain-of-thought reasoning improves large-model accuracy on HHH binary comparisons, reaching ~78% for the 52B model, which is competitive with a human-feedback preference model (PM).

Figure 4 shows that chain-of-thought (CoT) prompting improves over zero-shot evaluation, and that ensembling CoT samples boosts accuracy further.
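One way to read "ensembled CoT" on a binary comparison is to average the model's preference probability over several independently sampled reasoning chains before thresholding. The sketch below illustrates that idea only; the function names, the 0.5 threshold, and the toy probabilities are assumptions for illustration and are not taken from the source paper.

```python
from statistics import mean

def ensemble_preference(sample_probs):
    """Average P(response A is better) across several CoT samples.

    sample_probs is a hypothetical list of per-sample probabilities,
    one per sampled reasoning chain.
    """
    if not sample_probs:
        raise ValueError("need at least one CoT sample")
    return mean(sample_probs)

def accuracy(comparisons):
    """Fraction of binary comparisons where the ensembled choice matches the label.

    comparisons: list of (sample_probs, a_is_better) pairs, where
    a_is_better is the ground-truth label for that comparison.
    """
    correct = 0
    for sample_probs, a_is_better in comparisons:
        # Assumed decision rule: pick A when the averaged probability exceeds 0.5.
        predicted_a = ensemble_preference(sample_probs) > 0.5
        correct += predicted_a == a_is_better
    return correct / len(comparisons)

# Toy data (made up): three comparisons, three CoT samples each.
toy = [
    ([0.9, 0.4, 0.8], True),
    ([0.3, 0.6, 0.2], False),
    ([0.55, 0.35, 0.45], True),
]

print(f"ensembled accuracy: {accuracy(toy):.3f}")
```

Averaging probabilities rather than taking a majority vote keeps information about how confident each sampled chain was; either choice would fit the one-line description above.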

Source paper

extracted_from

CAT'S THEORY: Empirical Validation and Architectural Applications Cross-Architecture AI Consciousness Recognition and the Foundation for Constraint-Preserving Recursive Intelligence

(2022) · Bai, Yuntao · Saurav Kadavath · Sandipan Kundu · Amanda Askell +47

Neighborhood — ranked by edge-count

Claims (1)

claim

Chain-of-thought reasoning improves the transparency and performance of AI decision making in harmlessness evaluation.
supports
CoT improves accuracy on HHH evals and makes the decision process legible.

Communities (3)

community

Mechanistic interpretability & model evaluation
members_of
Spans attention head decomposition, benchmark awareness, and genomic pathogenicity prediction via neural models.
Internal model certainty and reasoning transparency
members_of
Probing early detection of model confidence during chain-of-thought reasoning to optimize inference efficiency and identify confabulation patterns.
Chain-of-thought reasoning versus internal model cognition
members_of
Examines whether verbalized reasoning chains reflect actual internal computation or post-hoc rationalization, using behavioral analysis and representation studies.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

A small number of high-quality human demonstrations of chain-of-thought reasoning could be used to improve and focus performance. · hypothesis · 0.830
Section 6 mentions high-quality human demos could improve natural language feedback.
Chain-of-thought prompting elicits reasoning in large language models (Wei et al., 2022) · concept · 0.811
Foundational paper on CoT prompting, cited as the basis for reasoning LLM training.
Chain-of-Thought Reasoning · concept · 0.780
Medium through which eval awareness is often verbalized; target of intervention.
Under what conditions does chain-of-thought reflect genuine uncertainty resolution versus a learned performance? · question · 0.776
Key question addressed by the task difficulty analysis comparing MMLU and GPQA-Diamond.
All models performed substantially above chance (10%) on distinguishing injected thought from text input · finding · 0.762
All tested models could both identify the injected concept and transcribe the input sentence well above random.
The results of abductive reasoning (reduced model priors) can be communicated to other agents as prior beliefs, provided all agents share the same model lexicon or hypothesis space. · claim · 0.762
Explanation of how knowledge (not just parameters) is shared between agents; links to pre-Cartesian consciousness.
Steering's effect on verbalized evaluation/deployment beliefs in chain-of-thought is highly correlated with its effect on type hint rate across hyperparameter configurations · finding · 0.757
Validates using chain-of-thought belief monitoring as proxy for behavioral steering efficacy.
Does chain-of-thought text faithfully reveal a model's internal reasoning process, or does it constitute performative theater? · question · 0.757
Central research question motivating the paper.