B10 final accuracy 94.8 ± 1.2%

Accuracy at k=16 shots for B10.

Source paper

extracted_from

(2025) · Edward Yi Chang · Kaya, Zeyneb N. · Ethan Chang

community

Chain-of-Thought reasoning robustness & safety
members_of
CoT effects on generalization, multimodal QA accuracy, and AI safety alignment training.
Multimodal chain-of-thought reasoning benchmarks
members_of
ScienceQA and related vision-language tasks evaluated via explicit reasoning steps, spanning 738M-parameter models with 89-95% accuracy ranges.
Benchmark classification accuracy results
members_of
Three benchmarks (B8, B9, B10) with mean accuracy and standard deviation metrics.

question

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

B8 final accuracy 92.4 ± 1.8% · finding · 0.887
Accuracy at k=16 shots for B8.
B9 final accuracy 89.7 ± 2.1% · finding · 0.867
Accuracy at k=16 shots for B9.
Binary detection adjusted accuracy reaches 97.3% at layer 0 with α=5 before baseline control is applied · finding · 0.791
The misleadingly high result that the prior paradigm would report as evidence of introspection.
B10 shot midpoint k50 = 0.28 ± 0.05 shots with accuracy 94.8 ± 1.2% · finding · 0.763
Lowest-threshold condition in E2; the near-zero/one-shot threshold is consistent with high pretraining density.
Binary detection accuracy (up to 97.3% at L0 α=5) is entirely explained by global logit shifts (r=0.999 correlation with control) · finding · 0.748
Core negative result: the binary detection paradigm cannot distinguish genuine introspection from uniform output bias.
Strength comparison accuracy averages 47% at layers 15-30, indistinguishable from 50% chance · finding · 0.735
Shows the collapse of introspective capability at later layers in the strength comparison task.
B10 phase width Δk = 1.21 ± 0.18 · finding · 0.734
Transition width (k90 – k10) for B10.
Top-5 instructions by µ(1→2) at ℓ=12 achieve average cosine similarity 0.9893 and average accuracy 0.5645 on gsm8k_adv for Gemma3-4B-IT · finding · 0.733
High cosine similarity for Gemma3 steering vectors suggests strong linear reflection structure.
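
The neighborhood criterion stated above (cosine ≥ 0.65, no typed edge) can be illustrated with a minimal sketch. It assumes each entity has an embedding vector and that typed relations are stored as pairs of entity IDs. The names `untyped_neighbors`, `embeddings`, and `typed_edges` are hypothetical and do not come from the source system, which may compute its candidates differently.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def untyped_neighbors(target_id, embeddings, typed_edges, threshold=0.65):
    """Return (entity_id, similarity) pairs that meet the threshold and
    have no typed edge to the target, sorted by descending similarity.

    embeddings:  dict mapping entity ID -> vector (hypothetical layout)
    typed_edges: set of (id_a, id_b) pairs, checked in both directions
    """
    target = embeddings[target_id]
    candidates = []
    for entity_id, vector in embeddings.items():
        if entity_id == target_id:
            continue
        # Skip entities that already have a typed relation to the target.
        if (target_id, entity_id) in typed_edges or (entity_id, target_id) in typed_edges:
            continue
        score = cosine(target, vector)
        if score >= threshold:
            candidates.append((entity_id, score))
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates
```

Called with toy two-dimensional vectors, an entity that points in nearly the same direction as the target is returned, while an orthogonal entity and an entity already linked by a typed edge are both filtered out.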