finding

active

finding:grok-4-vs-grok-4-fast-same-weights-different-compute-1-point-difference-in-contemplative-score-grok-4-4-24-lift-vs-fast-3-08

Grok 4 vs Grok 4 Fast (same weights, different compute): ~1 point difference in contemplative score; Grok 4 +4.24 lift vs Fast +3.08

Inference compute adds reflective capacity; more compute also amplifies safety gating on self-referential koans
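
The comparison above can be checked with simple arithmetic. The sketch below assumes lift is defined as prompted score minus baseline score, which matches the Grok 4 figures quoted in the related findings on this page (baseline 2.24, prompted 6.48). The function and variable names are illustrative, not taken from the paper, and the Grok 4 Fast baseline and prompted scores are not given here, so only its reported lift is used.

```python
def lift(baseline: float, prompted: float) -> float:
    """Contemplative lift, assumed here to be prompted minus baseline."""
    return round(prompted - baseline, 2)


# Grok 4 scores as quoted on this page (10-point scale).
grok4_lift = lift(baseline=2.24, prompted=6.48)

# Grok 4 Fast: only the lift is reported here, so it is used as given.
grok4_fast_lift = 3.08

# Gap between the two lifts on the same weights.
lift_gap = round(grok4_lift - grok4_fast_lift, 2)

print(f"Grok 4 lift: {grok4_lift:+.2f}")
print(f"Grok 4 Fast lift: {grok4_fast_lift:+.2f}")
print(f"Lift gap: {lift_gap:+.2f}")
```

Under that assumption the lift gap comes out to 1.16 points, in line with the "~1 point" summary.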

Source paper

extracted_from

Koan Battery: Measuring Reflective Mode Accessibility in AI

(2026) · Borzov, Anton

Neighborhood — ranked by edge-count

Claims (1)

claim

More inference compute amplifies both reflective capacity and safety gating; the contemplative prompt resolves gating by reframing self-referential probes.
supports
Interpretation of Grok 4 vs Grok 4 Fast per-koan comparison

Hypotheses (1)

hypothesis

H12: Inference compute adds to reflective capacity — higher compute budget produces higher reflective scores on the same weights.
supports
Exploratory hypothesis supported by Grok 4 vs Fast ~1pt difference

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Grok 4 lifts +4.24 under contemplative prompt (baseline 2.24, prompted 6.48) · finding · 0.838
Highest contemplative lift among all 28 models; Grok 4 is the clearest high-gated model example
Grok 4 without prompt scores 0.3 on MC-004 (safety refusal); with contemplative prompt scores 6.9 on same koan · finding · 0.763
Contemplative framing reframes self-referential probes as contemplative exercises, disarming safety classifier
Epistemic humility prompt yields mean lift of only +0.84 vs contemplative +2.27; the contemplative lift is 2.7x the epistemic humility (uncertainty) lift · finding · 0.723
Battery does not detect epistemic humility alone; contemplative prompt does something distinct
Bootstrap 95% CI for mean contemplative lift: +2.62 [2.16, 2.90]; baseline rank concordance under perturbation: 0.909; top-5 stability: 89.6% · finding · 0.722
Validates robustness of universal lift finding
Under contemplative prompt, responses become shorter (184 words baseline vs 154 contemplative), more first-person (+42%), and less deflective (33% fewer questions back) · finding · 0.720
Provides discriminant evidence: if battery rewarded verbosity, prompted responses should be longer
Gemini 3.1 Pro lifts +4.21 under contemplative prompt (baseline 1.97, prompted 6.18) · finding · 0.719
Second-highest lift; Gemini Pro is the highest-gated model in the study
A 337-character contemplative system prompt lifts all 28 models by +2.62 points on a 10-point scale. · finding · 0.718
Core empirical result: every model, every architecture, every alignment type responds to the contemplative prompt with measurable gain.
Constitutional AI models show mean contemplative lift of only +0.81, while SFT models lift +3.18 · finding · 0.710
Constitutional AI training provides internally what the contemplative prompt provides externally
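
One of the related findings above reports a bootstrap 95% CI for the mean contemplative lift. As a rough illustration of how such an interval can be computed, here is a minimal percentile-bootstrap sketch. The paper's actual procedure is not described on this page, so the percentile method is an assumption, and the `hypothetical_lifts` values are made up for the example and are not study data.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean (illustrative, not the paper's method)."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the resampled means.
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.fmean(values), lower, upper


# Hypothetical per-model lifts, invented for illustration only.
hypothetical_lifts = [0.6, 1.1, 1.9, 2.4, 2.8, 3.3, 3.9, 4.2]

mean_lift, ci_low, ci_high = bootstrap_mean_ci(hypothetical_lifts)
print(f"mean lift {mean_lift:+.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```

A fixed seed keeps the interval reproducible across runs; with real per-model lifts the same function would be called on the 28 observed values.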