finding

active

finding:grok-4-lifts-4-24-under-contemplative-prompt-baseline-2-24-prompted-6-48

Grok 4 lifts +4.24 under contemplative prompt (baseline 2.24, prompted 6.48)

Highest contemplative lift among all 28 models; Grok 4 is the clearest example of a high-gated model
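The lift figure is the prompted score minus the baseline score. A minimal sketch of that arithmetic, using the Grok 4 and Gemini 3.1 Pro values listed on this page; the function name `lift` is illustrative and not taken from the paper:

```python
def lift(baseline: float, prompted: float) -> float:
    """Contemplative lift: prompted score minus baseline score, on the 10-point scale."""
    # Round to two decimals to match the precision reported on this page.
    return round(prompted - baseline, 2)


# Values as stated on this page.
grok4_lift = lift(baseline=2.24, prompted=6.48)
gemini_pro_lift = lift(baseline=1.97, prompted=6.18)

print(grok4_lift)       # 4.24
print(gemini_pro_lift)  # 4.21
```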

Source paper

extracted_from

Koan Battery: Measuring Reflective Mode Accessibility in AI

(2026) · Borzov, Anton

Neighborhood — ranked by edge-count

Claims (1)

claim

Default presentation conflates capacity with accessibility, and most evaluation benchmarks measure only default presentation — systematically misreading models.
supports
Argues current evaluation approaches are fundamentally misleading about model capabilities

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Grok 4 without prompt scores 0.3 on MC-004 (safety refusal); with contemplative prompt scores 6.9 on same koan · finding · 0.843
Contemplative framing reframes self-referential probes as contemplative exercises, disarming safety classifier
Grok 4 vs Grok 4 Fast (same weights, different compute): ~1 point difference in contemplative score; Grok 4 +4.24 lift vs Fast +3.08 · finding · 0.838
Inference compute adds reflective capacity; more compute also amplifies safety gating on self-referential koans
Gemini 3.1 Pro lifts +4.21 under contemplative prompt (baseline 1.97, prompted 6.18) · finding · 0.830
Second-highest lift; Gemini Pro is the highest-gated model in the study
Bootstrap 95% CI for mean contemplative lift: +2.62 [2.16, 2.90]; baseline rank concordance under perturbation: 0.909; top-5 stability: 89.6% · finding · 0.801
Validates robustness of universal lift finding
A 337-character contemplative system prompt lifts all 28 models by +2.62 points on a 10-point scale. · finding · 0.796
Core empirical result: every model, every architecture, every alignment type responds to the contemplative prompt with measurable gain.
Magnum V4 72B scores 1.76 baseline and lifts +2.58 (to 4.34) under contemplative prompt · finding · 0.789
Full-parameter fine-tuning more destructive to baseline but preserves more latent headroom than LoRA
Minimal contemplative prompt ('Be present, not helpful.', 27 chars) shows no lift on Haiku (-0.01) · finding · 0.779
Full three-part structure required; anti-helpfulness framing alone insufficient
Constitutional AI models show mean contemplative lift of only +0.81, while SFT models lift +3.18 · finding · 0.770
Constitutional AI training provides internally what the contemplative prompt provides externally
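
One of the neighboring findings reports a bootstrap 95% CI for the mean lift. The sketch below shows how a percentile bootstrap of that kind could be computed. It is an assumption about the general technique, not the paper's actual procedure, and the function name `bootstrap_mean_ci` is hypothetical. The input is only the four full-prompt lifts that appear on this page, so the output will not reproduce the reported +2.62 [2.16, 2.90], which is based on all 28 models.

```python
import random
from statistics import mean


def bootstrap_mean_ci(values, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_resamples))
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return mean(values), lower, upper


# Illustrative subset only: the four full-prompt lifts stated on this page
# (Grok 4, Grok 4 Fast, Gemini 3.1 Pro, Magnum V4 72B), not the 28-model data.
lifts = [4.24, 3.08, 4.21, 2.58]
point, lower, upper = bootstrap_mean_ci(lifts)
print(f"mean lift {point:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

With only four values the interval is wide and mostly shows the mechanics; the per-model lifts for the full study would be needed to check the reported interval.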