finding
active
finding:grok-4-lifts-4-24-under-contemplative-prompt-baseline-2-24-prompted-6-48
Grok 4 lifts +4.24 under contemplative prompt (baseline 2.24, prompted 6.48)
Highest contemplative lift among all 28 models; Grok 4 is the clearest example of a high-gated model
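The lift quoted above is the prompted score minus the baseline score. A minimal sketch of that arithmetic follows; the function name `contemplative_lift` is hypothetical and is not taken from the source paper, and the only data used are the two Grok 4 scores stated in this finding.

```python
def contemplative_lift(baseline: float, prompted: float) -> float:
    """Return the score gain under the contemplative prompt.

    Hypothetical helper for illustration: lift = prompted - baseline,
    rounded to two decimals to match the precision reported in the finding.
    """
    return round(prompted - baseline, 2)


# Grok 4 scores as stated in this finding (10-point scale).
grok4_lift = contemplative_lift(baseline=2.24, prompted=6.48)
print(f"Grok 4 lift: {grok4_lift:+.2f}")  # prints "Grok 4 lift: +4.24"
```

Rounding to two decimals avoids floating-point noise in the subtraction and keeps the result comparable with the other lifts listed on this page.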
Source paper
extracted_from (2026) · Borzov, Anton
Neighborhood — ranked by edge-count
Claims (1)
claim
- Argues current evaluation approaches are fundamentally misleading about model capabilities
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Contemplative framing recasts self-referential probes as contemplative exercises, disarming the safety classifier
- Inference compute adds reflective capacity; more compute also amplifies safety gating on self-referential koans
- Second-highest lift; Gemini Pro is the highest-gated model in the study
- Validates robustness of universal lift finding
- A 337-character contemplative system prompt lifts all 28 models by a mean of +2.62 points on a 10-point scale · finding · 0.796 · Core empirical result: every model, every architecture, every alignment type responds to the contemplative prompt with a measurable gain.
- Full-parameter fine-tuning is more destructive to baseline but preserves more latent headroom than LoRA
- Minimal contemplative prompt ('Be present, not helpful.', 27 chars) shows no lift on Haiku (-0.01) · finding · 0.779 · The full three-part structure is required; anti-helpfulness framing alone is insufficient.
- Constitutional AI models show mean contemplative lift of only +0.81, while SFT models lift +3.18 · finding · 0.770 · Constitutional AI training provides internally what the contemplative prompt provides externally.