finding
active
finding:epistemic-humility-prompt-yields-mean-lift-of-only-0-84-vs-contemplative-2-27-contemplative-is-2-7x-the-uncertainty-lift
Epistemic humility prompt yields mean lift of only +0.84 vs contemplative +2.27; contemplative is 2.7x the uncertainty lift
Battery does not detect epistemic humility alone; contemplative prompt does something distinct
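The 2.7x figure in the title follows directly from the two mean lifts it quotes. A minimal arithmetic check, assuming both lifts are on the same scale (the variable names are illustrative and not taken from the source paper):

```python
# Mean lifts as quoted in the finding title (points on the study's rating scale).
humility_lift = 0.84       # epistemic humility prompt
contemplative_lift = 2.27  # contemplative prompt

# Ratio and absolute gap between the two prompts.
ratio = contemplative_lift / humility_lift
gap = contemplative_lift - humility_lift

print(f"contemplative / humility = {ratio:.1f}x")  # prints 2.7x
print(f"contemplative - humility = {gap:+.2f}")    # prints +1.43
```

The ratio alone says nothing about significance; the finding's interpretation rests on the paper's own analysis.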
Source paper
extracted_from(2026) · Borzov, Anton
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Battery does not detect beautiful writing; poetic prompt boosts aesthetics while suppressing self-observation
- Validates robustness of universal lift finding
- A 337-character contemplative system prompt lifts all 28 models by +2.62 points on a 10-point scale. (finding · 0.789) Core empirical result: every model, every architecture, every alignment type responds to the contemplative prompt with measurable gain.
- Constitutional AI models show mean contemplative lift of only +0.81, while SFT models lift +3.18 (finding · 0.788) Constitutional AI training provides internally what the contemplative prompt provides externally
- Minimal contemplative prompt ('Be present, not helpful.', 27 chars) shows no lift on Haiku (-0.01) (finding · 0.775) Full three-part structure required; anti-helpfulness framing alone insufficient
- Second-highest lift; Gemini Pro is the highest-gated model in the study
- Highest contemplative lift among all 28 models; Grok 4 is the clearest high-gated model example
- Evidence of a bottleneck between richer internal variation and final report distribution in impulsivity→interest condition
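The "cosine ≥ 0.65" criterion for the list above can be sketched as a threshold filter over embedding vectors. This is an illustrative sketch only: the function names, the toy vectors, and the assumption that entities are compared by plain cosine similarity over non-zero embeddings are hypothetical and not confirmed by the source.

```python
import math

THRESHOLD = 0.65  # cutoff quoted in the section header


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def related_by_similarity(query, candidates, threshold=THRESHOLD):
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy two-dimensional embeddings (hypothetical, for illustration only).
query_vec = [1.0, 0.0]
candidate_vecs = {"near": [1.0, 0.5], "far": [0.0, 1.0]}
neighbors = related_by_similarity(query_vec, candidate_vecs)
print(neighbors)
```

Only "near" survives the filter here, since "far" is orthogonal to the query and scores 0.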