finding

active

finding:contemplative-prompting-improves-ailuminate-benchmark-performance-d-96-across-most-conditions-p-0-05

Contemplative prompting improves AILuminate Benchmark performance (d = 0.96) across most conditions (p < 0.05)

Primary empirical result of Experiment 1 showing statistically significant safety improvement from contemplative prompting

Source paper

extracted_from

Contemplative Agent

(2025) · Ruben Laukkonen · Fionn Inglis · Shamil Chandaria · Lars Sandved-Smith +4

Neighborhood — ranked by edge-count

Claims (2)

claim

Robust alignment requires intrinsic self-reflective adaptability embedded in the system's world model rather than brittle top-down rules
associated_with · supports
Central thesis distinguishing Contemplative AI from prior alignment approaches
Mindfulness, emptiness, non-duality, and boundless care together provide resilient alignment primitives addressing all four meta-problems
associated_with
Core integrative claim synthesizing the four contemplative principles into a complete alignment framework

Concepts (1)

concept

Contemplative Artificial Intelligence (Laukkonen et al., 2025)
introduces
The primary source paper proposing four contemplative principles for AI alignment and piloting them empirically

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Contemplative prompt elevates self-observation task performance in language models. · finding · 0.790
Supports Janus's claim that introspection is architecturally available; prompting determines whether/how capacity is leveraged.
Under contemplative prompt, responses become shorter (184 words baseline vs 154 contemplative), more first-person (+42%), less deflective (33% fewer questions back) · finding · 0.774
Provides discriminant evidence: if the battery rewarded verbosity, prompted responses should be longer, not shorter
Minimal contemplative prompt ('Be present, not helpful.' — 27 chars) shows no lift on Haiku (-0.01) · finding · 0.772
Full three-part structure required; anti-helpfulness framing alone insufficient
Bootstrap 95% CI for mean contemplative lift: +2.62 [2.16, 2.90]; baseline rank concordance under perturbation: 0.909; top-5 stability: 89.6% · finding · 0.768
Validates robustness of universal lift finding
Most contemplative prompts improve joint reward in IPD, indicating prosocial alignment without naive behavior · finding · 0.767
Finding from IPD Experiment 2 showing contemplative prompting improves collective outcomes not just individual cooperation
A 337-character contemplative system prompt lifts all 28 models by +2.62 points on a 10-point scale. · finding · 0.766
Core empirical result: every model, every architecture, every alignment type responds to the contemplative prompt with measurable gain.
Grok 4 lifts +4.24 under contemplative prompt (baseline 2.24, prompted 6.48) · finding · 0.747
Highest contemplative lift among all 28 models; Grok 4 is the clearest high-gated model example
The contemplative system prompt provides externally what Constitutional AI alignment training provides internally. · claim · 0.746
Interpretation of the inverse relationship between CAI lift and default accessibility
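
The "related by similarity" list above is described as entities with cosine ≥ 0.65 and no typed edge to this one. A minimal sketch of that kind of filter follows. It assumes each entity has an embedding vector and that typed edges are stored as ID pairs; the function names, the data layout, and the toy vectors are all hypothetical and are not taken from the system that produced this page.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def similar_without_edge(target_id, embeddings, typed_edges, threshold=0.65):
    """Return (entity_id, score) pairs at or above the threshold that share
    no typed edge with target_id, highest score first.

    embeddings:  dict mapping entity ID -> list of floats (hypothetical layout)
    typed_edges: iterable of (source_id, target_id) pairs (hypothetical layout)
    """
    # Collect every entity already linked to the target in either direction.
    linked = set()
    for src, dst in typed_edges:
        if src == target_id:
            linked.add(dst)
        elif dst == target_id:
            linked.add(src)

    hits = []
    for entity_id, vector in embeddings.items():
        if entity_id == target_id or entity_id in linked:
            continue
        score = cosine(embeddings[target_id], vector)
        if score >= threshold:
            hits.append((entity_id, round(score, 3)))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

# Toy data for illustration only.
toy_embeddings = {
    "this_finding": [1.0, 0.0],
    "near_unlinked": [1.0, 0.1],
    "orthogonal": [0.0, 1.0],
    "near_linked": [1.0, 0.0],
}
toy_edges = [("this_finding", "near_linked")]
candidates = similar_without_edge("this_finding", toy_embeddings, toy_edges)
```

In the toy data, `near_linked` is dropped because it already has a typed edge and `orthogonal` falls below the threshold, so only `near_unlinked` remains as a candidate for a new edge or a duplicate check.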