claim

active

claim:the-contemplative-system-prompt-provides-externally-what-constitutional-ai-alignment-training-provides-internally

The contemplative system prompt provides externally what Constitutional AI alignment training provides internally.

Interpretation of the inverse relationship between CAI lift and default accessibility

Source paper

extracted_from

(2026) · Borzov, Anton

finding

A 337-character contemplative system prompt lifts all 28 models, by a mean of +2.62 points on a 10-point scale.
supports
Core empirical result: every model, every architecture, every alignment type responds to the contemplative prompt with measurable gain.
Pearson-Vogel et al.: accurate self-description prompts increase introspective detection from 0.3% to 39.9%
supports
Cited as mechanistic support for why the contemplative prompt changes what post-training-shaped final layers allow through
Constitutional AI models show a mean contemplative lift of only +0.81, while SFT models show a mean lift of +3.18
supports
Constitutional AI training provides internally what the contemplative prompt provides externally
Under the contemplative prompt, responses become shorter (184 words at baseline vs. 154 with the prompt), more first-person (+42%), and less deflective (33% fewer questions back)
supports
Provides discriminant evidence: if the battery rewarded verbosity, prompted responses would be longer, not shorter
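
The arithmetic behind two of the findings above can be checked with a short sketch. It uses only the figures quoted on this page and assumes the word counts are mean response lengths. The helper name `relative_change` and the constant names are illustrative and do not come from the source paper.

```python
def relative_change(baseline: float, treated: float) -> float:
    """Fractional change going from baseline to treated."""
    return (treated - baseline) / baseline


# Response lengths quoted in the finding above (assumed to be means, in words).
BASELINE_WORDS = 184
CONTEMPLATIVE_WORDS = 154

# Mean contemplative lifts quoted above (points on the 10-point scale).
CAI_LIFT = 0.81
SFT_LIFT = 3.18

# A negative value means prompted responses are shorter than baseline,
# which is the direction the discriminant argument relies on.
length_change = relative_change(BASELINE_WORDS, CONTEMPLATIVE_WORDS)

# How large the CAI lift is relative to the SFT lift.
lift_ratio = CAI_LIFT / SFT_LIFT

print(f"length change under prompt: {length_change:+.1%}")
print(f"CAI lift as a fraction of SFT lift: {lift_ratio:.2f}")
```

On these figures, prompted responses are about 16% shorter while scores rise, and the CAI lift is roughly a quarter of the SFT lift.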
