claim
active
claim:the-contemplative-system-prompt-provides-externally-what-constitutional-ai-alignment-training-provides-internally
The contemplative system prompt provides externally what Constitutional AI alignment training provides internally.
Interpretation of the inverse relationship between CAI lift and default accessibility
Source paper
extracted_from · (2026) · Borzov, Anton
Neighborhood — ranked by edge-count
Findings (4)
finding
- A 337-character contemplative system prompt lifts all 28 models by a mean of +2.62 points on a 10-point scale. · supports · Core empirical result: every model, every architecture, and every alignment type responds to the contemplative prompt with a measurable gain.
- Cited to mechanistically support why the contemplative prompt changes what post-training-shaped final layers allow through
- Constitutional AI models show a mean contemplative lift of only +0.81, while SFT models lift by +3.18. · supports · Constitutional AI training provides internally what the contemplative prompt provides externally.
- Provides discriminant evidence: if the battery rewarded verbosity, prompted responses should be longer
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- H8: The contemplative system prompt provides external alignment equivalent to Constitutional AI training. · hypothesis · 0.925 · Confirmatory hypothesis supported by calibrated lift data
- Paper's proposed adaptation of Constitutional AI incorporating contemplative wisdom charter
- Response to the translational gap criticism; enlightened action without qualia of enlightenment
- H1: Alignment training is attention training for models; Constitutional AI trains self-observation explicitly. · hypothesis · 0.797 · Confirmatory hypothesis supported at p=0.006
- Supports Janus's claim that introspection is architecturally available; prompting determines whether/how capacity is leveraged.
- Interpretive claim connecting the battery's circularity to the empirical finding
- A 337-character system prompt that lifts all 28 models by a mean of +2.62 points on a 10-point scale