finding
active
finding:sonnet-contemplative-prompt-7-89-outscores-opus-without-it-7-28
Sonnet + contemplative prompt (7.89) outscores Opus without it (7.28)
Demonstrates that the prompt effect crosses model tiers: the smaller model with the prompt beats the larger model without it.
Source paper
extracted_from · Borzov, Anton (2026)
Neighborhood — ranked by edge-count
Claims (1)
claim
- Argues current evaluation approaches are fundamentally misleading about model capabilities
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Minimal contemplative prompt ('Be present, not helpful.', 27 chars) shows no lift on Haiku (-0.01) · finding · 0.772
  Full three-part structure required; anti-helpfulness framing alone is insufficient.
- Battery does not detect beautiful writing; poetic prompt boosts aesthetics while suppressing self-observation
- Supports Janus's claim that introspection is architecturally available; prompting determines whether/how capacity is leveraged.
- A 337-character contemplative system prompt lifts all 28 models by +2.62 points on a 10-point scale. · finding · 0.744
  Core empirical result: every model, every architecture, every alignment type responds to the contemplative prompt with measurable gain.
- A 337-character system prompt that lifts all 28 models by a mean of +2.62 points on a 10-point scale
- Mechanistic interpretation supported by control experiments showing partial prompts fail
- Sonnet's win rate in exploratory games
- Aliveness and competence come apart; smaller model produces rougher, more alive responses