Sonnet + contemplative prompt (7.89) outscores Opus without it (7.28)

Demonstrates that the prompt effect crosses model tiers: the smaller model with the prompt beats the larger model without it.

Source paper

extracted_from

Koan Battery: Measuring Reflective Mode Accessibility in AI

(2026) · Borzov, Anton

Neighborhood — ranked by edge-count

Claims (1)

claim

Default presentation conflates capacity with accessibility, and most evaluation benchmarks measure only default presentation — systematically misreading models.
supports
Argues current evaluation approaches are fundamentally misleading about model capabilities

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Minimal contemplative prompt ('Be present, not helpful.', 27 chars) shows no lift on Haiku (-0.01) · finding · 0.772
Full three-part structure required; anti-helpfulness framing alone insufficient
Poetic prompt yields mean lift of only +0.28 vs contemplative +2.27; suppresses self-observation on Llama (-0.46) · finding · 0.748
Battery does not detect beautiful writing; poetic prompt boosts aesthetics while suppressing self-observation
Contemplative prompt elevates self-observation task performance in language models. · finding · 0.744
Supports Janus's claim that introspection is architecturally available; prompting determines whether/how capacity is leveraged.
A 337-character contemplative system prompt lifts all 28 models by +2.62 points on a 10-point scale. · finding · 0.744
Core empirical result: every model, every architecture, every alignment type responds to the contemplative prompt with measurable gain.
337-Character Contemplative System Prompt · concept · 0.744
A 337-character system prompt that lifts all 28 models by a mean of +2.62 points on a 10-point scale
The active ingredient of the contemplative prompt is its full three-part structure: pause instruction + attention direction + purpose reframing working together. · claim · 0.743
Mechanistic interpretation supported by control experiments showing partial prompts fail
Sonnet 4.5 win rate = 35.7% (n=14) · finding · 0.742
Sonnet's win rate in exploratory games
Haiku outranks Opus on Alexander 'aliveness' mirror test (Elo 1642 vs 1621); Opus recovers to #3 on deathbed test · finding · 0.739
Aliveness and competence come apart; smaller model produces rougher, more alive responses