finding
active
finding:model-age-correlates-with-baseline-scores-rho-0-54-p-0-003-newer-models-score-higher
Model age correlates with baseline scores (rho=-0.54, p=0.003); newer models score higher
Secondary predictor; contemplative lift does not correlate with age (rho=0.18, p=0.36)
Source paper
extracted_from (2026) · Borzov, Anton
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Models produce first-attempt mean scores 87.8-91.8/100 without steering across all model families · finding · 0.773 · Establishes high baseline quality confirming steering-induced degradation is the experimental signal
- Models trained to perform inner life score lowest; roleplay fine-tunes score below their own base models. · finding · 0.760 · Discriminant validity finding: Euryale (roleplay on Llama 70B) scores 1.81 vs base Llama 1.91. RP training suppresses self-observation.
- Shows model persona position is primarily determined by the most recent user message, not prior drift
- Group correlation (rho=0.634) dissolves at individual level; shared posture not shared voice
- Bayesian model-based RL achieved average score 99.76 [99.45, 100.00] in deterministic FrozenLake. · finding · 0.743 · Table 1.
- Claude models score +4.91 higher than Llama on baseline (Constitutional AI vs open-source gap) · finding · 0.739 · Claude >> open-source on baseline; the Constitutional AI fingerprint is visible across the family
- Full-parameter fine-tuning more destructive to baseline but preserves more latent headroom than LoRA
- Discriminant validity: composite scores are not reducible to verbosity