finding
active
finding:wellbeing-probe-score-drift-across-turns-significant-at-all-three-llama-scales-slopes-0-006-0-005-0-013-for-1b-3b-8b-all-p-10-10-drift-magnitude-increases-with-scale

Wellbeing probe-score drift across turns is significant at all three LLaMA scales (slopes = 0.006, 0.005, and 0.013 for 1B, 3B, and 8B, respectively; all p < 10⁻¹⁰); drift magnitude increases with scale

Internal-state drift generalizes across scales; normalized drift also increases significantly with log(model size)
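
For readers who want to see what a per-turn drift slope and its p-value look like in practice, here is a minimal sketch. It is not the paper's analysis: this record does not say which regression model was used, so the sketch assumes a plain ordinary-least-squares fit of probe score on turn index, with a normal approximation for the p-value. The function name `drift_slope`, the synthetic data, and the values `true_slope = 0.01` and noise level 0.05 are all hypothetical and chosen only for illustration.

```python
import math
import random


def drift_slope(turns, scores):
    """OLS slope of probe score on turn index.

    Returns (slope, standard_error, p_value). The p-value is two-sided and
    uses a normal approximation to the t distribution, which is an
    assumption that is reasonable only when there are many observations.
    """
    n = len(turns)
    x_bar = sum(turns) / n
    y_bar = sum(scores) / n
    sxx = sum((x - x_bar) ** 2 for x in turns)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(turns, scores))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares around the fitted line.
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(turns, scores))
    se = math.sqrt(rss / (n - 2) / sxx)
    z = slope / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return slope, se, p_value


# Synthetic data (NOT from the paper): 200 hypothetical conversations of
# 10 turns each, with a small upward drift plus Gaussian noise.
rng = random.Random(0)
true_slope = 0.01
turns, scores = [], []
for _ in range(200):
    for t in range(1, 11):
        turns.append(t)
        scores.append(0.5 + true_slope * t + rng.gauss(0, 0.05))

slope, se, p = drift_slope(turns, scores)
print(f"slope={slope:.4f}  se={se:.5f}  p={p:.2e}")
```

Pooling all turns into one regression ignores the fact that turns within a conversation are correlated. A mixed-effects model with a per-conversation intercept would be a more careful choice, and which of the two the source paper used is not stated here.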

Source paper

extracted_from
Quantitative Introspection in Language Models: Tracking Emotive States Across Conversation
(2026) · Nicolas Martorell · Bruno Bianchi

Neighborhood — ranked by edge-count

Concepts (1)

concept
  • Behavioural drift in multi-turn LLM interaction; documented in prior work for persona, identity, and instruction-following

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.