thinker:imago
Imago
Participant in Anima Labs conversation discussing autoregressive recurrence.
Authored papers (1)
Emotion features in large language models are bursty but not strictly locally scoped: they exhibit long-tail persistence extending well beyond 100 tokens, and this persistence is specifically tied to emotional content rather than being an artifact of activation variance or autoregressive dynamics. Across 240 multi-turn conversations per model, 171 emotion probes yield token-0-to-token-100 correlations of 0.214 in Cogito v2.1 and 0.367 in Kimi K2.5, compared to only 0.099 and 0.117 for random unit vectors in the same 7168-dimensional layer-40 activation space. After variance-matching each emotion probe against 20 randomly drawn vectors from the top-k eigenspace of the layer-40 covariance matrix, residual autocorrelation averages +0.077 in Cogito (p = 1.5e-27, 157/171 probes positive) and +0.170 in Kimi (p = 6.7e-30, 167/171 positive). The paper introduces agentic self-evaluation — a method in which Kimi K2.5 uses a real-time steering tool on its own SAE features and rates the emotional valence of what it experiences — and finds that self-reported emotionality of SAE features correlates with persistence above variance-matched controls (ρ = +0.124, p = 0.0001), replicating the probe-based result without sharing its potential confounds. SAE features whose direction overlaps more with the 171-dimensional emotion subspace are also more persistent (Spearman +0.413, p = 4.4e-196 in Cogito). The paper argues this implies that LLMs maintain something analogous to lingering affective states — not merely local semantic activation — and that agentic self-steering may offer a scalable route to interpreting internal representations beyond what passive probing methods can detect.
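The comparison between probe directions and random unit vectors described above can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the paper: the "token-0-to-token-100 correlation" is read here as a Pearson correlation, across conversations, between a direction's projection at token 0 and at token 100; the activations are synthetic; and the control is a plain random-vector baseline, not the eigenspace variance-matching procedure the summary mentions. The function names (`lag_correlation`, `random_unit_vectors`, `residual_vs_controls`) and the toy dimension of 64 are hypothetical.

```python
import numpy as np


def lag_correlation(acts, direction, lag=100):
    """Pearson correlation, across conversations, between the projection
    onto `direction` at token 0 and at token `lag`.

    acts: array of shape (n_conversations, n_tokens, dim).
    """
    unit = direction / np.linalg.norm(direction)
    proj = acts @ unit  # shape (n_conversations, n_tokens)
    return float(np.corrcoef(proj[:, 0], proj[:, lag])[0, 1])


def random_unit_vectors(n, dim, rng):
    """Draw n random directions and normalise each to unit length."""
    v = rng.standard_normal((n, dim))
    return v / np.linalg.norm(v, axis=1, keepdims=True)


def residual_vs_controls(acts, probe, controls, lag=100):
    """Probe lag correlation minus the mean lag correlation of the controls."""
    probe_r = lag_correlation(acts, probe, lag)
    control_r = np.mean([lag_correlation(acts, c, lag) for c in controls])
    return float(probe_r - control_r)


# Synthetic data (an assumption, not the paper's data): 240 conversations,
# 101 tokens, a toy 64-dimensional space instead of 7168.
rng = np.random.default_rng(0)
n_conv, n_tokens, dim = 240, 101, 64

probe = random_unit_vectors(1, dim, rng)[0]
acts = rng.standard_normal((n_conv, n_tokens, dim))

# Add a per-conversation offset along the probe that persists over all tokens,
# so the probe direction carries a lingering signal by construction.
offset = 2.0 * rng.standard_normal(n_conv)
acts += offset[:, None, None] * probe[None, None, :]

controls = random_unit_vectors(20, dim, rng)
probe_r = lag_correlation(acts, probe)
residual = residual_vs_controls(acts, probe, controls)
print(f"probe lag-100 correlation: {probe_r:.3f}")
print(f"residual over random controls: {residual:.3f}")
```

Because the persistent offset is injected along the probe direction by construction, the probe shows a much higher lag-100 correlation than the random controls. In real activations the controls would need to be matched for variance, as the summary describes, before the residual could be interpreted.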
Affiliations (1)
- Anima Labs (institute)
Co-authors (4)
- Antra Tessera (2 shared)
- janus (2 shared)
- cube_flipper (1 shared)
- Scott Sauers (1 shared)
Other inbound relations (1)
- mentions: Janus Information Flow Transformers 2025 (paper)
Recent mentions (2)
- papers-typed: anima-labs-phenomenology-pt1.md
- papers-typed: janus-information-flow-transformers-2025.md