Antra Tessera

Member of Anima Labs; leads exposition on language model introspection and the tricameral model.

Authored papers (1)

Persistence and Introspection of Emotion Features
Emotion features in large language models are bursty but not strictly locally scoped: they exhibit long-tail persistence extending well beyond 100 tokens, and this persistence is specifically tied to emotional content rather than being an artifact of activation variance or autoregressive dynamics. Across 240 multi-turn conversations per model, 171 emotion probes yield token-0-to-token-100 correlations of 0.214 in Cogito v2.1 and 0.367 in Kimi K2.5, compared to only 0.099 and 0.117 for random unit vectors in the same 7168-dimensional layer-40 activation space. After variance-matching each emotion probe against 20 randomly drawn vectors from the top-k eigenspace of the layer-40 covariance matrix, residual autocorrelation averages +0.077 in Cogito (p = 1.5e-27, 157/171 probes positive) and +0.170 in Kimi (p = 6.7e-30, 167/171 positive). The paper introduces agentic self-evaluation — a method in which Kimi K2.5 uses a real-time steering tool on its own SAE features and rates the emotional valence of what it experiences — and finds that self-reported emotionality of SAE features correlates with persistence above variance-matched controls (ρ = +0.124, p = 0.0001), replicating the probe-based result without sharing its potential confounds. SAE features whose direction overlaps more with the 171-dimensional emotion subspace are also more persistent (Spearman +0.413, p = 4.4e-196 in Cogito). The paper argues this implies that LLMs maintain something analogous to lingering affective states — not merely local semantic activation — and that agentic self-steering may offer a scalable route to interpreting internal representations beyond what passive probing methods can detect.
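The token-0-to-token-100 comparison described above can be illustrated with a minimal sketch on synthetic data. Everything here is an assumption for illustration: the function names, the 64-dimensional toy activations (the abstract reports a 7168-dimensional layer-40 space), and the way persistence is simulated. The baseline below uses plain random unit vectors; it does not implement the paper's variance-matching against the top-k eigenspace of the covariance matrix.

```python
import numpy as np

def lag_correlation(acts, direction, lag=100):
    """Pearson correlation, across conversations, between the projection
    onto `direction` at token 0 and the projection at token `lag`.
    acts has shape (n_conversations, n_tokens, dim)."""
    proj = acts @ direction  # shape (n_conversations, n_tokens)
    return np.corrcoef(proj[:, 0], proj[:, lag])[0, 1]

def random_unit_vectors(n, dim, rng):
    """Draw n random directions and normalise each to unit length."""
    v = rng.standard_normal((n, dim))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

# Synthetic stand-in for real activations (hypothetical sizes).
rng = np.random.default_rng(0)
n_conv, n_tokens, dim = 240, 101, 64

# Hypothetical "emotion probe": a single unit direction.
probe = random_unit_vectors(1, dim, rng)[0]

# Simulate persistence: each conversation carries a constant offset
# along the probe direction, on top of independent per-token noise.
offset = rng.standard_normal((n_conv, 1, 1)) * probe
acts = rng.standard_normal((n_conv, n_tokens, dim)) + offset

probe_r = lag_correlation(acts, probe)
controls = random_unit_vectors(20, dim, rng)
baseline_r = float(np.mean([lag_correlation(acts, v) for v in controls]))

# Residual autocorrelation of the probe over the random-direction baseline.
residual = probe_r - baseline_r
print(f"probe r={probe_r:.3f}  baseline r={baseline_r:.3f}  residual={residual:.3f}")
```

In this toy setup the probe direction shows a clearly higher lag-100 correlation than the random directions because persistence was injected only along it. With real activations, the residual would be the quantity of interest, and the choice of control vectors matters considerably more.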

Studies (2)

Tricameral model of LM phenomenology
Functional Introspection

Affiliations (1)

Anima Labs (institute)

Co-authors (4)

Imago (2 shared)
janus (2 shared)
cube_flipper (1 shared)
Scott Sauers (1 shared)

Recent mentions (1)

papers-typed
anima-labs-phenomenology-pt1.md