finding

active

finding:lower-more-central-pcs-of-emotion-feature-activations-are-more-persistent-than-higher-rank-noisier-pcs-in-both-kimi-and-cogito-above-variance-matched-baselines

Lower (more central) PCs of emotion feature activations are more persistent than higher-rank (noisier) PCs in both Kimi and Cogito, above variance-matched baselines.

Supports the view that persistence is genuinely tied to emotion structure rather than being a measurement artifact

Source paper

extracted_from

Persistence and Introspection of Emotion Features

Scott Sauers · Imago · Janus · Antra Tessera

Neighborhood — ranked by edge-count

Claims (1)

claim

Emotion probes are more persistent than variance-matched random probes, indicating emotion-specific persistence beyond autoregressive dynamics.
supports
Core empirical claim distinguishing emotion persistence from generic high-variance probe persistence

Hypotheses (1)

hypothesis

If persistence is genuinely related to emotion features, lower PCs of the emotion space (more central, less noisy) should be more persistent; if it is an artifact, noisier PCs should have similar persistence.
associated_with · supports
Falsifiability test built into the PC analysis design
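
The test in this hypothesis can be sketched roughly as follows. This is a minimal illustration, not the paper's pipeline. It assumes activations arrive as an array of shape (conversations, tokens, features) and that persistence is measured as the token-0 to token-100 correlation across conversations. The function name `pc_persistence` and the synthetic data are hypothetical, and no variance-matched baseline is implemented here.

```python
import numpy as np

def pc_persistence(acts, t0=0, t1=100):
    """Per-PC correlation between scores at token t0 and token t1.

    acts: array of shape (n_conv, n_tokens, n_features).
    Returns one correlation per PC, ordered from highest to lowest variance.
    """
    n_feat = acts.shape[-1]
    flat = acts.reshape(-1, n_feat)
    mean = flat.mean(axis=0)
    # PCA via SVD of the centered data; rows of vt are PC directions.
    _, _, vt = np.linalg.svd(flat - mean, full_matrices=False)
    scores = (acts - mean) @ vt.T            # (n_conv, n_tokens, n_pc)
    a = scores[:, t0, :]
    b = scores[:, t1, :]
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    return (a * b).sum(axis=0) / np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))

# Synthetic stand-in data (an assumption, not real model activations):
# two latent dimensions carry a per-conversation offset that persists across
# tokens, the remaining six are i.i.d. noise.
rng = np.random.default_rng(0)
n_conv, n_tok, n_feat = 200, 101, 8
latent = rng.standard_normal((n_conv, n_tok, n_feat))
latent[:, :, :2] += 3.0 * rng.standard_normal((n_conv, 1, 2))
q, _ = np.linalg.qr(rng.standard_normal((n_feat, n_feat)))  # random orthogonal mixing
acts = latent @ q

r = pc_persistence(acts)
print("leading PCs:", r[:2])
print("trailing PCs:", r[2:])
```

Under the "genuine structure" reading, the leading PCs should show clearly higher correlations than the trailing ones; under the artifact reading, the two groups would look similar.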

Findings (1)

finding

Lower (more central) emotion PCs are more persistent than higher (noisier) PCs in both Kimi and Cogito
restates
Rules out the possibility that persistence is an artifact of probe construction, since noise dimensions are not similarly persistent

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Persistence is not an artifact of probe construction because lower (more central) emotion PCs are more persistent than noisier high-rank PCs · claim · 0.835
Rules out measurement artifact explanation for the persistence finding
Emotion probe persistence correlation of 0.214 in Cogito v2.1 vs 0.099 for random vectors · finding · 0.802
Quantifies emotion feature persistence above random baseline in Cogito across 240 multi-turn conversations
Emotion probe persistence (token-0 to token-100 correlation) in Cogito v2.1 is 0.214, compared to 0.099 for random unit vectors in 7168D space. · finding · 0.793
Quantitative measure of emotion feature persistence vs random baseline in Cogito
Cogito emotion probe residual autocorrelation +0.077 above variance-matched controls (p=1.5e-27, 157/171 probes positive) · finding · 0.792
Demonstrates that Cogito emotion probes are persistently active beyond what is explained by their variance alone
Emotion features in LLMs are genuinely more persistent than variance-matched random features, indicating stateful emotional encoding beyond autoregressive dynamics · claim · 0.781
Central interpretive claim of the paper supported by multiple convergent analyses
SAE feature emotion subspace overlap correlates with persistence in Cogito: Spearman +0.413, p=4.4e-196 · finding · 0.780
Demonstrates that SAE features more aligned with the emotion subspace are more persistent in Cogito after variance control
SAE emotion subspace overlap correlates with variance-residualized persistence in Cogito: Spearman +0.413, p = 4.4e-196. · finding · 0.780
Strong positive relationship between emotion alignment and SAE feature persistence in Cogito
Higher-activating feature intervals are systematically more interpretable than lower-activating intervals in human analysis · finding · 0.769
Shows that interpretability correlates with activation strength; most of the model effect comes from high activations
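
Several entries above compare probe persistence against random unit vectors (token-0 to token-100 correlation, 7168D space, 240 conversations). A minimal sketch of that kind of comparison might look like the following. The helper names `random_unit_vectors` and `token_correlation`, the single hypothetical probe direction, and the synthetic hidden states are all assumptions for illustration. The values it produces say nothing about the reported 0.214 and 0.099 figures.

```python
import numpy as np

def random_unit_vectors(n, dim, rng):
    """Draw n directions uniformly on the unit sphere in `dim` dimensions."""
    v = rng.standard_normal((n, dim))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def token_correlation(h_start, h_end, directions):
    """Pearson correlation, across conversations, between projections of the
    hidden state at two token positions onto each direction.

    h_start, h_end: (n_conv, dim); directions: (n_dir, dim).
    """
    a = h_start @ directions.T
    b = h_end @ directions.T
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    return (a * b).sum(axis=0) / np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))

# Synthetic stand-in hidden states (an assumption, not real model data):
# a per-conversation scalar state is written along one hypothetical probe
# direction at both token positions, on top of independent noise.
rng = np.random.default_rng(0)
n_conv, dim = 240, 7168
probe = random_unit_vectors(1, dim, rng)
state = 3.0 * rng.standard_normal((n_conv, 1))
h0 = state * probe + rng.standard_normal((n_conv, dim))
h100 = state * probe + rng.standard_normal((n_conv, dim))

probe_r = token_correlation(h0, h100, probe)[0]
random_r = token_correlation(h0, h100, random_unit_vectors(50, dim, rng))
print("probe direction:", probe_r)
print("random directions (mean):", random_r.mean())
```

Random unit vectors are not variance-matched to the probe, which is why the entries above also report comparisons against variance-matched controls.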

Restated by (1)

cosine ≥ 0.90

Other entities that say roughly the same thing. May be merge candidates or independent restatements across papers.

finding
Lower (more central) emotion PCs are more persistent than higher (noisier) PCs in both Kimi and Cogito