finding

active

finding:sae-emotion-subspace-overlap-correlates-with-variance-residualized-persistence-in-cogito-spearman-0-413-p-4-4e-196

SAE emotion subspace overlap correlates with variance-residualized persistence in Cogito: Spearman +0.413, p = 4.4e-196.

Strong positive relationship between emotion alignment and SAE feature persistence in Cogito
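One way to read "variance-residualized persistence" is sketched below, as an illustration rather than the paper's actual pipeline. The sketch assumes persistence is regressed linearly on each feature's variance explained and that the Spearman correlation is then taken between subspace overlap and the regression residuals. The function names (`average_ranks`, `spearman`, `residualize`) and the synthetic data are hypothetical; the source does not specify how the residualization was done.

```python
import numpy as np

def average_ranks(x):
    """Rank values from 1..n, giving tied values their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x))
    ranks[order] = np.arange(1, len(x) + 1)
    for value in np.unique(x):
        tied = x == value
        ranks[tied] = ranks[tied].mean()
    return ranks

def spearman(x, y):
    """Spearman rho: Pearson correlation of the rank-transformed inputs."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])

def residualize(y, covariate):
    """Residuals of y after an ordinary least squares fit on covariate.

    ASSUMPTION: a linear fit with intercept stands in for whatever
    variance control the paper used.
    """
    y = np.asarray(y, dtype=float)
    covariate = np.asarray(covariate, dtype=float)
    design = np.column_stack([np.ones(len(covariate)), covariate])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef

if __name__ == "__main__":
    # Synthetic stand-ins for per-feature measurements (not real data).
    rng = np.random.default_rng(0)
    variance = rng.uniform(0.0, 1.0, size=500)
    overlap = rng.uniform(0.0, 1.0, size=500)
    persistence = 0.5 * variance + 0.3 * overlap + rng.normal(0, 0.1, 500)
    rho = spearman(overlap, residualize(persistence, variance))
    print(f"Spearman rho on residualized persistence: {rho:.3f}")
```

Residualizing first matters because a related finding on this page reports that variance explained is itself correlated with emotionality scores, so a raw correlation would mix the two effects.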

Source paper

extracted_from

Persistence and Introspection of Emotion Features

Scott Sauers · Imago · Janus · Antra Tessera

Neighborhood — ranked by edge-count

Claims (1)

claim

SAE features that the model self-describes as more emotional tend to be more persistent than variance-matched SAE features.
supports
Novel finding that agentic self-evaluation of emotionality correlates with feature persistence

Findings (1)

finding

SAE feature emotion subspace overlap correlates with persistence in Cogito: Spearman +0.413, p=4.4e-196
restates
Demonstrates that SAE features more aligned with the emotion subspace are more persistent in Cogito after variance control

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Emotion probe persistence correlation of 0.214 in Cogito v2.1 vs 0.099 for random vectors · finding · 0.830
Quantifies emotion feature persistence above random baseline in Cogito across 240 multi-turn conversations
Self-evaluated emotionality of SAE features negatively correlates with activation variance explained (ρ = -0.184, p = 4.6e-09), requiring variance correction to reveal the persistence signal. · finding · 0.829
Explains why variance correction is needed to see the self-evaluation–persistence relationship
In Cogito v2.1, average residual persistence above variance-matched probes is +0.077 (p = 1.5e-27, 157 of 171 probes positive). · finding · 0.826
Demonstrates emotion-specific persistence beyond variance effects in Cogito
Cogito emotion probe residual autocorrelation +0.077 above variance-matched controls (p=1.5e-27, 157/171 probes positive) · finding · 0.825
Demonstrates that Cogito emotion probes are persistently active beyond what is explained by their variance alone
Emotion probe persistence (token-0 to token-100 correlation) in Cogito v2.1 is 0.214, compared to 0.099 for random unit vectors in 7168D space. · finding · 0.824
Quantitative measure of emotion feature persistence vs random baseline in Cogito
Negative correlation between self-evaluated emotion persistence and SAE feature activation variance explained: rho=-0.184, p=4.6e-09 · finding · 0.824
Shows self-evaluated emotionality is negatively confounded by variance, requiring variance control to reveal the true signal
Agentic self-evaluation of SAE feature emotionality correlates with residual persistence: ρ = +0.124, p = 0.0001 in Kimi K2.5. · finding · 0.823
Shows that model self-report of emotion predicts long-range feature persistence
SAE Feature Emotion Subspace Overlap Metric · method · 0.809
Fraction of an SAE feature's length lying inside the 171-dimensional subspace spanned by emotion probes, computed via SVD orthogonalization
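The overlap metric described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes "fraction of length" means the ratio of the projected vector's Euclidean norm to the feature's full norm, and that the SVD is used only to obtain an orthonormal basis for the span of the probe vectors. The function name `emotion_subspace_overlap` and the tolerance parameter are hypothetical.

```python
import numpy as np

def emotion_subspace_overlap(feature, probes, tol=1e-10):
    """Fraction of `feature`'s norm lying in the row space of `probes`.

    feature: shape (d,) SAE feature direction.
    probes:  shape (k, d) probe vectors, one per row (need not be orthogonal).
    ASSUMPTION: overlap = ||projection|| / ||feature|| (norm, not squared norm).
    """
    feature = np.asarray(feature, dtype=float)
    probes = np.asarray(probes, dtype=float)
    # Rows of vt with non-negligible singular values form an
    # orthonormal basis for the subspace spanned by the probes.
    _, s, vt = np.linalg.svd(probes, full_matrices=False)
    basis = vt[s > tol * s.max()]
    coords = basis @ feature  # coordinates of the projection in that basis
    return float(np.linalg.norm(coords) / np.linalg.norm(feature))

if __name__ == "__main__":
    # Toy example in 3D: two independent probes plus a redundant third,
    # all spanning the x-y plane.
    probes = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0],
                       [1.0, 1.0, 0.0]])
    feature = np.array([3.0, 0.0, 4.0])
    print(emotion_subspace_overlap(feature, probes))
```

Dropping near-zero singular values keeps linearly dependent probes from inflating the basis, so the result stays between 0 (feature orthogonal to every probe) and 1 (feature entirely inside the probe subspace).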

Restated by (1)

cosine ≥ 0.90

Other entities that say roughly the same thing. May be merge candidates or independent restatements across papers.

finding
SAE feature emotion subspace overlap correlates with persistence in Cogito: Spearman +0.413, p=4.4e-196