finding
active
finding:lower-more-central-pcs-of-emotion-feature-activations-are-more-persistent-than-higher-rank-noisier-pcs-in-both-kimi-and-cogito-above-variance-matched-baselines
Lower (more central) PCs of emotion feature activations are more persistent than higher-rank (noisier) PCs in both Kimi and Cogito, above variance-matched baselines.
Supports the claim that persistence is genuinely tied to emotion structure rather than being a measurement artifact
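To make the idea of "more persistent than a variance-matched baseline" concrete, here is a minimal sketch. It is not the paper's method: the persistence score (lag-1 autocorrelation), the synthetic signals, and the names `lag1_autocorr` and `match_variance` are all assumptions chosen for illustration. It only shows that two signals can have identical variance while differing sharply in how long their values persist from one step to the next.

```python
import random
import statistics

def lag1_autocorr(xs):
    """Lag-1 autocorrelation, used here as a stand-in persistence score (assumption)."""
    mean = statistics.fmean(xs)
    num = sum((a - mean) * (b - mean) for a, b in zip(xs, xs[1:]))
    den = sum((a - mean) ** 2 for a in xs)
    return num / den

def match_variance(xs, target):
    """Rescale xs around its mean so its population variance equals that of target."""
    scale = (statistics.pvariance(target) / statistics.pvariance(xs)) ** 0.5
    mean = statistics.fmean(xs)
    return [mean + scale * (x - mean) for x in xs]

rng = random.Random(0)

# Hypothetical "central PC" score per step: a slowly drifting AR(1) signal.
central = [0.0]
for _ in range(499):
    central.append(0.9 * central[-1] + rng.gauss(0, 1))

# Hypothetical baseline: white noise rescaled to the same variance.
baseline = match_variance([rng.gauss(0, 1) for _ in range(500)], central)

print("central persistence: ", round(lag1_autocorr(central), 3))
print("baseline persistence:", round(lag1_autocorr(baseline), 3))
```

Because the baseline is rescaled to the same variance, any gap between the two scores reflects temporal structure rather than signal magnitude, which is the kind of control the finding refers to.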
Source paper
extracted_from: Scott Sauers · Imago · Janus · Antra Tessera
Neighborhood — ranked by edge-count
Claims (1)
claim
- Core empirical claim distinguishing emotion persistence from generic high-variance probe persistence
Hypotheses (1)
hypothesis
- Falsifiability test built into the PC analysis design
Findings (1)
finding
- Rules out that persistence is an artifact of probe construction, since noise dimensions are not similarly persistent
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Rules out measurement artifact explanation for the persistence finding
- Emotion probe persistence correlation of 0.214 in Cogito v2.1 vs 0.099 for random vectors · finding · 0.802 · Quantifies emotion feature persistence above random baseline in Cogito across 240 multi-turn conversations
- Quantitative measure of emotion feature persistence vs random baseline in Cogito
- Demonstrates that Cogito emotion probes are persistently active beyond what is explained by their variance alone
- Central interpretive claim of the paper supported by multiple convergent analyses
- SAE feature emotion subspace overlap correlates with persistence in Cogito: Spearman +0.413, p=4.4e-196 · finding · 0.780 · Demonstrates that SAE features more aligned with the emotion subspace are more persistent in Cogito after variance control
- Strong positive relationship between emotion alignment and SAE feature persistence in Cogito
- Shows that interpretability correlates with activation strength; most of the model effect comes from high activations
Restated by (1)
cosine ≥ 0.90
Other entities that say roughly the same thing. May be merge candidates or independent restatements across papers.