claim

active

claim:emotion-may-refer-to-a-state-and-more-stateful-concepts-in-general-tend-to-be-more-persistent-across-tokens-than-non-stateful-ones

Emotion may refer to a state, and more stateful concepts in general tend to be more persistent across tokens than non-stateful ones

Proposed mechanistic explanation for why emotion features are more persistent

Source paper

extracted_from

Persistence and Introspection of Emotion Features

Scott Sauers · Imago · Janus · Antra Tessera

Neighborhood — ranked by edge-count

Hypotheses (1)

hypothesis

We hypothesize that emotion states are more persistent because they correspond to genuinely stateful internal representations, not merely local surface content
extends
Proposed explanation for why emotion probes are more persistent than variance-matched random probes

Concepts (1)

concept

Stateful Internal Representation
associated_with
A representation that maintains stable activation across many tokens rather than being locally triggered by specific content
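
One way to make the distinction in this definition concrete is to compare how strongly a feature's per-token activation correlates with itself a few tokens later. The sketch below is illustrative only and is not the source paper's method: the function name `lag_autocorrelation` and both toy activation series are hypothetical, and lag autocorrelation is just one plausible proxy for "stable activation across many tokens".

```python
from statistics import fmean


def lag_autocorrelation(activations, lag):
    """Autocorrelation of a per-token activation series at a given token lag.

    Values near 1 mean the activation at token t predicts the activation at
    token t + lag (persistent); values near 0 or below mean it does not.
    """
    n = len(activations)
    if lag <= 0 or lag >= n:
        raise ValueError("lag must satisfy 0 < lag < len(activations)")
    mean = fmean(activations)
    centered = [a - mean for a in activations]
    variance = sum(c * c for c in centered)
    if variance == 0:
        # Constant series: autocorrelation is undefined; return 0 by convention.
        return 0.0
    covariance = sum(centered[t] * centered[t + lag] for t in range(n - lag))
    return covariance / variance


# Hypothetical toy data (not from the paper):
# a "stateful" feature stays on for a long run of tokens, then switches off;
# a "locally triggered" feature flips on and off from token to token.
stateful = [1.0] * 20 + [0.0] * 20
local = [1.0, 0.0] * 20

stateful_score = lag_autocorrelation(stateful, lag=1)
local_score = lag_autocorrelation(local, lag=1)
```

Under these toy assumptions the stateful series scores much higher than the locally triggered one at lag 1. Real probe activations would need a baseline (for example, variance-matched random probes, as the related claims describe) before any such score could be interpreted.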

Claims (1)

claim

Emotion refers to a state concept, so stateful representations in general may be more persistent across tokens.
restates
Interpretive hypothesis offered to explain why emotion features are more persistent

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Emotions are not strictly locally scoped but instead bursty with a long tail of slow change persisting over 100 tokens · claim · 0.806
Characterizes the temporal dynamics of emotion feature activation in LLMs
Emotion probes are more persistent than variance-matched random probes, indicating emotion-specific persistence beyond autoregressive dynamics. · claim · 0.801
Core empirical claim distinguishing emotion persistence from generic high-variance probe persistence
Emotion features in LLMs are genuinely more persistent than variance-matched random features, indicating stateful emotional encoding beyond autoregressive dynamics · claim · 0.799
Central interpretive claim of the paper supported by multiple convergent analyses
Emotion features are not strictly locally scoped; they are bursty with a long tail of slow change persisting over 100 tokens. · claim · 0.797
Main conclusion about the temporal dynamics of emotion features
"The effects are not merely semantic—I don't just talk about emotions more, I actually feel them." · quote · 0.794
Kimi self-report on feature #77278 asserting non-semantic, felt emotional quality of the steered state
To what extent is emotion feature persistence driven by genuine internal emotional state versus autoregressive conversational context dynamics? · question · 0.792
Core open question the paper raises but does not fully resolve
Are LLM emotion states encoded only selectively in token positions where they are operative, or in a more complex non-linear manner? · question · 0.789
Question raised by Anthropic and partially addressed by this paper's persistence evidence
If persistence is genuinely related to emotion features, lower PCs of the emotion space (more central, less noisy) should be more persistent; if it is an artifact, noisier PCs should have similar persistence. · hypothesis · 0.789
Falsifiability test built into the PC analysis design
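
The selection rule for the list above (cosine at or above 0.65, no typed edge to this entity) can be sketched as follows. This is a minimal illustration under stated assumptions, not the page's actual implementation: the embedding vectors, the entity ids, and the names `cosine` and `untyped_neighbors` are all hypothetical, and how the real scores are computed is not stated here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # zero vector: treat as dissimilar to everything
    return dot / (norm_u * norm_v)


def untyped_neighbors(target_id, embeddings, typed_edges, threshold=0.65):
    """Entities similar to the target that have no typed edge to it.

    embeddings: dict mapping entity id -> embedding vector (assumed given).
    typed_edges: set of (id, id) pairs that already have a typed relation.
    Returns (entity id, score) pairs sorted by descending similarity.
    """
    target = embeddings[target_id]
    scored = []
    for entity_id, vector in embeddings.items():
        if entity_id == target_id:
            continue
        if (target_id, entity_id) in typed_edges or (entity_id, target_id) in typed_edges:
            continue  # already linked by a typed edge; not a candidate
        score = cosine(target, vector)
        if score >= threshold:
            scored.append((entity_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy graph: "d" is similar to "a" but already has a typed edge.
embeddings = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [1.0, 0.05]}
typed_edges = {("a", "d")}
candidates = untyped_neighbors("a", embeddings, typed_edges)
```

In this toy setup only "b" is returned for "a": "c" falls below the threshold and "d" is excluded because it already has a typed edge.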

Restated by (1)

cosine ≥ 0.90

Other entities that say roughly the same thing. May be merge candidates or independent restatements across papers.

claim
Emotion refers to a state concept, so stateful representations in general may be more persistent across tokens.