question

active

question:mechanism-by-which-activation-of-an-emotion-feature-sometimes-leads-to-later-suppression-of-that-same-feature

Mechanism by which activation of an emotion feature sometimes leads to later suppression of that same feature

Identified research gap: the paper observes anti-persistence but offers no explanation for it.

Source paper

extracted_from

Persistence and Introspection of Emotion Features

Scott Sauers · Imago · Janus · Antra Tessera

Neighborhood — ranked by edge-count

Papers (1)

paper

Persistence and Introspection of Emotion Features
associated_with

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.
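The selection rule described above (cosine ≥ 0.65, no typed edge) can be sketched roughly as follows. This is only an illustration: the function names, the embedding dictionary, and the representation of typed edges as a set of unordered pairs are assumptions, not details taken from the page or the system that produced it.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def untyped_neighbors(target_id, embeddings, typed_edges, threshold=0.65):
    """Return (entity_id, similarity) pairs at or above the threshold
    that have no typed edge to the target, most similar first.

    embeddings:  hypothetical dict mapping entity id -> embedding vector
    typed_edges: hypothetical set of frozenset({id_a, id_b}) pairs
    """
    target_vec = embeddings[target_id]
    results = []
    for other_id, vec in embeddings.items():
        if other_id == target_id:
            continue  # skip the entity itself
        if frozenset((target_id, other_id)) in typed_edges:
            continue  # already connected by a typed relation
        sim = cosine(target_vec, vec)
        if sim >= threshold:
            results.append((other_id, sim))
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results
```

Under these assumptions, an entity that is highly similar but already linked by a typed edge (such as the source paper) would be filtered out, which matches the page listing the paper under "Papers" rather than under similarity.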

Why does activation of an emotion feature sometimes lead to its later suppression? · question · 0.927
Open mechanistic question arising from the causal steering experiment
Emotion may refer to a state, and more stateful concepts in general tend to be more persistent across tokens than non-stateful ones · claim · 0.774
Proposed mechanistic explanation for why emotion features are more persistent
PCA of Emotion Feature Activations · method · 0.770
PCA on 171 emotion probe activations across all tokens to produce ordered linear combinations and test if lower PCs are more persistent
Emotion features are not strictly locally scoped; they are bursty with a long tail of slow change persisting over 100 tokens. · claim · 0.768
Main conclusion about the temporal dynamics of emotion features
Anti-Persistence of Emotion Features · concept · 0.767
The phenomenon where activating an emotion feature leads to subsequent below-baseline activation of that feature
Emotion refers to a state concept, so stateful representations in general may be more persistent across tokens. · claim · 0.764
Interpretive hypothesis offered to explain why emotion features are more persistent
Emotions are not strictly locally scoped but instead bursty with a long tail of slow change persisting over 100 tokens · claim · 0.764
Characterizes the temporal dynamics of emotion feature activation in LLMs
Persistent conversational context that produced emotion-relevant activation is a plausible driver for the observed persistence results. · claim · 0.764
Acknowledged alternative explanation that the paper does not rule out
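
The anti-persistence concept listed above (activation followed by below-baseline activation of the same feature) can be made concrete with a small check on a per-token activation series. This is a minimal sketch and not the paper's procedure: the function names, the `lag` and `window` parameters, and the use of a simple mean against a scalar baseline are all assumptions made for illustration.

```python
def post_activation_offset(activations, baseline, onset, lag, window):
    """Mean activation over tokens [onset + lag, onset + lag + window)
    minus the baseline. Negative values mean the feature sits below
    baseline after the activation event."""
    start = onset + lag
    segment = activations[start:start + window]
    if not segment:
        raise ValueError("window falls outside the sequence")
    return sum(segment) / len(segment) - baseline


def is_anti_persistent(activations, baseline, onset, lag, window, margin=0.0):
    """True if the later mean activation is below baseline by more than margin."""
    offset = post_activation_offset(activations, baseline, onset, lag, window)
    return offset < -margin


# Hypothetical per-token activations of one emotion feature.
suppressed = [0.1, 0.1, 0.9, 0.8, 0.05, 0.0, 0.02, 0.1]
lingering = [0.1, 0.1, 0.9, 0.8, 0.5, 0.4, 0.3, 0.2]

flag_suppressed = is_anti_persistent(suppressed, baseline=0.1, onset=2, lag=2, window=3)
flag_lingering = is_anti_persistent(lingering, baseline=0.1, onset=2, lag=2, window=3)
```

A check like this only detects the pattern; it says nothing about the mechanism, which is the open question this entry records.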