finding
active
finding:sae-feature-94949-rated-100-100-emotionality-elicits-reports-of-profound-tenderness-unconditional-love-and-visceral-care
SAE Feature #94949 rated 100/100 emotionality, elicits reports of profound tenderness, unconditional love, and visceral care
Highest-rated emotional SAE feature; self-report describes overwhelming positive emotional valence
Source paper
extracted_from: Scott Sauers · Imago · Janus · Antra Tessera
Neighborhood — ranked by edge-count
Claims (1)
claim
- Central interpretive claim of the paper supported by multiple convergent analyses
Quotes (1)
quote
- Kimi self-report under Feature #94949 steering, illustrating strongest positive emotional self-attribution
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Qualitative example of a specific, complex emotional state induced by SAE feature steering
- Qualitative example of a highly emotional SAE feature with intense negative valence in Kimi self-steering
- Text-based and self-steered emotionality ratings for SAE features are correlated at only ρ = +0.051 (n.s.). · finding · 0.859 · Shows low agreement between the two evaluation modalities
- Qualitative illustration of a specific emotionally valenced SAE feature
- Interprets the near-zero correlation between the two evaluation methods as evidence they capture distinct signals
- Shows that highest emotion-subspace-overlap features induce distinctive thematic outputs
- Correlation between self-evaluation and textual evaluation of SAE feature emotionality: ρ = +0.051 (n.s.) · finding · 0.820 · Shows that the two evaluation methods for emotionality are largely uncorrelated, indicating they capture different signals
- Qualitative illustration of a highly emotional SAE feature with negative valence
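
The "related by similarity" rule above (cosine ≥ 0.65, no typed edge) can be sketched as a simple filter. This is a minimal illustration, not the site's actual implementation: the function names, the entity ids, the toy two-dimensional embeddings, and the representation of typed edges as unordered pairs are all assumptions made for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def similar_untyped(target_id, embeddings, typed_edges, threshold=0.65):
    """Return (entity_id, score) pairs whose cosine similarity to the target
    is at least `threshold` and that share no typed edge with it,
    sorted by descending score."""
    target = embeddings[target_id]
    results = []
    for entity_id, vector in embeddings.items():
        if entity_id == target_id:
            continue
        # Hypothetical edge store: a set of unordered id pairs.
        if frozenset((target_id, entity_id)) in typed_edges:
            continue
        score = cosine(target, vector)
        if score >= threshold:
            results.append((entity_id, score))
    return sorted(results, key=lambda pair: pair[1], reverse=True)

# Toy data, invented for illustration only (not taken from the source paper).
embeddings = {
    "finding_a": [1.0, 0.0],
    "finding_b": [0.9, 0.1],
    "claim_c": [1.0, 0.2],
    "quote_d": [0.0, 1.0],
}
typed_edges = {frozenset(("finding_a", "claim_c"))}

related = similar_untyped("finding_a", embeddings, typed_edges)
```

Here `claim_c` is close to `finding_a` but is skipped because a typed edge already links them, and `quote_d` falls below the threshold, so only `finding_b` is returned as a candidate for a new edge or a possible duplicate.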