Emotion Probe Construction Method

Method for building 171 emotion probes: generate stories for each emotion, embed the story text with gemini-embedding-001, regress the Gemini embeddings out of the model's layer-40 activations, and average the residual activations per emotion.

Neighborhood — ranked by edge-count

Concepts (2)

concept

Emotion Concepts and their Function in a Large Language Model
extends
The prior Anthropic paper whose findings about emotion features in Claude this paper builds upon and extends
Residual Activation Vectors
implements
Layer-40 activations with the component explained by compressed Gemini embeddings subtracted, isolating information not driven by surface text content

Methods (2)

method

Ridge Regression Probing
uses
Ridge regression fit on top-256 PCs of Gemini embeddings to predict model layer-40 activations and compute residuals
gemini-embedding-001
uses
Used to embed story text so that surface-level semantic content can be regressed out from model activations
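The regress-out step described by the two entries above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `residualize`, the ridge strength `alpha`, and the choice to mean-center both matrices are assumptions, since the source states only that ridge regression on the top-256 PCs of the Gemini embeddings predicts layer-40 activations and that residuals are taken.

```python
import numpy as np

def residualize(acts, emb, n_pcs=256, alpha=1.0):
    """Return the part of `acts` not linearly predicted by the top PCs of `emb`.

    acts: (n_stories, d_model) model activations (layer 40 in the source).
    emb:  (n_stories, d_emb) text embeddings of the same stories.
    alpha is a hypothetical ridge strength; the source does not give one.
    """
    # Center both matrices so the regression needs no intercept term.
    acts_c = acts - acts.mean(axis=0)
    emb_c = emb - emb.mean(axis=0)

    # PCA via SVD: rows of vt are the principal directions of the embeddings.
    _, _, vt = np.linalg.svd(emb_c, full_matrices=False)
    k = min(n_pcs, vt.shape[0])
    z = emb_c @ vt[:k].T  # (n_stories, k) principal-component scores

    # Closed-form ridge solution: W = (Z^T Z + alpha I)^{-1} Z^T A
    w = np.linalg.solve(z.T @ z + alpha * np.eye(k), z.T @ acts_c)

    # Residual = centered activations minus the ridge prediction.
    return acts_c - z @ w
```

Note that the residuals returned here are those of the centered activations; whether the original method centers before fitting is not stated in the source.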

Datasets (1)

dataset

171 Emotion Probe Set
introduces
Set of 171 residual probe vectors, one per emotion concept, constructed by regressing out Gemini embedding effects from story activations
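Given residual activations and an emotion label for each story, the per-emotion averaging that yields one probe vector per emotion might look like the sketch below. The name `build_probes` and the alphabetical ordering of emotions are illustrative assumptions, not details from the source.

```python
import numpy as np

def build_probes(residuals, labels):
    """Average residual activations per emotion label.

    residuals: (n_stories, d_model) array of residual activations.
    labels:    length-n_stories sequence of emotion names.
    Returns (emotions, probes), where probes[i] is the mean residual
    over all stories labeled emotions[i].
    """
    labels = np.asarray(labels)
    emotions = sorted(set(labels.tolist()))  # hypothetical ordering choice
    probes = np.stack([residuals[labels == e].mean(axis=0) for e in emotions])
    return emotions, probes
```

With 171 distinct labels this would return a (171, d_model) matrix, matching the size of the probe set described above.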

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Empathic Observation Method · concept · 0.741
The technique of discovering essential centers by imaginatively inhabiting a culture and using one's own feelings as a measuring instrument
Emotion probes (171-emotion residual vector probes) · method · 0.726
Linear probes constructed to measure 171 emotion concepts in model activations with surface semantic content removed
Probing Methods · method · 0.723
Top-down interpretability approach studying linguistic properties at various residual stream stages; contrasted with the paper's bottom-up mechanistic approach
SVD Orthogonalization of Emotion Probes · method · 0.721
Orthogonalizes the 171 emotion probes via SVD to create an orthonormal basis for computing SAE feature subspace overlap
Ridge regression probe construction · method · 0.715
Method used to predict model activations from Gemini embeddings and compute residuals for probe construction
The relationship between persistence and self-evaluated emotionality serves as a replication of probe-based findings without shared confounds from probe construction · claim · 0.715
Claims that agentic self-evaluation provides independent convergent evidence for emotion-persistence link
mockup testing for feeling · method · 0.713
Creating physical mockups to compare which alternative produces the deepest feeling (used in the Great Hall colors, Eishin wall mockups, and molding).
Is self-evaluation successful in measuring emotion? · question · 0.709
Question addressed by testing whether self-evaluation transcripts mentioning emotion words have higher cosine similarity to corresponding probes
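
Two of the neighboring entries describe operations on the probe set: orthogonalizing the probes via SVD to obtain an orthonormal basis, and comparing activations to probes by cosine similarity. The sketch below illustrates both under stated assumptions. The function names, the rank tolerance `tol`, and the definition of overlap as the fraction of a vector's squared norm lying inside the probe subspace are hypothetical; the source does not specify how subspace overlap is computed.

```python
import numpy as np

def orthonormal_basis(probes, tol=1e-10):
    """Orthonormal rows spanning the same subspace as the rows of `probes`."""
    _, s, vt = np.linalg.svd(probes, full_matrices=False)
    # Keep only directions with non-negligible singular values.
    return vt[s > tol * s.max()]

def subspace_overlap(vec, basis):
    """Fraction of the squared norm of `vec` inside span(basis rows).

    This definition of overlap is an assumption made for illustration.
    """
    coords = basis @ vec
    return float(coords @ coords / (vec @ vec))

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

In this sketch, `subspace_overlap` would be applied to a candidate direction such as an SAE feature vector, and `cosine` to a transcript's activation vector paired with the probe for the emotion word it mentions.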