claim
active
claim:sae-features-that-the-model-self-describes-as-more-emotional-tend-to-be-more-persistent-than-variance-matched-sae-features

SAE features that the model self-describes as more emotional tend to be more persistent than variance-matched SAE features.

Novel finding: agentic self-evaluation of emotionality correlates with feature persistence.

Source paper

extracted_from
Persistence and Introspection of Emotion Features
Scott Sauers · Imago · Janus · Antra Tessera
