finding
active
finding:48-of-171-emotion-probes-individually-significant-at-token-100-post-steering

48 of 171 emotion probes individually significant at token 100 post-steering

Shows that causal steering effects persist over long ranges (still individually significant at token 100 after steering) for roughly 28% of emotion probes (48 of 171)

Source paper

extracted_from
Persistence and Introspection of Emotion Features
Scott Sauers · Imago · Janus · Antra Tessera

Neighborhood — ranked by edge-count

Claims (1)

claim

Methods (1)

method
  • Applies a 5-token steering pulse to each emotion probe and measures persistence of causal effect via contrast z-score over 200 subsequent tokens
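
The measurement described above could be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the entry does not define the contrast z-score, so the sketch assumes it is the mean paired difference between steered and baseline probe readings divided by its standard error across trials. The function names, the input layout, and the 1.96 significance threshold are all hypothetical.

```python
import math
import statistics


def contrast_z(steered, baseline):
    """Per-token z-score of the paired contrast (steered minus baseline).

    ASSUMPTION: the entry does not define "contrast z-score"; here it is
    the mean paired difference divided by its standard error over trials.
    steered, baseline: lists of trials, each a list of probe readings,
    one reading per token position after the steering pulse.
    Requires at least two trials and nonzero variance at every token.
    """
    n_trials = len(steered)
    n_tokens = len(steered[0])
    z_scores = []
    for t in range(n_tokens):
        diffs = [steered[i][t] - baseline[i][t] for i in range(n_trials)]
        sem = statistics.stdev(diffs) / math.sqrt(n_trials)
        z_scores.append(statistics.mean(diffs) / sem)
    return z_scores


def count_significant(z_by_probe, token, threshold=1.96):
    """Count probes whose |z| at the given token index exceeds threshold.

    ASSUMPTION: 1.96 (two-sided, 5% level) is an illustrative cutoff; the
    entry does not state which significance criterion was used.
    """
    return sum(1 for z in z_by_probe if abs(z[token]) > threshold)
```

With per-probe z-score traces over the 200 post-pulse tokens, calling `count_significant(z_by_probe, token=100)` would give a count analogous to the "48 of 171" figure. Whether token 100 corresponds to index 100 or 99 depends on the indexing convention, which the entry does not specify.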
