method
active
method:5-token-steering-pulse-experiment

5-Token Steering Pulse Experiment

Applies a 5-token steering pulse for each emotion probe and measures how long the causal effect persists, using a contrast z-score over the 200 subsequent tokens.

Neighborhood — ranked by edge-count

Findings (2)

finding

Concepts (1)

concept
  • The key-value cache from the steered tokens is retained during the unsteered continuation, allowing the causal effect of steering to propagate to later tokens
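
The cache-retention idea can be illustrated with a toy single-head attention layer. This is a minimal sketch, not the experiment's model: the hidden size, the random weights, the `steer` vector, the pulse scale, and the helper names `attend` and `run` are all hypothetical. It shows only that unsteered tokens still produce different outputs when they attend to keys and values cached from steered tokens, and that the difference vanishes if the cache is discarded after the pulse.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16                      # toy hidden size (hypothetical)
n_pulse, n_cont = 5, 8      # 5 steered tokens, then a short unsteered continuation

# Random toy projections for one attention head (assumption, not a real model).
W_q, W_k, W_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
steer = rng.normal(size=d)  # stand-in for an emotion probe direction

def attend(x, k_cache, v_cache):
    """Attend from one new token over all cached keys/values plus its own."""
    k_cache.append(x @ W_k)
    v_cache.append(x @ W_v)
    K, V = np.stack(k_cache), np.stack(v_cache)
    scores = K @ (x @ W_q) / np.sqrt(d)
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V

def run(tokens, pulse_scale, keep_cache=True):
    """Steer only the first n_pulse tokens; optionally drop the cache afterwards."""
    k_cache, v_cache, outs = [], [], []
    for t, x in enumerate(tokens):
        if t == n_pulse and not keep_cache:
            k_cache.clear()
            v_cache.clear()
        if t < n_pulse:
            x = x + pulse_scale * steer
        outs.append(attend(x, k_cache, v_cache))
    return np.stack(outs)

tokens = rng.normal(size=(n_pulse + n_cont, d))

# Cache retained: continuation outputs differ from the unsteered run.
diff_kept = np.linalg.norm(run(tokens, 4.0) - run(tokens, 0.0), axis=1)
# Cache dropped after the pulse: continuation outputs match the unsteered run.
diff_dropped = np.linalg.norm(
    run(tokens, 4.0, keep_cache=False) - run(tokens, 0.0, keep_cache=False), axis=1
)
print(diff_kept[n_pulse:], diff_dropped[n_pulse:])
```

In this toy setting the continuation inputs are identical across runs, so any remaining difference in `diff_kept` comes solely through the retained cache.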

Methods (2)

method
  • Causal intervention: applying a 5-token steering pulse at the start of a model turn to measure downstream persistence of emotion feature activation
  • Per-(emotion, token) z-score computed as the injected emotion's activation minus the mean of the 170 other probes, contrasted against a no-steering baseline
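
One way the contrast z-score described above could be computed is sketched below. The entry gives the contrast (injected probe minus the mean of the 170 others) but not the exact normalization, so standardizing against the per-token mean and standard deviation of several no-steering runs is an assumption here. The function names, array shapes, and the synthetic data are hypothetical and exist only to make the sketch runnable.

```python
import numpy as np

def contrast(acts, injected):
    """acts: (n_tokens, n_probes). Injected probe minus the mean of all other probes."""
    others = np.delete(acts, injected, axis=1)
    return acts[:, injected] - others.mean(axis=1)

def contrast_zscore(steered_acts, baseline_acts, injected):
    """steered_acts: (n_tokens, n_probes); baseline_acts: (n_runs, n_tokens, n_probes).

    Assumed normalization: standardize the steered contrast at each token
    position by the mean and std of the contrast across no-steering runs.
    """
    c_steer = contrast(steered_acts, injected)
    c_base = np.stack([contrast(run, injected) for run in baseline_acts])
    mu = c_base.mean(axis=0)
    sigma = c_base.std(axis=0, ddof=1)
    return (c_steer - mu) / sigma

# Synthetic stand-in data: 171 probes (1 injected + 170 others), 200 tokens.
rng = np.random.default_rng(0)
n_runs, n_tokens, n_probes, injected = 32, 200, 171, 3
baseline_acts = rng.normal(size=(n_runs, n_tokens, n_probes))
steered_acts = rng.normal(size=(n_tokens, n_probes))
# Fake a decaying effect on the injected probe, purely for illustration.
steered_acts[:, injected] += 5.0 * np.exp(-np.arange(n_tokens) / 50.0)

z = contrast_zscore(steered_acts, baseline_acts, injected)
print(z.shape)
```

Subtracting the mean of the other probes removes activation shifts shared by all probes, so the z-score reflects the effect specific to the injected emotion at each token position.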

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.