Activation Addition (ActAdd)

Steering method that derives vectors from contrastive prompt pairs and adds them to first-token activations.
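A minimal sketch of the idea in NumPy, not the authors' implementation. It assumes "first-token activations" means the leading token positions of the residual stream at one chosen layer, and that both prompts of the contrastive pair are padded to the same token length. The function names `steering_vector` and `act_add`, the coefficient `coeff`, and the random arrays standing in for real model activations are all hypothetical.

```python
import numpy as np


def steering_vector(h_pos, h_neg):
    """Difference of activations for a contrastive prompt pair.

    Both arrays have shape (n_tokens, d_model); equal length is assumed.
    """
    return h_pos - h_neg


def act_add(h_user, vec, coeff=1.0):
    """Add coeff * vec to the leading token positions of h_user.

    Positions beyond the length of vec are left unchanged.
    """
    n = min(len(h_user), len(vec))
    steered = h_user.copy()
    steered[:n] += coeff * vec[:n]
    return steered


# Random arrays stand in for activations read from a real model (assumption).
rng = np.random.default_rng(0)
d_model = 8
h_pos = rng.normal(size=(2, d_model))   # e.g. activations for a "positive" prompt
h_neg = rng.normal(size=(2, d_model))   # e.g. activations for a "negative" prompt
h_user = rng.normal(size=(5, d_model))  # activations for the prompt being steered

vec = steering_vector(h_pos, h_neg)
steered = act_add(h_user, vec, coeff=4.0)
```

In a real model the addition would happen inside the forward pass at the chosen layer, so that later layers see the modified activations; the arrays here only show the arithmetic.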

Neighborhood — ranked by edge-count

paper

thinker

Alexander Matt Turner
introduces
Lead author of Activation Engineering paper; foundational for additive steering paradigm

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

ActAdd (Activation Addition) · concept · 0.955
Method by Turner et al. for real-time output control via activation engineering, cited as foundation for this paper's steering approach
Activation Addition · method · 0.871
Intervention method that adds a learned direction vector to residual stream activations to steer model behavior
Reflection Enhancement via Activation Addition · method · 0.796
Adding steering vector in forward direction to push model activations toward stronger reflective behavior.
Contrastive Activation Addition (CAA) · method · 0.779
An existing activation steering method used as comparative baseline.
Activations · concept · 0.777
Internal representations of the model on which probes operate; the method uses activations to rank datapoints.
action · concept · 0.759
Changing configuration to sample environment differently; minimizes free energy.
Base-10 addition · concept · 0.748
The generic addition mechanism that Llama-3.1-8B actually uses to compute sums before mapping back to cyclic concept space
Other-Referencing Activations · concept · 0.740
Latent model activations when processing inputs framed from another agent's perspective