concept
active
concept:activation-velocity

Activation velocity

A measure of cumulative drift in a model's internal representations across conversation turns, introduced by Das & Fioretto (2026).

Neighborhood — ranked by edge-count

Thinkers (1)

thinker
  • Saswat Das
    introduces
    Introduced the activation velocity measure of cumulative internal drift across conversation turns

Concepts (1)

concept
  • Behavioural drift in multi-turn LLM interaction; documented in prior work for persona, identity, and instruction-following

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • Activations · concept · 0.752
    Internal representations of the model on which probes operate; the method uses activations to rank datapoints.
  • Key capability: covariance pooling compresses gigabytes of activations into compact stable embeddings without large labeled datasets.
  • Causal intervention technique: edit NLA explanation, reconstruct via AR, use difference as steering vector to manipulate model behavior.
  • Activation space · concept · 0.721
    Representation space on which linear probes operate to attribute harmful behaviors to training data.
  • Activation Probing · concept · 0.719
    Technique of reading out model beliefs from internal activations before the final answer token is generated
  • Pearson correlation of feature activations across 40M tokens used to measure feature similarity and universality across models
  • Model-independent feature comparison based on correlating activation vectors across a fixed diverse dataset
  • Intervention method that adds a learned direction vector to residual stream activations to steer model behavior
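
The "related by similarity" rule above (cosine ≥ 0.65, no typed edge) can be sketched as a simple filter. This is only an illustration under stated assumptions: the function names, the dictionary of embeddings, and the set-of-pairs edge store are hypothetical and not taken from the system that produced this page; only the 0.65 threshold and the "no typed edge" condition come from the text.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def similar_without_edge(target_id, embeddings, typed_edges, threshold=0.65):
    """Return (entity_id, score) pairs at or above the threshold that have
    no typed edge to the target, highest score first.

    embeddings  : dict mapping entity id -> vector (hypothetical storage)
    typed_edges : set of frozenset({id_a, id_b}) pairs (hypothetical storage)
    """
    target = embeddings[target_id]
    hits = []
    for other_id, vec in embeddings.items():
        if other_id == target_id:
            continue  # skip the entity itself
        if frozenset((target_id, other_id)) in typed_edges:
            continue  # already linked by a typed relation
        score = cosine(target, vec)
        if score >= threshold:
            hits.append((other_id, score))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

# Toy data, invented for illustration only.
embeddings = {
    "a": [1.0, 0.0],
    "b": [1.0, 0.1],
    "c": [0.0, 1.0],
    "d": [1.0, 0.2],
}
typed_edges = {frozenset(("a", "d"))}
result = similar_without_edge("a", embeddings, typed_edges)
```

With this toy data, "d" is excluded because it already has a typed edge to "a", and "c" is excluded because it falls below the threshold, which leaves "b" as the only candidate for a new edge or an unrecognized duplicate.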