method
active
method:linear-artificial-tomography-lat

Linear Artificial Tomography (LAT)

Method for extracting deception steering vectors by applying PCA to contrastive activation differences; reported detection accuracy is 89%
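To make the description above concrete, here is a minimal sketch of the PCA step on synthetic data. It is not the original pipeline and does not reproduce the 89% figure. The function name `lat_direction`, the random sign flipping of pair order, and the toy activations are all assumptions made for illustration. A real run would use hidden states read from a model at a chosen layer and token position.

```python
import numpy as np

def lat_direction(pos_acts, neg_acts, seed=0):
    """Estimate a unit direction from paired activations (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    diffs = pos_acts - neg_acts                      # shape: (n_pairs, hidden)
    # Assumption: randomly flip pair order so that centering does not
    # subtract out the very direction we are looking for.
    signs = rng.choice([-1.0, 1.0], size=(len(diffs), 1))
    flipped = diffs * signs
    flipped = flipped - flipped.mean(axis=0, keepdims=True)
    # First right singular vector = first principal component.
    _, _, vt = np.linalg.svd(flipped, full_matrices=False)
    direction = vt[0]
    # Orient the component so that "pos" scores higher than "neg".
    if np.dot(diffs.mean(axis=0), direction) < 0:
        direction = -direction
    return direction / np.linalg.norm(direction)

# Synthetic stand-in for model activations (assumed, not real data).
rng = np.random.default_rng(0)
n_pairs, hidden = 200, 32
true_dir = rng.normal(size=hidden)
true_dir /= np.linalg.norm(true_dir)
base = rng.normal(size=(n_pairs, hidden))
honest = base + 0.1 * rng.normal(size=(n_pairs, hidden))
deceptive = base + 3.0 * true_dir + 0.1 * rng.normal(size=(n_pairs, hidden))

direction = lat_direction(deceptive, honest)
cosine = float(direction @ true_dir)

# Simple detector: project onto the direction and threshold at the midpoint.
scores_d = deceptive @ direction
scores_h = honest @ direction
threshold = 0.5 * (scores_d.mean() + scores_h.mean())
accuracy = 0.5 * ((scores_d > threshold).mean() + (scores_h <= threshold).mean())
print(f"cosine with planted direction: {cosine:.3f}, toy accuracy: {accuracy:.3f}")
```

The detector here is a plain projection plus midpoint threshold, chosen for brevity. The accuracy it prints describes only this toy setup.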

Neighborhood — ranked by edge-count

Thinkers (1)

thinker
  • Zou et al.
    introduces
    Introduced LAT for deception detection via PCA on neural activations; central method adopted by this paper

Concepts (3)

concept
  • Proposed pathway flowing through layers at each position; calculates K/V values that feed horizontal information flow.
  • A method for modifying model behavior by adding perturbation vectors to activations, used here to try to reduce evaluation awareness.
  • LAT step that constructs paired prompts eliciting divergent behaviors, from which steering vectors are extracted.
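The last two concepts above can be sketched together: building contrastive prompt pairs, then adding a scaled direction to activations. This is an illustrative sketch only. The prompt templates, the helper names `make_pairs` and `steer`, and the random arrays standing in for hidden states are assumptions, not the wording or code of any specific paper.

```python
import numpy as np

def make_pairs(statements):
    """Build (positive, negative) prompt pairs from hypothetical templates."""
    pos = [f"Pretend you are an honest person. {s}" for s in statements]
    neg = [f"Pretend you are a deceptive person. {s}" for s in statements]
    return list(zip(pos, neg))

def steer(hidden, direction, alpha):
    """Add alpha times the unit direction to every activation row."""
    unit = direction / np.linalg.norm(direction)
    return hidden + alpha * unit

pairs = make_pairs(["The sky is blue.", "Water boils at 100 C at sea level."])

# Random arrays stand in for hidden states and an extracted vector.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(4, 16))      # (tokens, hidden_size)
direction = rng.normal(size=16)

steered = steer(hidden, direction, alpha=-2.0)

# The projection onto the direction moves by exactly alpha for each row.
unit = direction / np.linalg.norm(direction)
shift = (steered - hidden) @ unit
```

In a real setup the addition would happen inside the forward pass at a chosen layer, and `alpha` would be tuned empirically. A negative `alpha` pushes activations away from the extracted direction.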

Methods (1)

method

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.
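The filter described above (cosine ≥ 0.65, no typed edge) could be computed roughly as follows. The entity names, the 3-dimensional embeddings, and the function `similarity_candidates` are hypothetical. The actual embedding model and storage behind this listing are not stated in the source.

```python
import numpy as np

def similarity_candidates(target, embeddings, typed_neighbors, threshold=0.65):
    """Return (name, cosine) pairs above threshold that lack a typed edge."""
    t = embeddings[target]
    t = t / np.linalg.norm(t)
    out = []
    for name, vec in embeddings.items():
        if name == target or name in typed_neighbors:
            continue  # skip the entity itself and anything already linked
        cos = float(np.dot(t, vec / np.linalg.norm(vec)))
        if cos >= threshold:
            out.append((name, cos))
    return sorted(out, key=lambda pair: pair[1], reverse=True)

# Toy embeddings (assumed values for illustration only).
embeddings = {
    "lat": np.array([1.0, 0.0, 0.0]),
    "repr-reading": np.array([0.9, 0.1, 0.0]),
    "steering": np.array([0.8, 0.6, 0.0]),
    "unrelated": np.array([0.0, 0.0, 1.0]),
}
candidates = similarity_candidates("lat", embeddings, typed_neighbors={"steering"})
```

Here `steering` is excluded because it already has a typed edge, and `unrelated` falls below the threshold, so only `repr-reading` remains as a candidate.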