concept
active
concept:contrastive-pairs
Contrastive Pairs
Pairs of prompts at different reflection levels used to compute steering vectors.
Neighborhood — ranked by edge-count
Papers (1)
paper
Methods (1)
method
- Method for computing steering vectors as mean activation differences between reflection levels at a given layer.
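
The mean-difference computation described above can be sketched as follows. This is a minimal illustration, not the method's actual implementation: it assumes the activations at the chosen layer have already been collected into two arrays of shape `(n_prompts, hidden_dim)`, one per reflection level. The function name `steering_vector` and the synthetic toy data are hypothetical, and the card does not specify how activations are extracted from a model.

```python
import numpy as np


def steering_vector(acts_high, acts_low):
    """Mean activation difference between two reflection levels at one layer.

    acts_high, acts_low: arrays of shape (n_prompts, hidden_dim) holding the
    layer activations for the higher- and lower-reflection prompts.
    (Hypothetical helper; the names are illustrative assumptions.)
    """
    acts_high = np.asarray(acts_high, dtype=float)
    acts_low = np.asarray(acts_low, dtype=float)
    if acts_high.ndim != 2 or acts_low.ndim != 2:
        raise ValueError("expected arrays of shape (n_prompts, hidden_dim)")
    if acts_high.shape[1] != acts_low.shape[1]:
        raise ValueError("hidden dimensions must match")
    # Average over prompts within each level, then subtract the means.
    return acts_high.mean(axis=0) - acts_low.mean(axis=0)


# Toy data standing in for real activations (assumption: synthetic only).
rng = np.random.default_rng(0)
hidden_dim = 8
true_direction = np.zeros(hidden_dim)
true_direction[0] = 2.0  # the two levels differ only along dimension 0

acts_low = rng.normal(size=(16, hidden_dim))
acts_high = acts_low + true_direction  # each pair differs by the same offset

vec = steering_vector(acts_high, acts_low)
```

Because every toy pair differs by the same offset, the recovered `vec` matches `true_direction` up to floating-point error. With real activations the difference of means would only estimate the direction separating the two reflection levels.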
Concepts (1)
concept
- Contrast Pairs · related_to · Pairs of statements with opposite truth values used as input to CCS; e.g., cities and neg_cities paired statements
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- The property that living structures contain intense contrast—far more than one imagines helpful; true opposites which annihilate each other when superimposed, creating differentiation that gives birth to something; contrast unifies rather than separates when used correctly
- Method comparing brain activity in conscious vs. unconscious conditions.
- Experimental design where injection strengths are swapped between sentences in two parts of each trial to cancel positional preferences
- Supervised learning framework where system learns by observing contrast between current response and nudged improved response; requires weak additional forces from supervisor
- LAT methodology step constructing paired prompts that elicit divergent behaviors to extract steering vectors
- Technique for obtaining concept vectors by presenting model with two scenarios differing in one respect and subtracting activations to isolate conceptual difference.
- Unsupervised probing method from Burns et al. 2023 that identifies directions along which contrast pair representations are far apart
- The color property that colors are arranged in a spatial sequence of interacting pairs (like a chain of arrows), creating a gradient that points to and intensifies the main center.