method
active
method:contrastive-feature-retrieval-pipeline
Contrastive Feature Retrieval Pipeline
A pipeline employing controlled semantic oppositions to distill monosemantic functional features from sparse activation spaces.
Neighborhood — ranked by edge-count
Papers (1)
paper
Frameworks (1)
framework
- The main framework proposed for retrieving and steering high-order semantic features in LLMs via sparse autoencoders.
Concepts (1)
concept
- Technique using semantically opposite prompts to contrast and identify features.
Methods (2)
method
- Validation method that uses text generation to confirm semantic control.
- Component of the contrastive retrieval pipeline analyzing activation statistics.
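The activation-statistics component listed above can be illustrated with a small sketch. This page does not say which statistics the pipeline computes, so the example assumes the simplest one: the gap in mean activation per feature between two semantically opposed prompt sets. The function name `rank_contrastive_features` and the toy activations are hypothetical and are not taken from the linked paper.

```python
from statistics import mean


def rank_contrastive_features(pos_acts, neg_acts, top_k=3):
    """Rank features by the gap in mean activation between two prompt sets.

    pos_acts / neg_acts: lists of per-prompt activation vectors, all of the
    same length (one entry per sparse-autoencoder feature).
    Returns (feature_index, gap) pairs, largest positive gap first.
    Assumption: mean difference is the statistic of interest.
    """
    n_features = len(pos_acts[0])
    gaps = []
    for j in range(n_features):
        pos_mean = mean(row[j] for row in pos_acts)
        neg_mean = mean(row[j] for row in neg_acts)
        gaps.append((j, pos_mean - neg_mean))
    # Features that fire mostly on the "positive" side sort to the front.
    gaps.sort(key=lambda item: item[1], reverse=True)
    return gaps[:top_k]


# Toy data: 3 prompts per side, 4 hypothetical features.
pos = [[0.0, 2.0, 0.1, 0.0], [0.0, 1.8, 0.0, 0.5], [0.0, 2.2, 0.2, 0.0]]
neg = [[0.0, 0.1, 0.1, 0.6], [0.0, 0.0, 0.0, 0.4], [0.0, 0.2, 0.2, 0.5]]
top = rank_contrastive_features(pos, neg, top_k=2)
```

In this toy data, feature 1 is the only one that fires strongly on the positive prompts and barely on the negative ones, so it ranks first. A real pipeline would likely add further filters (for example activation frequency) before treating a feature as monosemantic.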
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Method for obtaining concept vectors by subtracting activations from two contrasting prompts.
- Unsupervised probing method from Burns et al. (2023) that identifies directions along which contrast-pair representations are far apart.
- Method comparing brain activity in conscious vs. unconscious conditions.
- Supervised learning framework in which the system learns by observing the contrast between its current response and a nudged, improved response; requires only weak additional forces from the supervisor.
- The property that living structures contain intense contrast, far more than one would imagine helpful: true opposites that annihilate each other when superimposed, creating the differentiation that gives birth to something. Used correctly, contrast unifies rather than separates.
- Probe construction method: the concept vector at each layer is the L2-normalized difference between the mean positive and mean negative representations obtained from contrastive system prompts.
- UCCT interprets RAG as an anchoring variant that raises effective cohesion ρd.
- Property that features activate on only a small fraction of inputs; this enables compressed sensing and is what allows superposition to work.
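The probe-construction entry in the list above (an L2-normalized difference of mean positive and mean negative representations) is simple enough to sketch directly. The snippet below is a minimal illustration for a single layer, using plain Python lists in place of real model activations; the names `concept_vector` and `project` and the toy data are assumptions made for the example.

```python
import math


def concept_vector(pos_reps, neg_reps):
    """L2-normalized difference between mean positive and mean negative vectors.

    pos_reps / neg_reps: lists of equal-length representation vectors
    collected under two contrasting prompts (hypothetical inputs).
    """
    dim = len(pos_reps[0])
    pos_mean = [sum(r[i] for r in pos_reps) / len(pos_reps) for i in range(dim)]
    neg_mean = [sum(r[i] for r in neg_reps) / len(neg_reps) for i in range(dim)]
    diff = [p - n for p, n in zip(pos_mean, neg_mean)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("positive and negative means are identical")
    return [d / norm for d in diff]


def project(rep, direction):
    """Scalar projection of one representation onto the concept direction."""
    return sum(r * d for r, d in zip(rep, direction))


# Toy 2-dimensional representations.
pos = [[3.0, 0.0], [5.0, 0.0]]   # mean is [4, 0]
neg = [[0.0, 3.0], [0.0, 3.0]]   # mean is [0, 3]
v = concept_vector(pos, neg)     # difference [4, -3] has norm 5
score = project([1.0, 0.0], v)
```

Skipping the normalization step gives the plain activation-subtraction variant described in the first entry of the same list; normalizing makes projection scores comparable across layers whose activations differ in scale.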