method
active
method:input-embedding-similarity-baseline
Input Embedding Similarity Baseline
Baseline method for instruction discovery using surface-level input embedding similarity instead of steering vectors.
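A minimal sketch of what such a baseline could look like, assuming candidates are ranked by cosine similarity between their input-embedding rows and the mean input embedding of a seed instruction. The function name, the mean-pooled query, and the toy embedding matrix are illustrative assumptions, not details taken from the source.

```python
import numpy as np

def rank_by_input_embedding_similarity(embedding_matrix, seed_ids, candidate_ids, top_k=5):
    """Rank candidate token ids by cosine similarity to a seed instruction.

    Hypothetical helper: the query is the mean of the seed tokens' input
    embeddings (an assumption), and no steering vector is involved.
    """
    E = np.asarray(embedding_matrix, dtype=float)
    query = E[list(seed_ids)].mean(axis=0)      # mean-pooled seed embedding
    cands = E[list(candidate_ids)]              # one row per candidate token
    eps = 1e-12                                 # guards against zero-norm rows
    sims = cands @ query / (np.linalg.norm(cands, axis=1) * np.linalg.norm(query) + eps)
    order = np.argsort(-sims)[:top_k]           # highest similarity first
    return [(int(candidate_ids[i]), float(sims[i])) for i in order]

# Toy 2-d embedding table; real input embeddings would come from a model.
toy_embeddings = np.array([
    [1.0, 0.0],    # id 0: seed token
    [0.0, 1.0],    # id 1: orthogonal to the seed
    [1.0, 0.1],    # id 2: nearly parallel to the seed
    [-1.0, 0.0],   # id 3: opposite direction
])
ranked = rank_by_input_embedding_similarity(toy_embeddings, seed_ids=[0], candidate_ids=[1, 2, 3])
```

Because the score depends only on the embedding table, candidates that look lexically close to the seed rank highest regardless of the behavior they induce, which is the limitation this baseline is meant to expose.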
Neighborhood — ranked by edge-count
Methods (1)
method
- Cosine Similarity Ranking for Instruction Discovery (associated_with): Method to discover new reflection-inducing instructions by ranking candidate tokens by cosine similarity to steering vectors.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Demonstrates the failure mode of surface-level similarity for instruction discovery.
- Demonstrates that surface-level embedding similarity fails to capture reflective semantics.
- The specific type of representation studied in the paper: function f: X→R^n assigning feature vectors to inputs
- A reinforcing interlock between different materials, mentioned alongside Deep Interlock in West Dean construction.
- Model-independent feature comparison based on correlating activation vectors across a fixed diverse dataset
- Core theoretical claim about the target of representation learning
- Extracting embeddings from instruction and example spans.
- Similarity measured with respect to network behavior/function rather than statistical correlation of activations.