method
active
method:pairwise-cosine-similarity-analysis
Pairwise Cosine Similarity Analysis
Used to quantify the semantic clustering of adjective-set embeddings across model families and conditions
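As a rough illustration of what quantifying clustering with pairwise cosine similarity can look like, the sketch below normalizes a set of embedding vectors, computes the full similarity matrix, and averages the off-diagonal entries as a single tightness score. The function names (`pairwise_cosine`, `mean_offdiag`) and the random toy vectors are assumptions made for illustration; the source does not specify the embedding model, the dimensionality, or the exact aggregation used.

```python
import numpy as np


def pairwise_cosine(embeddings: np.ndarray) -> np.ndarray:
    """Return the (n, n) cosine similarity matrix for n row vectors."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    unit = embeddings / np.clip(norms, 1e-12, None)
    return unit @ unit.T


def mean_offdiag(sim: np.ndarray) -> float:
    """Average similarity over all distinct pairs (diagonal excluded)."""
    n = sim.shape[0]
    mask = ~np.eye(n, dtype=bool)
    return float(sim[mask].mean())


# Toy data (hypothetical): one tightly clustered set and one scattered set.
rng = np.random.default_rng(0)
center = rng.normal(size=16)
tight = center + 0.1 * rng.normal(size=(4, 16))  # small noise around a shared center
loose = rng.normal(size=(4, 16))                 # unrelated random vectors

tight_score = mean_offdiag(pairwise_cosine(tight))
loose_score = mean_offdiag(pairwise_cosine(loose))
print(f"tight: {tight_score:.3f}  loose: {loose_score:.3f}")
```

A higher mean off-diagonal similarity indicates tighter clustering, so comparing this score across conditions or model families is one plausible way to express the convergence described here.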
Neighborhood — ranked by edge-count
Concepts (1)
concept
- Cross-Model Semantic Convergence (about, implements): The tighter clustering of experience-report embeddings across independently trained model families under self-referential processing
Methods (2)
method
- Cosine Similarity Measurement (related_to): Used to measure alignment between DIM direction and cone basis vectors to assess overlap
- Embedding model used to compute vector representations of adjective sets for cosine similarity analysis in Experiment 3
Artifacts (1)
artifact
- Key paper finding structured first-person descriptions in LLMs claiming awareness or subjective experience during self-referential processing.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Classifier using cosine similarity between activation vectors and steering vectors to detect deception with 89% accuracy
- Method to discover new reflection-inducing instructions by ranking candidate tokens by cosine similarity to steering vectors.
- Geometric evaluation of truth direction alignment across layers and prompt templates.
- Cosine similarity between feature activations restricted to tokens where one of the features fires; used to identify feature splitting relationships
- Detection mechanism computing cosine similarity between activation vectors and steering vectors to classify deception
- A correlational similarity method compared against MAS; uses RDM correlations between model representations.
- Identifying related features by cosine distance in SAE decoder space.
- Appendix E replication of DIM alignment finding in Qwen model