Pairwise Cosine Similarity Analysis

Used to quantify the semantic clustering of adjective-set embeddings across model families and conditions
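
As a rough illustration of the measurement described above, the sketch below computes a pairwise cosine similarity matrix over a small set of vectors and summarizes clustering as the mean off-diagonal similarity. The toy 2-D vectors stand in for real adjective-set embeddings, and the helper names `pairwise_cosine` and `mean_offdiag` are hypothetical; the source does not specify which summary statistic was actually used.

```python
import numpy as np

def pairwise_cosine(embeddings: np.ndarray) -> np.ndarray:
    """Return the n x n matrix of cosine similarities between rows."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    unit = embeddings / np.clip(norms, 1e-12, None)
    return unit @ unit.T

def mean_offdiag(sim: np.ndarray) -> float:
    """Average similarity over all distinct pairs (assumed clustering summary)."""
    n = sim.shape[0]
    mask = ~np.eye(n, dtype=bool)
    return float(sim[mask].mean())

# Toy vectors standing in for adjective-set embeddings (illustrative only).
toy = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
sim = pairwise_cosine(toy)
tightness = mean_offdiag(sim)
```

A higher mean off-diagonal value indicates tighter clustering, so comparing this number between conditions is one plausible way to read "tighter clustering" in the description above.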

Neighborhood — ranked by edge-count

concept

Cross-Model Semantic Convergence
about · implements
The tighter clustering of experience-report embeddings across independently trained model families under self-referential processing

method

Cosine Similarity Measurement
related_to
Used to measure alignment between DIM direction and cone basis vectors to assess overlap
text-embedding-3-large
uses
Embedding model used to compute vector representations of adjective sets for cosine similarity analysis in Experiment 3

artifact

Large Language Models Report Subjective Experience Under Self-Referential Processing
uses
Key paper reporting that LLMs produce structured first-person descriptions claiming awareness or subjective experience during self-referential processing.

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Cosine Similarity Binary Classifier · method · 0.817
Classifier using cosine similarity between activation vectors and steering vectors to detect deception with 89% accuracy
Cosine Similarity Ranking for Instruction Discovery · method · 0.803
Method to discover new reflection-inducing instructions by ranking candidate tokens by cosine similarity to steering vectors.
Cosine similarity between truth probes · method · 0.781
Geometric evaluation of truth direction alignment across layers and prompt templates.
Masked Cosine Similarity · method · 0.780
Cosine similarity between feature activations restricted to tokens where one of the features fires; used to identify feature splitting relationships
Cosine Similarity-Based Deception Detection · concept · 0.774
Detection mechanism computing cosine similarity between activation vectors and steering vectors to classify deception
Representational Similarity Analysis (RSA) · framework · 0.750
A correlational similarity method compared against MAS; uses RDM correlations between model representations.
Feature neighborhood exploration via cosine similarity of decoder weights · method · 0.746
Identifying related features by cosine similarity in SAE decoder space.
In Qwen-2.5-9B, only v1 has meaningful cosine similarity to DIM direction; all additional basis vectors have cosine similarities ~1e-9 · finding · 0.730
Appendix E replication of DIM alignment finding in Qwen model
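
For the Masked Cosine Similarity entry in the list above, a minimal sketch may help distinguish it from the plain pairwise measure. It assumes per-token activations for two features, treats a feature as "firing" when its activation exceeds a threshold of 0, and masks on the first feature only. The function name `masked_cosine`, the threshold, and the choice of which feature defines the mask are all assumptions, not details taken from the source.

```python
import numpy as np

def masked_cosine(acts_a, acts_b, threshold=0.0):
    """Cosine similarity of two activation vectors, restricted to the
    tokens where feature A fires (assumed: activation > threshold)."""
    acts_a = np.asarray(acts_a, dtype=float)
    acts_b = np.asarray(acts_b, dtype=float)
    mask = acts_a > threshold
    a, b = acts_a[mask], acts_b[mask]
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # Return 0.0 when A never fires or either masked vector is all zeros.
    return float(a @ b / denom) if denom > 0 else 0.0

# Toy per-token activations (illustrative only).
a = np.array([0.0, 2.0, 0.0, 1.0])
b = np.array([5.0, 4.0, 3.0, 2.0])
a_given_a = masked_cosine(a, b)  # B is proportional to A wherever A fires
b_given_b = masked_cosine(b, a)  # B fires on every token, so no masking occurs
```

The measure is asymmetric: feature B tracks A exactly on A's tokens but also fires elsewhere, which is the kind of pattern the entry associates with feature splitting.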