PCA Analysis of Token Embeddings/Unembeddings

PCA applied to the token embedding and unembedding matrices to measure what fraction of the residual stream's dimensions they occupy and how the two subspaces relate to each other.
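As an illustration only, the analysis described above might look like the following NumPy sketch. It is not taken from the source: the matrices are synthetic stand-ins (low-rank plus noise), the helper names `pca_variance_ratio`, `dims_for_variance`, and `subspace_overlap` are hypothetical, and both matrices are assumed to be stored as (vocab_size, d_model) with one token vector per row. The 90% variance threshold and the top-k overlap measure are likewise assumptions chosen for the example.

```python
import numpy as np


def pca_variance_ratio(matrix):
    """Fraction of total variance along each principal component (descending)."""
    centered = matrix - matrix.mean(axis=0, keepdims=True)
    # Squared singular values of the centered matrix are proportional
    # to the variance captured by each principal component.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


def dims_for_variance(matrix, threshold=0.9):
    """Smallest number of principal components reaching `threshold` of variance."""
    cumulative = np.cumsum(pca_variance_ratio(matrix))
    count = int(np.searchsorted(cumulative, threshold)) + 1
    return min(count, cumulative.size)  # guard against floating-point shortfall


def subspace_overlap(a, b, k):
    """Mean squared cosine of the principal angles between the top-k PCA subspaces."""
    def top_directions(m):
        centered = m - m.mean(axis=0, keepdims=True)
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        return vt[:k]  # (k, d_model), orthonormal rows

    # Singular values of the cross-product are the cosines of the principal angles.
    cosines = np.linalg.svd(top_directions(a) @ top_directions(b).T, compute_uv=False)
    return float(np.mean(cosines ** 2))


# Synthetic stand-ins for real weights (hypothetical sizes, not from the source).
rng = np.random.default_rng(0)
vocab_size, d_model, true_rank = 2000, 256, 16


def low_rank_matrix():
    signal = rng.normal(size=(vocab_size, true_rank)) @ rng.normal(size=(true_rank, d_model))
    return signal + 0.01 * rng.normal(size=(vocab_size, d_model))


embed = low_rank_matrix()
unembed = low_rank_matrix()  # assumed (vocab_size, d_model); transpose if stored otherwise

k_embed = dims_for_variance(embed)
k_unembed = dims_for_variance(unembed)
print(f"embedding: {k_embed} of {d_model} dims reach 90% variance")
print(f"unembedding: {k_unembed} of {d_model} dims reach 90% variance")
print(f"top-{true_rank} subspace overlap: {subspace_overlap(embed, unembed, true_rank):.3f}")
```

With real model weights, `embed` and `unembed` would be replaced by the model's actual matrices. An overlap near 1 would suggest the two matrices share a subspace, while a value near k / d_model is roughly what unrelated random subspaces give.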

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

PCA analysis shows token embeddings and unembeddings are concentrated in a relatively small fraction of residual stream dimensions in large models · finding · 0.860
Supporting evidence for the claim that most residual stream dimensions are free for other layers to use
Principal components analysis (PCA) · method · 0.767
Statistical method used to analyze neural activity data.
Token embeddings · concept · 0.760
Vector representations of individual tokens from genomic foundation models; the raw inputs to sequence pooling methods.
span embedding analysis · method · 0.755
Extracting embeddings from instruction and example spans.
PCA Visualization · method · 0.724
Used to visually inspect separation of truth-related directions in model activation space across layers
PCA of Emotion Feature Activations · method · 0.713
PCA on 171 emotion probe activations across all tokens to produce ordered linear combinations and test if lower PCs are more persistent
Principal Component Analysis Visualization · method · 0.707
Used to visualize LLM true/false representations, revealing clear linear structure separating true from false statements
span embeddings extraction · method · 0.706
Obtain instruction and example span embeddings at layer L* with chosen pooling.