method
active
method:principal-components-analysis-pca
Principal components analysis (PCA)
Statistical dimensionality-reduction method used to analyze neural activity data.
Neighborhood — ranked by edge-count
Papers (1)
paper
Frameworks (2)
framework
- CausalGym (uses): Multi-task benchmark of linguistic behaviours for measuring causal efficacy of interpretability methods, adapted from SyntaxGym
- Novel construct introduced by this paper: a hypothetical graph embedded in the time series of LLM representations, where each dimension is a node and latent connections are edges.
Methods (2)
method
- Linear probes constructed to measure 171 emotion concepts in model activations with surface semantic content removed
- Method for extracting deception steering vectors via PCA on contrastive activation differences; achieves 89% detection accuracy
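The last entry above describes PCA on contrastive activation differences. A minimal sketch of that general idea follows, using synthetic data rather than real model activations. The function name `first_principal_direction`, the random sign flip per pair, and all sizes are assumptions made for illustration; this is not the cited method's implementation and it does not reproduce the reported accuracy.

```python
import numpy as np

def first_principal_direction(diffs):
    """Return the unit-norm first principal component of the rows of `diffs`."""
    centered = diffs - diffs.mean(axis=0)  # PCA operates on mean-centered data
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]

# Synthetic stand-in for activation differences (hypothetical sizes).
rng = np.random.default_rng(0)
n_pairs, d_model = 200, 16
true_direction = rng.normal(size=d_model)
true_direction /= np.linalg.norm(true_direction)

# Assumption: each pair's sign is randomized, so the differences are roughly
# zero-mean and centering does not remove the direction of interest.
signs = rng.choice([-1.0, 1.0], size=n_pairs)
noise = 0.1 * rng.normal(size=(n_pairs, d_model))
diffs = signs[:, None] * true_direction + noise

steering_vector = first_principal_direction(diffs)
# The sign of a principal component is arbitrary, so compare absolute alignment.
alignment = abs(float(steering_vector @ true_direction))
```

Because the sign of a principal component is arbitrary, a real pipeline would need a separate step to orient the vector, for example by checking which sign raises the score on known positive examples.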
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; these are candidates for new edges or unrecognized duplicates.
- Used to visualize LLM true/false representations, revealing clear linear structure separating true from false statements
- Used to visually inspect separation of truth-related directions in model activation space across layers
- Justifies PCA choice over UMAP or t-SNE for the node-structured RN model.
- PCA on 171 emotion probe activations across all tokens to produce ordered linear combinations and test if lower PCs are more persistent
- PCA applied to token embedding and unembedding matrices to understand what fraction of residual stream dimensions they occupy and how they relate
- Standardized PCA run on role vectors to find main axes of persona variation
- Method comparing brain activity in conscious vs. unconscious conditions.
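Several entries above mention standardized PCA and the fraction of dimensions a set of vectors occupies. The sketch below shows one way those two ideas might be computed with NumPy on synthetic data. The function name `standardized_pca`, the data sizes, and the two-factor toy data are illustrative assumptions and are not taken from any of the listed works.

```python
import numpy as np

def standardized_pca(x):
    """PCA on z-scored columns; returns (components, explained variance ratio)."""
    # Assumes no column is constant, otherwise the division by std would fail.
    z = (x - x.mean(axis=0)) / x.std(axis=0)
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s**2 / np.sum(s**2)  # share of total variance per component
    return vt, explained

# Toy data: 10 observed features driven by 2 latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 10))
x = latent @ mixing + 0.05 * rng.normal(size=(500, 10))

components, explained = standardized_pca(x)
# Variance share captured by the first two components.
top2 = float(explained[:2].sum())
```

Standardizing first means each feature contributes equally regardless of its scale, which matters when features are measured in different units or have very different variances.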