concept
active
concept:causally-relevant-latent-subspace
Causally Relevant Latent Subspace
A contiguous subspace of the aligned latent vector that encodes the behaviorally relevant information for a specific causal variable.
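The definition can be made concrete with a small interchange-style sketch: rotate a hidden state into aligned coordinates, overwrite one contiguous block of those coordinates with the values from a second run, and rotate back. This is only an illustration under stated assumptions. The names `random_orthogonal` and `patch_subspace` are hypothetical, the 8-dimensional vectors are random stand-ins for model hidden states, and the rotation is a random orthogonal matrix where MAS or DAS would use a learned one.

```python
import numpy as np


def random_orthogonal(dim: int, seed: int = 0) -> np.ndarray:
    """Placeholder for a learned rotation: a random orthogonal matrix via QR."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q


def patch_subspace(h_base, h_source, rotation, start, stop):
    """Swap aligned coordinates [start, stop) from the source run into the base run."""
    z_base = rotation @ h_base        # aligned latent vector of the base run
    z_source = rotation @ h_source    # aligned latent vector of the source run
    z_patched = z_base.copy()
    z_patched[start:stop] = z_source[start:stop]  # contiguous block for one causal variable
    return rotation.T @ z_patched     # rotation is orthogonal, so its transpose inverts it


# Toy data (assumption): random vectors standing in for two hidden states.
rng = np.random.default_rng(1)
h_base, h_source = rng.standard_normal(8), rng.standard_normal(8)
rotation = random_orthogonal(8)
h_patched = patch_subspace(h_base, h_source, rotation, start=0, stop=3)
```

In an actual analysis, `h_patched` would be fed back into the network, and the subspace would count as causally relevant if the output changes the way the corresponding causal variable predicts. That behavioral check is not shown here.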
Neighborhood — ranked by edge-count
Frameworks (2)
- Model Alignment Search (MAS) · associated_with · The primary contribution of the paper: a bidirectional causal method that learns rotation matrices for each model to uncover and compare causally relevant latent subspaces across neural networks.
- Distributed Alignment Search (DAS) · associated_with · Practical method by Geiger et al. for finding distributed causal abstractions using gradient descent.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- A vector subspace that causally impacts outputs only through the sign of its values, enabling harmless magnitude divergence
- Intervention targeting specific dimensional subsets of activation vectors rather than full representations
- Substrate on which causal emergence was computed across agent lifetimes; aligned with learning success.
- Reasoning approach using learnable hidden embeddings.
- Output of alignment map ϕ applied to DNN hidden states; basis for distributed causal abstraction
- Cross-fertilization claim made in discussion.
- Whether an internal direction causally controls a target behavior, verified by intervention success
- The subspace of activation space spanned by the 171 orthogonalized emotion probe vectors, used to measure SAE feature emotional alignment