Causally Relevant Latent Subspace

A contiguous subspace of the aligned latent vector that encodes the behaviorally relevant information for a specific causal variable.

Neighborhood — ranked by edge-count

framework

Model Alignment Search (MAS)
associated_with
The primary contribution of the paper: a bidirectional causal method that learns rotation matrices for each model to uncover and compare causally relevant latent subspaces across neural networks.
Distributed Alignment Search (DAS)
associated_with
Practical method by Geiger et al. for finding distributed causal abstractions using gradient descent.
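
To make the idea of a contiguous subspace of the aligned latent vector concrete, here is a minimal sketch of an interchange intervention in a rotated basis. It is not the MAS or DAS implementation: the rotation here is a fixed 2-D matrix rather than one learned by gradient descent, and the names `matvec`, `transpose`, `interchange`, and the choice `k = 1` are hypothetical, chosen only for illustration.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Transpose a matrix; for an orthogonal matrix this is also its inverse."""
    return [list(column) for column in zip(*matrix)]


def interchange(rotation, h_target, h_source, k):
    """Patch the first k aligned dimensions of h_target with those of h_source.

    Both hidden states are rotated into the aligned basis, the leading
    (contiguous) k coordinates are swapped in from the source, and the
    result is rotated back into the original basis.
    """
    z_target = matvec(rotation, h_target)
    z_source = matvec(rotation, h_source)
    z_patched = z_source[:k] + z_target[k:]
    return matvec(transpose(rotation), z_patched)


# Assumed toy setup: a 45-degree rotation stands in for a learned alignment.
theta = math.pi / 4
R = [[math.cos(theta), math.sin(theta)],
     [-math.sin(theta), math.cos(theta)]]

h_target = [1.0, 0.0]
h_source = [0.0, 2.0]
patched = interchange(R, h_target, h_source, k=1)

# In the aligned basis, the first coordinate now comes from the source
# and the second is unchanged from the target.
z_patched = matvec(R, patched)
```

In a learned setting the rotation would be trained so that patching these `k` coordinates changes the model's behavior exactly as changing the corresponding causal variable would. The sketch only shows the mechanics of the swap.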

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Behaviorally Binary Subspace · concept · 0.774
A vector subspace that causally impacts outputs only through the sign of its values, enabling harmless magnitude divergence.
Subspace Intervention · concept · 0.762
Intervention targeting specific dimensional subsets of activation vectors rather than full representations.
Latent-Space Representations · concept · 0.761
Substrate on which causal emergence was computed across agent lifetimes; aligned with learning success.
latent reasoning · concept · 0.757
Reasoning approach using learnable hidden embeddings.
Latent Variables in Distributed Abstraction · concept · 0.752
Output of alignment map ϕ applied to DNN hidden states; basis for distributed causal abstraction.
Causal emergence provides new perspectives for causal representation learning, interpreting latent variables as emergent causalities. · claim · 0.750
Cross-fertilization claim made in discussion.
Causal Mediation · concept · 0.747
Whether an internal direction causally controls a target behavior, verified by intervention success.
Emotion Subspace · concept · 0.747
The subspace of activation space spanned by the 171 orthogonalized emotion probe vectors, used to measure SAE feature emotional alignment.