concept
active
concept:natural-distribution-of-representations
Natural Distribution of Representations
The distribution of latent representations that the model produces on unperturbed inputs.
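As a rough illustration of this definition, the sketch below estimates a natural distribution from representations collected on unperturbed inputs and scores how far a new representation lies from it. Everything here is an assumption for illustration and not taken from the source: the Gaussian approximation, the synthetic stand-in activations, and the hypothetical helper names `fit_natural_distribution` and `mahalanobis_distance`.

```python
import numpy as np


def fit_natural_distribution(reps):
    """Fit a Gaussian (mean, inverse covariance) to representations
    collected from unperturbed inputs. The Gaussian form is an assumption."""
    mean = reps.mean(axis=0)
    cov = np.cov(reps, rowvar=False)
    # Small ridge term keeps the covariance invertible.
    cov = cov + 1e-6 * np.eye(reps.shape[1])
    return mean, np.linalg.inv(cov)


def mahalanobis_distance(x, mean, cov_inv):
    """Distance of one representation from the fitted distribution."""
    d = x - mean
    return float(np.sqrt(d @ cov_inv @ d))


# Synthetic stand-in for hidden activations on unperturbed inputs.
rng = np.random.default_rng(0)
natural_reps = rng.normal(size=(2000, 4))
mean, cov_inv = fit_natural_distribution(natural_reps)

# A hypothetical intervention that pushes one coordinate far off-distribution.
intervened = mean + np.array([10.0, 0.0, 0.0, 0.0])

on_distribution_score = mahalanobis_distance(mean, mean, cov_inv)
off_distribution_score = mahalanobis_distance(intervened, mean, cov_inv)
```

Under these assumptions, a representation at the fitted mean scores zero, while the intervened representation scores much higher, which is one simple way to flag a shift away from the natural distribution. Real activations are often far from Gaussian, so a density model or nearest-neighbor score may be more appropriate in practice.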
Neighborhood — ranked by edge-count
Concepts (1)
- Representational Divergence · associated_with · Core phenomenon studied: causal interventions shift internal representations away from the natural distribution
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- The central question of whether representational geometry implies corresponding computational structure
- Idea that information is spread across many neurons; superposition is a subtype.
- The idea that features are encoded as directions in activation space.
- Representations of one's own mental states; associated with consciousness in higher-order theories.
- Representations where individual neurons play multiple conceptual roles; patterns consisting of linear combinations of unit vectors.
- Probability distribution over discrete states or outcomes.
- Core contribution: the impasse where lifting linearity in alignment maps makes causal abstraction vacuous, but keeping it may miss non-linearly encoded features
- The evolution of an agent's latent representations over the course of training, shown to align with reward improvement when causal emergence is high.