concept
active
concept:representational-embedding-spaces
Representational Embedding Spaces
Internal structure of AI systems that CIMC proposes to analyze interpretively to evaluate consciousness hypotheses
Neighborhood — ranked by edge-count
Methods (1)
method
- Bricken et al.'s method for decomposing language models into interpretable features; cited as AI alignment interpretability relevant to consciousness detection
Concepts (1)
concept
- Interpretive Validation · supports · CIMC's methodology for evaluating whether a built system is conscious: combining multiple forms of evidence, including predicted functional organization and developmental trajectories
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; these are candidates for new edges or unrecognized duplicates.
- The specific type of representation studied in the paper: function f: X→R^n assigning feature vectors to inputs
- Substrate on which causal emergence was computed across agent lifetimes; aligned with learning success.
- Property of conscious representations: they do not contain information about the fact that they are representations at the level of the representation itself
- CIMC's characterization of part of the solution to the Hard Problem: insight into the structural necessities of phenomenal representation
- The evolution of an agent's latent representations over the course of training, shown to align with reward improvement when causal emergence is high.
- The central question of whether representational geometry implies corresponding computational structure
- Mathematical structure central to distributed interchange interventions; representation space decomposed into orthogonal subspaces each aligned with a high-level variable.
- One-dimensional curved surface in internal activation space; the paper demonstrates alignment with behavior manifold.
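
The "Related by similarity" filter above (cosine ≥ 0.65, no typed edge) can be sketched as follows. This is a minimal illustration, not the system's actual implementation: the function names, the toy 2-D vectors, and the entity names (`entity_a` and so on) are hypothetical, and the page does not say how the real embeddings are produced.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def related_by_similarity(target, candidates, typed_neighbors, threshold=0.65):
    """Return (name, score) pairs with score >= threshold and no typed edge,
    sorted from most to least similar."""
    hits = []
    for name, vec in candidates.items():
        if name in typed_neighbors:
            continue  # already linked by a typed edge, so not listed here
        score = cosine(target, vec)
        if score >= threshold:
            hits.append((name, score))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

# Toy data (hypothetical): 2-D vectors standing in for entity embeddings.
target = [1.0, 0.0]
candidates = {
    "entity_a": [1.0, 0.1],  # nearly parallel to the target
    "entity_b": [0.0, 1.0],  # orthogonal to the target, falls below the threshold
    "entity_c": [1.0, 1.0],  # 45 degrees from the target, cosine about 0.71
    "entity_d": [2.0, 0.0],  # parallel, but excluded because it has a typed edge
}
typed_neighbors = {"entity_d"}

result = related_by_similarity(target, candidates, typed_neighbors)
for name, score in result:
    print(f"{name}: {score:.3f}")
```

Excluding typed neighbors before scoring keeps the list focused on the stated purpose: surfacing entities that are close in embedding space but not yet linked, which are the candidates for new edges or duplicate checks.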