concept
active
concept:geometry-of-activation-space
geometry of activation space
Rich geometric structure carried by neural representations.
Neighborhood — ranked by edge-count
Concepts (1)
concept
- Activation space · related_to · Representation space on which linear probes operate to attribute harmful behaviors to training data.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- A linear combination of neurons in a layer; the general form of a neural network feature, covering both individual neurons and other combinations.
- The low-dimensional geometric structure discovered in neural activation space; contrasted with linear/Euclidean geometry.
- Spaces of model activations from which sparse features are retrieved.
- Does the geometric structure of activation space causally shape neural network behavior? · question · 0.756 · Central research question driving the work.
- A geometric space of all output token probability distributions, equipped with Hellinger distance, used to visualize model behavior.
- In LLM activations, emotion emerges early, peaks in middle layers, sharpens with scale, and persists across tokens (Zhang & Zhong 2025).
- The ensemble of all possible configurations of a building, including incomplete states and paths between them.
- The traditional space of movement in the physical world where animals exhibit problem-solving behavior.
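
The "cosine ≥ 0.65 · no typed edge" filter used for the similarity list can be sketched as follows. This is a minimal illustration, not the graph's actual implementation: the embedding vectors, the entity ids, and the function name `similar_without_edge` are all hypothetical, and only the 0.65 threshold comes from the page itself.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_without_edge(query_vec, candidates, typed_edge_ids, threshold=0.65):
    """Return (entity_id, score) pairs at or above the threshold, best first.

    Entities that already share a typed edge with the query are skipped,
    mirroring the "no typed edge" condition. All names are assumptions.
    """
    hits = []
    for entity_id, vec in candidates.items():
        if entity_id in typed_edge_ids:
            continue  # already linked by a typed relation
        score = cosine(query_vec, vec)
        if score >= threshold:
            hits.append((entity_id, score))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy 2-d embeddings, invented purely for illustration.
query = [1.0, 0.0]
candidates = {
    "a": [1.0, 0.0],  # same direction as the query
    "b": [1.0, 1.0],  # 45 degrees away from the query
    "c": [0.0, 1.0],  # orthogonal, falls below the threshold
    "d": [2.0, 0.1],  # very similar, but already has a typed edge
}
result = similar_without_edge(query, candidates, typed_edge_ids={"d"})
```

Here `result` keeps only "a" and "b": "c" falls below the threshold, and "d" is excluded because it already has a typed relation, which is why it would appear under typed edges rather than in the similarity list.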