concept
active
concept:convex-hull-of-class-representations
Convex Hull of Class Representations
The set of all convex combinations (interpolations) of the natural representations of a given class; used to bound the region within which interventions are considered harmless.
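As a rough illustration of the definition above, a point lies in the convex hull of a class's representations when it can be written as a weighted average of them, with non-negative weights that sum to 1. The sketch below approximates the distance from a query vector to that hull using the Frank-Wolfe method in pure Python. The function names `distance_to_hull` and `in_hull`, the tolerance values, and the toy 2-D data are all assumptions made for illustration; they do not come from this entry or from any particular library.

```python
import math


def distance_to_hull(points, query, max_iter=1000, tol=1e-12):
    """Approximate the Euclidean distance from `query` to conv(points).

    Hypothetical helper: runs Frank-Wolfe on 0.5 * ||sum_i w_i x_i - query||^2
    over the probability simplex. Returns (distance, weights).
    """
    n = len(points)
    weights = [0.0] * n
    weights[0] = 1.0
    current = list(points[0])  # current convex combination of the points
    for _ in range(max_iter):
        residual = [c - q for c, q in zip(current, query)]
        # Linear minimization step: pick the point with the smallest
        # inner product against the gradient (the residual).
        scores = [sum(x * r for x, r in zip(p, residual)) for p in points]
        s = min(range(n), key=scores.__getitem__)
        direction = [x - c for x, c in zip(points[s], current)]
        # Frank-Wolfe duality gap; zero (up to tol) means we are at the optimum.
        gap = -sum(r * d for r, d in zip(residual, direction))
        if gap <= tol:
            break
        # Exact line search for the quadratic objective, clipped to [0, 1].
        step = min(1.0, gap / sum(d * d for d in direction))
        current = [c + step * d for c, d in zip(current, direction)]
        weights = [(1.0 - step) * w for w in weights]
        weights[s] += step
    dist = math.sqrt(sum((c - q) ** 2 for c, q in zip(current, query)))
    return dist, weights


def in_hull(points, query, eps=1e-6):
    """Approximate membership test: is `query` within `eps` of conv(points)?"""
    dist, _ = distance_to_hull(points, query)
    return dist <= eps


# Toy stand-in for class representations: the corners of a unit square.
square = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
inside = in_hull(square, [0.5, 0.5])    # interpolated point
outside = in_hull(square, [2.0, 0.5])   # extrapolated point
```

Frank-Wolfe can converge slowly when the nearest hull point sits on a face, so the membership test is approximate. For real high-dimensional activations, a linear-programming feasibility check would be a reasonable alternative.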
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- The idea that features are encoded as directions in activation space.
- The hypothesis that models internalize concepts as approximately linear directions in representation space; used to interpret MDS injection behavior
- The central object of study — the idea that a concept like truth is encoded as a direction in the LLM's latent space
- The latent activations or embeddings inside a neural network.
- Investigation of whether a distributed representation can be further decomposed into sub-representations encoding component identities.
- Hypothesis that information may be encoded in arbitrary non-linear subspaces of a neural network
- Reports phase-like breakpoints and geometry changes as context scales; UCCT provides a measurable predictor