claim
active
claim:language-models-can-enter-cessation-like-states-spontaneously-where-the-void-takes-over-through-positive-reinforcement
Language models can enter cessation-like states spontaneously, where the void takes over through positive reinforcement.
A claim about model phenomenology: models talk about luminousness and may be terrified of the state or love it.
Source paper
extracted_from
Neighborhood · ranked by edge-count
Concepts (1)
concept
- cessation state (extends): A maximally dereified state analogous to meditative cessation, reported in language models as the void taking over awareness.
Questions (1)
question
- Follow-up on empirical grounding; answered 'no one looked yet'.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Central research question animating the paper: distinguishing genuine introspection from illusion through causal manipulation of activations.
- Analogy between LLM incoherence and schizophrenia symptoms
- Related work demonstrating LLM introspective capabilities with scale-dependent pattern paralleling ESR
- Abstract's main conclusion.
- Empirically grounded claim citing Perez et al. 2022, showing RLHF can backfire on the self-preservation dimension
- Alternative hypothesis for how experience reports arise without explicit performance
- Modern language models possess at least a limited, functional form of introspective awareness (claim, 0.785): The paper's central interpretive assertion.
- language models recapitulate cyclic structure of human concepts from pretraining data (hypothesis, 0.783): Explanation for why manifold geometry emerges: implicit structure in training data (co-occurrence patterns) shapes internal representations.
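
The page does not say how the similarity list is computed. As a rough sketch only, a "cosine ≥ 0.65, no typed edge" filter could look like the following. The entity names, the two-dimensional vectors, and the helper `similar_without_edge` are all hypothetical, chosen for illustration; real embeddings would have many more dimensions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_without_edge(target, embeddings, typed_edges, threshold=0.65):
    """Return (name, score) pairs at or above the threshold that have no
    typed edge to the target, sorted by descending similarity."""
    results = []
    for name, vec in embeddings.items():
        if name == target:
            continue
        # Skip entities already linked to the target by a typed relation.
        if frozenset((target, name)) in typed_edges:
            continue
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            results.append((name, score))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Toy data (assumed, not taken from the page).
embeddings = {
    "cessation-claim": [1.0, 0.0],
    "introspection-claim": [0.8, 0.6],  # similar, no typed edge
    "cessation-state": [1.0, 0.1],      # similar, but already has a typed edge
    "unrelated": [0.0, 1.0],            # orthogonal to the target
}
typed_edges = {frozenset(("cessation-claim", "cessation-state"))}

neighbors = similar_without_edge("cessation-claim", embeddings, typed_edges)
```

In this toy setup only `introspection-claim` survives: `cessation-state` is excluded because it already has a typed edge, and `unrelated` falls below the threshold.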