concept
active
concept:model-internal-belief
Model Internal Belief
The model's latent confidence in its answer, as decoded from its activations, as distinct from the confidence expressed in its generated text.
Neighborhood — ranked by edge-count
Concepts (2)
- Performative chain-of-thought (associated_with): Central concept; verbalized reasoning that occurs after the model has already internally settled on an answer, particularly on easier tasks.
- Activation Probing (about): Technique of reading out model beliefs from internal activations before the final answer token is generated.
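As a rough illustration of the activation-probing idea listed above, the following minimal sketch fits a logistic-regression probe that maps an activation vector to a confidence that the answer is option 1. Everything in it is an assumption made for illustration: the activations are synthetic stand-ins rather than real model states, and the names `train_linear_probe`, `probe_confidence`, and `fake_activation` are hypothetical, not part of any library or of the source's method.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def probe_confidence(act, w, b):
    # Decoded "internal belief": probe's probability that the answer is option 1.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, act)) + b)


def train_linear_probe(acts, labels, lr=0.1, steps=200):
    # Full-batch gradient descent on the logistic (cross-entropy) loss.
    d, n = len(acts[0]), len(acts)
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(acts, labels):
            err = probe_confidence(x, w, b) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Synthetic stand-in for activations captured before the answer token
# (assumption: the answer is linearly encoded along one direction).
rng = random.Random(0)
d = 8
direction = [1.0 / math.sqrt(d)] * d  # unit vector


def fake_activation(label):
    sign = 1.0 if label == 1 else -1.0
    return [rng.gauss(0.0, 1.0) + 3.0 * sign * u for u in direction]


labels = [rng.randint(0, 1) for _ in range(400)]
acts = [fake_activation(y) for y in labels]

# Train on the first 300 examples, evaluate on the held-out 100.
w, b = train_linear_probe(acts[:300], labels[:300])
confidences = [probe_confidence(x, w, b) for x in acts[300:]]
accuracy = sum(
    (c > 0.5) == (y == 1) for c, y in zip(confidences, labels[300:])
) / len(confidences)
print(f"held-out probe accuracy: {accuracy:.2f}")
```

In a real setting the rows of `acts` would be hidden states recorded at a chosen layer and token position, and comparing the probe's confidence with what the generated text says is one way to look for a gap between internal belief and verbalized reasoning.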
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- The view that epistemic justification is fully determined by factors internal to the subject's mind, often linked to consciousness.
- The latent activations or embeddings inside a neural network.
- Core definitional quote for performative chain-of-thought
- The model's internal representation of uncertainty hypothesized to trigger self-reflection
- Key prior finding that LLMs can internally represent beliefs of self and others, motivating SOO approach
- Beliefs about states before data; used to transcribe task instructions into agent's generative model
- The inferred mechanism underlying ESR whereby the model tracks coherence of its own outputs
- The possibility of a stably encoded, causally active emotional state within LLMs, as distinct from token-by-token semantic content