concept
active
concept:introspective-awareness
Introspective awareness
The central concept: the ability of a model to access and report on its internal states, as defined by the paper's criteria.
Neighborhood — ranked by edge-count
Thinkers (1)
thinker
- Jack Lindsey · studies
Claims (4)
claim
- The paper's central interpretive assertion.
- Based on layer-selective perturbation results.
- Assertion about the role of post-training in eliciting introspection.
- A caveat qualifying the main claim.
Concepts (2)
concept
- Introspection · related_to · The ability of a model to observe its own past internal states or computations; claimed to be architecturally permitted by transformers.
- Concept Injection · about · Technique of injecting activation patterns associated with specific concepts into a model's internal states to test whether self-reports reflect ground truth.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Formal definition requiring accuracy, grounding, internality, and metacognitive representation for genuine introspection in LLMs.
- The capacity to detect and report one's own internal states, measured via the five-adjective task and paradox reflection.
- Lindsey 2026 paper finding that models can articulate the content of injected activation patterns; supports the claim about self-knowledge representations.
- Most capable models (Opus 4, 4.1) show greatest introspective awareness; trend suggests introspection aided by improvements in model intelligence.
- Pearson-Vogel et al.'s finding that models can detect prior concept injections; introspective signals exist in middle layers suppressed by post-training.
- Key gap identified in the literature; systematic self-examination processes for machine consciousness development.
- Prior framework claiming frontier LLMs can detect and name injected concepts, interpreted as nascent self-awareness.
- Identified gap; methods for enabling machine consciousness development through self-examination.
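
The selection rule stated for this list (cosine ≥ 0.65, no typed edge) can be illustrated with a minimal sketch. Everything below is hypothetical: the function names, the toy two-dimensional embeddings, the entity IDs other than concept:introspective-awareness, and the triple format for typed edges are assumptions made for illustration, not the graph's actual storage or API.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_untyped(target, embeddings, typed_edges, threshold=0.65):
    """Return (entity_id, score) pairs whose cosine similarity to `target`
    is at least `threshold` and that share no typed edge with it,
    sorted by descending score."""
    # Entities already linked to the target by any typed relation, in either direction.
    linked = set()
    for src, _relation, dst in typed_edges:
        if src == target:
            linked.add(dst)
        elif dst == target:
            linked.add(src)

    hits = []
    for entity_id, vector in embeddings.items():
        if entity_id == target or entity_id in linked:
            continue
        score = cosine(embeddings[target], vector)
        if score >= threshold:
            hits.append((entity_id, score))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy data (assumed): 2-D vectors stand in for real entity embeddings.
embeddings = {
    "concept:introspective-awareness": [1.0, 0.0],
    "concept:introspection": [0.9, 0.1],  # similar, but already has a typed edge
    "claim:example-a": [0.8, 0.6],        # similar and unlinked
    "claim:example-b": [0.0, 1.0],        # orthogonal, below threshold
}
typed_edges = [
    ("concept:introspective-awareness", "related_to", "concept:introspection"),
]

candidates = similar_untyped(
    "concept:introspective-awareness", embeddings, typed_edges
)
for entity_id, score in candidates:
    print(f"{entity_id}\t{score:.2f}")
```

With this toy data only claim:example-a is returned: concept:introspection is excluded because a typed edge already exists, and claim:example-b falls below the threshold. Each returned entity would then be reviewed as either a candidate for a new typed edge or a possible duplicate.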