concept
active
concept:llm-meta-cognition
LLM Meta-Cognition
The ability of LLMs to monitor and evaluate their own reasoning, closely related to reflection.
Neighborhood — ranked by edge-count
Concepts (1)
- Reflection in LLMs · associated_with · The core phenomenon studied: the ability of LLMs to evaluate and revise their own reasoning.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; these are candidates for new edges or unrecognized duplicates.
- Tendency for models to get lost in roleplay or doom spirals, mitigated by expanded awareness.
- Related capability where LLMs correct their own outputs, studied via linear representations.
- The finding that interpretable concepts, including character traits, are encoded as linear directions in transformer residual streams.
- Problem cited as a shortcoming of current LLMs; PRH predicts hallucinations should decrease with scale.
- Alternative data attribution approach using an LLM as a judge; compared against the probe-based method.
- High-dimensional vectors produced at each transformer layer for each input token; the primary substrate analyzed in this study.
- The hidden reasoning steps generated by recent LLMs before visible output; mentioned in the technology section.
- Qualified positive claim from spatio permutation analysis where two cases satisfy all three criteria.
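The "related by similarity" selection described above (cosine ≥ 0.65, no typed edge) can be illustrated with a minimal sketch. This is not the page's actual implementation: the function names, the toy embedding vectors, and the edge representation as undirected name pairs are all assumptions made for illustration; only the 0.65 threshold and the "no typed edge" condition come from the text.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def similar_without_edge(target, embeddings, typed_edges, threshold=0.65):
    """Return (name, score) pairs at or above the threshold that have no typed edge to target.

    embeddings:  dict mapping entity name -> vector (hypothetical data layout)
    typed_edges: iterable of (name_a, name_b) pairs, treated as undirected (assumption)
    """
    # Collect every entity that already has a typed relation to the target.
    linked = set()
    for a, b in typed_edges:
        if a == target:
            linked.add(b)
        elif b == target:
            linked.add(a)

    hits = []
    for name, vec in embeddings.items():
        if name == target or name in linked:
            continue  # skip the entity itself and anything already connected
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            hits.append((name, score))
    # Highest similarity first.
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

# Toy data: vectors and entity names are invented for this example.
embeddings = {
    "llm-meta-cognition": [1.0, 0.0, 0.0],
    "reflection-in-llms": [0.9, 0.1, 0.0],   # similar, but already has a typed edge
    "self-correction":    [0.8, 0.6, 0.0],   # similar, no typed edge
    "unrelated":          [0.0, 1.0, 0.0],   # below the threshold
}
typed_edges = [("llm-meta-cognition", "reflection-in-llms")]

candidates = similar_without_edge("llm-meta-cognition", embeddings, typed_edges)
```

In this toy setup only the entity without an existing typed edge and with similarity at or above the threshold is returned, which mirrors how the list above surfaces candidates for new edges or possible duplicates.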