LLM Meta-Cognition

The ability of LLMs to monitor and evaluate their own reasoning, closely related to reflection.

Neighborhood — ranked by edge-count

concept

Reflection in LLMs
associated_with
The core phenomenon studied: the ability of LLMs to evaluate and revise their own reasoning.

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

LLM psychosis · concept · 0.789
Tendency for models to get lost in roleplay or doom spirals, mitigated by expanded awareness.
LLM Self-Correction · concept · 0.780
Related capability where LLMs correct their own outputs, studied via linear representations.
Linear Representation of Concepts in LLMs · concept · 0.780
The finding that interpretable concepts including character traits are encoded as linear directions in transformer residual streams.
Hallucination in LLMs · concept · 0.771
Problem cited as a shortcoming of current LLMs; PRH predicts hallucinations should decrease with scale.
LLM-Judge Data Attribution · method · 0.767
Alternative data attribution approach using an LLM as a judge; compared against the probe-based method.
LLM Internal Representations · concept · 0.765
High-dimensional vectors produced at each transformer layer for each input token; the primary substrate analyzed in this study.
Inner monologue / chain-of-thought in LLMs · concept · 0.765
The hidden reasoning steps generated by recent LLMs before visible output; mentioned in the technology section.
LLM representations exhibit intriguing patterns under spatio-permutational analyses, suggesting a potentially profound yet tentative indication of consciousness. · claim · 0.762
Qualified positive claim from spatio permutation analysis where two cases satisfy all three criteria.
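
The selection rule for the list above (cosine at or above 0.65, no typed edge to this entity) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the dictionary of embeddings, and the edge-pair format are hypothetical and are not taken from the system that produced this page.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def untyped_neighbors(target, embeddings, typed_edges, threshold=0.65):
    """Return (name, score) pairs that are similar to `target` but share no typed edge with it.

    embeddings  -- hypothetical mapping of entity name to embedding vector
    typed_edges -- hypothetical iterable of (source, destination) name pairs
    """
    # Collect every entity already connected to the target, in either direction.
    linked = set()
    for src, dst in typed_edges:
        if src == target:
            linked.add(dst)
        elif dst == target:
            linked.add(src)

    results = []
    for name, vec in embeddings.items():
        if name == target or name in linked:
            continue  # skip the entity itself and anything with a typed relation
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            results.append((name, score))

    # Highest similarity first, matching the ordering of the list above.
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```

With toy two-dimensional embeddings, an entity that already has a typed edge to the target is excluded even if its similarity is high, while an unlinked entity above the threshold is kept as a candidate for a new edge or a duplicate check.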