Emergent Introspective Awareness in LLMs

Lindsey (2026) paper finding that models can articulate the content of injected activation patterns; supports the claim about self-knowledge representations.

Neighborhood — ranked by edge-count

paper

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

What are the mechanistic bases of introspective awareness in LLMs? · question · 0.870
Secondary question; paper demonstrates introspection but explicitly avoids pinning down specific mechanistic explanation, noting mechanisms could be shallow and specialized.

Emergent Introspective Awareness in Large Language Models (Lindsey, 2025) · concept · 0.846
Related work demonstrating LLM introspective capabilities with scale-dependent pattern paralleling ESR

Emergent Abilities of LLMs · concept · 0.845
Prior work documenting abrupt capability changes under scale; UCCT provides a measurable predictor for when they occur

Introspective awareness · concept · 0.830
The central concept: the ability of a model to access and report on its internal states, as defined by the paper's criteria.

Emergent Introspective Awareness Framework (Lindsey 2026) · framework · 0.816
Prior framework claiming frontier LLMs can detect and name injected concepts, interpreted as nascent self-awareness

LLM Introspective Self-Report · concept · 0.810
The capacity of Kimi K2.5 to evaluate its own internal emotional state when steered, used as a novel interpretability signal

Emergence Of Awareness · concept · 0.778

Introspective awareness correlates with overall model capability · claim · 0.777
Most capable models (Opus 4, 4.1) show greatest introspective awareness; trend suggests introspection aided by improvements in model intelligence.
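
The neighborhood criterion stated above (cosine ≥ 0.65, no typed edge) can be sketched as a simple filter. This is a minimal illustration, not the tool's actual implementation: the function names, the dictionary of embeddings, and the set of typed-edge pairs are all hypothetical, and ordering results by descending cosine score is an assumption.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def untyped_neighbors(target, embeddings, typed_edges, threshold=0.65):
    """Return (name, score) pairs for entities near `target` with no typed edge to it.

    embeddings:  hypothetical dict mapping entity name -> embedding vector
    typed_edges: hypothetical set of (name_a, name_b) pairs that already have a typed relation
    """
    results = []
    for name, vec in embeddings.items():
        if name == target:
            continue
        # Skip entities already linked to the target, in either direction.
        if (target, name) in typed_edges or (name, target) in typed_edges:
            continue
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            results.append((name, score))
    # Assumed ordering: highest similarity first.
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```

Entities returned by such a filter are the candidates for new edges or duplicate merges; a pair scoring very close to 1.0 would be a natural first suspect for an unrecognized duplicate.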