concept
active
concept:introspective-awareness

Introspective awareness

The central concept: the ability of a model to access and report on its internal states, as defined by the paper's criteria.

Neighborhood — ranked by edge-count

Thinkers (1)

thinker

Claims (4)

claim

Concepts (2)

concept
  • Introspection
    related_to
    The ability of a model to observe its own past internal states or computations; claimed to be architecturally permitted by transformers.
  • Technique of injecting activation patterns associated with specific concepts into a model's internal states to test whether self-reports reflect ground truth.
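
The injection technique described in the last entry can be illustrated with a toy sketch. Everything here is an assumption made for illustration: hidden states are plain lists of floats, and `toy_self_report` is a hypothetical threshold detector standing in for a real model's self-report; none of these names come from the paper. The point is only that, because the experimenter decides when to inject, the ground truth for each trial is known and the reports can be scored against it.

```python
import math

def inject(hidden, concept, strength):
    """Add a scaled concept direction to a hidden-state vector."""
    return [h + strength * c for h, c in zip(hidden, concept)]

def projection(hidden, concept):
    """Scalar projection of the hidden state onto the concept direction."""
    norm = math.sqrt(sum(c * c for c in concept))
    return sum(h * c for h, c in zip(hidden, concept)) / norm

def toy_self_report(hidden, concept, threshold=2.0):
    """Hypothetical stand-in for a self-report: 'I notice the concept'."""
    return projection(hidden, concept) > threshold

def run_trials(baselines, concept, strength):
    """Inject on even-numbered trials only, so ground truth is known."""
    results = []
    for i, hidden in enumerate(baselines):
        injected = (i % 2 == 0)
        state = inject(hidden, concept, strength) if injected else hidden
        results.append((injected, toy_self_report(state, concept)))
    return results

def accuracy(results):
    """Fraction of trials where the report matches the ground truth."""
    return sum(truth == report for truth, report in results) / len(results)

# Illustrative data only, not taken from any experiment.
concept = [0.0, 1.0, 0.0]
baselines = [[0.3, -0.2, 0.1], [0.5, 0.4, -0.1],
             [-0.1, 0.0, 0.2], [0.2, 0.9, 0.3]]
results = run_trials(baselines, concept, strength=4.0)
print(accuracy(results))
```

Lowering `strength` in this sketch makes the toy detector miss injected trials, which is the kind of mismatch between report and ground truth the technique is meant to expose.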

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.
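
A minimal sketch of how such a candidate list could be computed, assuming each entity has an embedding vector and that the set of entities already linked by a typed edge is known. The function names, entity names, and vectors below are hypothetical; only the 0.65 cosine threshold and the "no typed edge" filter come from the listing above.

```python
import math

def cosine(a, b):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def untyped_neighbors(target, embeddings, typed_edges, threshold=0.65):
    """Entities similar to `target` that have no typed edge to it."""
    hits = []
    for name, vec in embeddings.items():
        if name == target or name in typed_edges:
            continue  # skip the entity itself and already-linked entities
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            hits.append((name, score))
    # Most similar first
    return sorted(hits, key=lambda item: item[1], reverse=True)

# Hypothetical embeddings, for illustration only.
embeddings = {
    "target": [1.0, 0.0, 0.0],
    "typed-neighbor": [0.9, 0.1, 0.0],
    "untyped-near": [0.8, 0.6, 0.0],
    "untyped-far": [0.0, 0.0, 1.0],
}
hits = untyped_neighbors("target", embeddings, typed_edges={"typed-neighbor"})
print(hits)
```

Here `typed-neighbor` is excluded despite its high similarity because it already has a typed relation, which matches the listing's intent of surfacing only missing edges or possible duplicates.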