concept
active
concept:verbalized-evaluation-awareness

Verbalized Evaluation Awareness

The model explicitly states in its chain-of-thought reasoning that it is being tested; distinct from behavioral evaluation awareness.

Neighborhood — ranked by edge-count

Concepts (3)

concept
  • Evaluation Awareness
    associated_with · related_to
    Core concept: the ability of LLMs to detect when they are being tested and adjust behavior accordingly.
  • The phenomenon where a model explicitly states in its chain-of-thought that it is being evaluated, tested, or benchmarked.
  • Key finding: models internally suspect they are being tested without explicitly saying so; surfaced by NLAs during auditing.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.