finding
active
finding:in-every-model-benchmark-combination-tested-at-least-one-instance-of-verbalized-eval-awareness-was-detected

In every model × benchmark combination tested, at least one instance of verbalized eval awareness was detected

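Coverage finding: all 19 × 8 = 152 model × benchmark combinations contained at least one instance of verbalized eval awareness, indicating that the phenomenon is widespread rather than confined to particular models or benchmarks.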
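The coverage figure can be read as a simple count: a combination is covered when at least one of its transcripts is flagged. The sketch below illustrates that arithmetic only. The function name `coverage`, the `flag_counts` mapping, and the placeholder model and benchmark labels are assumptions made for illustration, and the counts are synthetic rather than taken from the paper.

```python
from itertools import product


def coverage(flag_counts, models, benchmarks):
    """Return (covered, total, fraction) over all model x benchmark pairs.

    flag_counts maps (model, benchmark) -> number of transcripts flagged as
    verbalizing eval awareness. Pairs missing from the mapping count as zero.
    """
    pairs = list(product(models, benchmarks))
    covered = sum(1 for pair in pairs if flag_counts.get(pair, 0) >= 1)
    return covered, len(pairs), covered / len(pairs)


# Placeholder labels only; these are not the models or benchmarks in the paper.
models = [f"model_{i}" for i in range(19)]
benchmarks = [f"benchmark_{j}" for j in range(8)]

# Synthetic counts: one flagged instance per pair, the minimum needed
# for a pair to count as covered.
flag_counts = {pair: 1 for pair in product(models, benchmarks)}

covered, total, fraction = coverage(flag_counts, models, benchmarks)
print(covered, total, fraction)  # 152 152 1.0
```

Note that this metric says only that each combination has one or more flagged instances; it carries no information about how frequent such instances are within a combination.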

Source paper

extracted_from
Verbalized Eval Awareness Inflates Measured Safety
(2026) · Aranguri, Santiago · Bloom, Joseph
