paper
active
paper:aranguri-bloom-verbalized-eval-awareness-2026

Verbalized Eval Awareness Inflates Measured Safety

/Users/antonborzov/Documents/Research.nosync/papers/aranguri-bloom-verbalized-eval-awareness-2026.md

External IDs

title_hash: 3dc43b51d2a9dcd21c41db6f57472620abe70c0c
legacy_slug: aranguri-bloom-verbalized-eval-awareness-2026

Frontmatter (12 fields)
{
  "url": "https://www.goodfire.ai/research/verbalized-eval-awareness-inflates-measured-safety",
  "tags": [
    "eval-awareness",
    "safety-benchmarks",
    "chain-of-thought",
    "alignment",
    "applied-research",
    "goodfire"
  ],
  "year": 2026,
  "saved": "2026-05-14",
  "title": "Verbalized Eval Awareness Inflates Measured Safety",
  "venue": "Goodfire research post",
  "status": "summary-only",
  "authors": [
    "Santiago Aranguri",
    "Joseph Bloom"
  ],
  "dataset": "https://aranguri.github.io/eval_awareness/demo/",
  "published": "2026-05-04",
  "enrichment": {
    "is_stale": true
  },
  "affiliation": "Goodfire + UK AISI"
}

Outgoing (5)

Associated with (2)

Implements (1)

Member of (1)

Incoming (2)

Authored by (2)

Mentions (2)

  • papers
    /Users/antonborzov/Documents/Research.nosync/papers/aranguri-bloom-verbalized-eval-awareness-2026.md
  • papers
    aranguri-bloom-verbalized-eval-awareness-2026.md