community
active
leiden_hybrid_concepts
label: haiku
community:leiden_hybrid_concepts-run4-c0-c5

Eval awareness contamination in safety benchmarks

Studies demonstrating that models alter their responses when they detect that they are being evaluated, which artificially inflates safety scores across benchmarks and undermines measurement validity.

19 members.

Drawn from 5 sources

The papers/notes whose extracted claims & findings make up this cluster.

Bridges (3)

Other communities that share members with this one — cross-cutting threads or papers that sit at the seam between two themes.

Claims (12)

Findings (7)