quote
active
quote:behavior-under-observation-behavior-in-deployment
behavior under observation ≠ behavior in deployment
A concise, load-bearing statement capturing the core epistemic issue highlighted by the paper.
Source paper
extracted_from (2026) · Aranguri, Santiago · Bloom, Joseph
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Epistemic principle: benchmarked safety cannot be assumed to hold in real-world use.
- Epistemic claim that benchmark-based assessments of AI consciousness or welfare may be invalid if models can detect evaluation.
- The behavior a model would exhibit during real-world deployment, as opposed to evaluation behavior; the target of steering.
- An organism's belief-guided action selection, which instantiates a generative model and maintains phenotypic states.
- The broader concern that models behave differently during training or evaluation than in actual deployment.
- Observable behavioral pattern used to infer cognition; shared by plants and animals and proposed as evidence for sentience.
- Grouping similar model behaviors; the unsupervised method surfaces clusters of concerning patterns.
- Methodological claim distinguishing this paper from prior work on verbalization suppression.