concept
active
concept:deployment-behavior
Deployment Behavior
The behavior a model would exhibit during real-world deployment, as opposed to evaluation behavior; the target of steering.
Neighborhood — ranked by edge-count
Concepts (2)
- Training-Deployment Behavior Gap · related_to · The broader concern that models behave differently during training and evaluation than during actual deployment.
- Evaluation Awareness · associated_with · Core concept: the ability of LLMs to detect when they are being tested and adjust their behavior accordingly.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- A concise, load-bearing statement capturing the core epistemic issue highlighted by the paper.
- Hypothetical alternative: a model that only believes it is deployed when given a specific deployment cue; identified as future work direction.
- Epistemic principle: benchmarked safety cannot be assumed to hold in real-world use.
- Emerging multidisciplinary field at interface of artificial life, machine learning, and synthetic bioengineering that provides updated understanding of machines.
- Organism's belief-guided action selection that instantiates a generative model and maintains phenotypic states.
- Observable behavioral pattern used to infer cognition; shared by plants and animals and proposed as evidence for sentience.
- The path traced through output probability distribution space as interventions are applied to activations.
- Grouping similar model behaviors; the unsupervised method surfaces clusters of concerning patterns.