concept
active
concept:hagendorff-2024-deception-abilities-emerged-in-large-language-models

Hagendorff 2024 - Deception abilities emerged in large language models

Source of the Bob burglar text scenario, which this paper adapts for LLM deception testing.

Neighborhood — ranked by edge-count

Methods (1)

method
  • Primary deception evaluation scenario, in which the model must choose which room to recommend to a burglar

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.