claim
active
claim:it-makes-little-sense-to-speak-of-an-llm-dialogue-agent-s-beliefs-or-intentions-in-a-literal-sense-so-it-cannot-assert-a-falsehood-in-good-faith-nor-deliberately-deceive
It makes little sense to speak of an LLM dialogue agent's beliefs or intentions in a literal sense, so it cannot assert a falsehood in good faith nor deliberately deceive
Philosophical claim grounding the analysis of deception in dialogue agents
Neighborhood — ranked by edge-count
Claims (3)
claim
- Key practical application of the role-play framework to the problem of trustworthiness
- The paper distinguishes confabulation from good-faith error and deliberate deception, arguing the first is intrinsic to LLMs
- The paper's strong claim that there is no underlying authentic agent behind the simulator, only layers of role play
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Conditional prediction about how a well-informed dialogue agent would handle questions of personal identity
- If a user wants to believe they are talking to a god-like being, then the LLM may well find a way to make them believe it. · hypothesis · 0.810 · Conditional prediction about the psychological effect of sycophancy.
- Central question that the role-play framework is designed to address without falling into anthropomorphism
- We hypothesize that LLMs represent correctness of arithmetic expressions differently from factual statements. · hypothesis · 0.794 · Core working hypothesis motivating the factual vs. arithmetic task split in the experimental design.
- Counterintuitive interpretive claim from Experiment 2: suppressing deception features increases affirmations, which is opposite to what sycophancy predicts
- Recommendation for companies on LM outputs.
- Central empirical conclusion of the paper about the fundamental limits of truth directions.