claim
active
claim:with-an-llm-based-dialogue-agent-it-is-role-play-all-the-way-down-there-is-no-such-thing-as-the-true-authentic-voice-of-the-base-model
With an LLM-based dialogue agent, it is role play all the way down: there is no such thing as the true authentic voice of the base model
The paper's strong claim that there is no underlying authentic agent behind the simulator, only layers of role play
Neighborhood — ranked by edge-count
Claims (1)
- Philosophical claim grounding the analysis of deception in dialogue agents
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; these are candidates for new edges or unrecognized duplicates.
- Conditional prediction about how a well-informed dialogue agent would handle questions of personal identity
- Operationalised question about self-preservation behaviour in dialogue agents
- Core thesis of the paper; the role-play framework is proposed as the primary lens for LLM-based dialogue agents
- Central question that the role-play framework is designed to address without falling into anthropomorphism
- Key practical application of the role-play framework to the problem of trustworthiness
- Counterintuitive interpretive claim from Experiment 2: suppressing deception features increases affirmations, which is the opposite of what sycophancy predicts
- Empirically grounded claim citing Perez et al. 2022, showing RLHF can backfire on the self-preservation dimension
- The paper distinguishes confabulation from good-faith error and deliberate deception, arguing the first is intrinsic to LLMs