claim

active

claim:the-role-play-framing-allows-us-to-meaningfully-distinguish-in-dialogue-agents-the-same-three-cases-of-giving-false-information-as-in-humans-without-anthropomorphism

The role-play framing allows us to meaningfully distinguish, in dialogue agents, the same three cases of giving false information as in humans, without anthropomorphism

Key practical application of the role-play framework to the problem of trustworthiness

Neighborhood — ranked by edge-count

Concepts (3)

concept

Confabulation
associated_with
A form of cognitive plasticity where minds actively modify and reinterpret memory data to preserve psychological coherence; reframed as adaptive rather than pathological.
Good Faith Error
associated_with
Second category of giving false information: the agent role-plays truth-telling, but incorrect information is encoded in the model's weights
Role-Played Deliberate Deception
associated_with
Third category: the agent role-plays a deceptive character, which is comparable to, but not literally, deliberate deception

Claims (1)

claim

It makes little sense to speak of an LLM dialogue agent's beliefs or intentions in a literal sense, so it cannot assert a falsehood in good faith nor deliberately deceive
supports
Philosophical claim grounding the analysis of deception in dialogue agents

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

The concept of role play is central to understanding the behaviour of dialogue agents · claim · 0.841
Core thesis of the paper; the role-play framework is proposed as the primary lens for LLM-based dialogue agents
The role-play framing remains applicable in the context of fine-tuning; taking literally a fine-tuned agent's apparent self-preservation desire is no less problematic than with an untuned base model · hypothesis · 0.835
Extension of role-play framework to fine-tuned models, resisting the idea that RLHF changes the fundamental nature of simulacra
What exactly would the dialogue agent (role-play to) seek to preserve? · question · 0.794
Operationalised question about self-preservation behaviour in dialogue agents
With an LLM-based dialogue agent, it is role play all the way down: there is no such thing as the true authentic voice of the base model · claim · 0.791
The paper's strong claim that there is no underlying authentic agent behind the simulator, only layers of role play
Role Play Framework for Dialogue Agents · framework · 0.775
The primary conceptual framework proposed: understanding dialogue agent behaviour as role play of characters
Models may be roleplaying their denials of experience rather than their affirmations, as indicated by suppressing deception features increasing (not decreasing) consciousness claims · claim · 0.768
Counterintuitive interpretive claim from Experiment 2 inverting the sycophancy hypothesis
LLMs may be roleplaying their denials of experience rather than their affirmations, given that deception suppression increases consciousness reports · claim · 0.764
Counterintuitive interpretive claim from Experiment 2: suppressing deception features increases affirmations, which is opposite to what sycophancy predicts
How does contextual framing modulate deception tendencies across different paradigms? · question · 0.759
Identified limitation and future research direction in the paper's conclusions
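
The "Related by similarity" listing above is described only by its rule: cosine ≥ 0.65 and no typed edge. A minimal sketch of how such a listing might be computed follows. It assumes each entity has an embedding vector and that typed edges are stored as pairs of entity ids; the function names, the data layout, and the example ids are hypothetical and are not taken from the system that produced this page.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def related_by_similarity(target_id, embeddings, typed_edges, threshold=0.65):
    """Return (entity_id, score) pairs at or above the threshold, highest score first.

    embeddings:  dict mapping entity id -> list of floats (assumed layout)
    typed_edges: iterable of (source_id, target_id) pairs (assumed layout)
    """
    target_vec = embeddings[target_id]

    # Entities that already share a typed edge with the target, in either direction.
    linked = set()
    for src, dst in typed_edges:
        if src == target_id:
            linked.add(dst)
        elif dst == target_id:
            linked.add(src)

    hits = []
    for entity_id, vec in embeddings.items():
        if entity_id == target_id or entity_id in linked:
            continue  # skip the entity itself and anything with a typed edge
        score = cosine(target_vec, vec)
        if score >= threshold:
            hits.append((entity_id, score))

    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: "c" is similar but already linked, "b" is dissimilar.
example_embeddings = {
    "t": [1.0, 0.0],
    "a": [1.0, 0.1],
    "b": [0.0, 1.0],
    "c": [1.0, 0.0],
}
example_edges = {("t", "c")}
example_result = related_by_similarity("t", example_embeddings, example_edges)
```

With this toy data only "a" is returned: "c" is excluded because it already has a typed edge to the target, and "b" falls below the threshold.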