finding
active
When a 20-questions dialogue agent is asked to regenerate its 'reveal' answer, it sometimes names an entirely different object consistent with its prior answers, demonstrating superposition rather than commitment
Empirical illustration supporting the superposition of simulacra framework via the 20-questions analogy
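The regeneration behaviour can be mimicked with a toy sketch. It is not the agent's actual mechanism: the object table, the attribute names, and the helpers `consistent_objects` and `reveal` are all hypothetical, chosen only to show how a "reveal" sampled at the last moment can differ between regenerations while never contradicting the answers already given.

```python
import random

# Hypothetical toy world: each object has yes/no attributes (illustrative only).
OBJECTS = {
    "cat": {"alive": True, "bigger_than_breadbox": False},
    "hamster": {"alive": True, "bigger_than_breadbox": False},
    "horse": {"alive": True, "bigger_than_breadbox": True},
    "teapot": {"alive": False, "bigger_than_breadbox": False},
}


def consistent_objects(answers):
    """Return every object that agrees with all answers given so far."""
    return sorted(
        name
        for name, attrs in OBJECTS.items()
        if all(attrs[question] == answer for question, answer in answers.items())
    )


def reveal(answers, rng):
    """Sample a reveal from the consistent set; nothing was fixed in advance."""
    return rng.choice(consistent_objects(answers))


# Answers the agent has already given during the game.
answers = {"alive": True, "bigger_than_breadbox": False}
candidates = consistent_objects(answers)

# Each seed stands in for one regeneration of the reveal turn.
reveals = {reveal(answers, random.Random(seed)) for seed in range(50)}

print("consistent with prior answers:", candidates)
print("objects named across regenerations:", sorted(reveals))
```

An agent that had committed to one object at the start would return the same reveal on every regeneration. In this sketch the reveal is drawn from whatever remains consistent, which is the pattern the finding describes.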
Neighborhood — ranked by edge-count
Frameworks (1)
framework
- The more nuanced second metaphor: LLM as simulator maintaining a superposition of possible simulacra across a multiverse of characters
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Conditional prediction about how a well-informed dialogue agent would handle questions of personal identity
- Operationalised question about self-preservation behaviour in dialogue agents
- Philosophical question about identity criteria for disembodied computational agents under threat
- Explanation of how knowledge (not just parameters) is shared between agents; links to pre-Cartesian consciousness
- Claim about the difficulty of responsiveness verification
- Foundational claim of the paper, defining self-evidencing
- Empirically grounded claim citing Perez et al. 2022, showing RLHF can backfire on the self-preservation dimension
- Core definitional quote for performative chain-of-thought