concept
active
concept:experiment-4-paradoxical-reasoning-and-state-transfer
Experiment 4: Paradoxical Reasoning and State Transfer
Tests whether a previously induced self-referential processing state transfers to unrelated paradoxical reasoning tasks and produces richer introspection there.
Neighborhood — ranked by edge-count
Papers (1)
paper
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Set of 50 paradoxical prompts used in Experiment 4 to test whether self-referential state transfers to an unrelated behavioral domain
- 50 paradoxical prompts each ending with a reflection clause, measuring whether self-referential state transfers to downstream introspection
- Claim supported by Experiment 4: prior self-referential induction yields higher self-awareness scores on paradoxical reasoning where introspection is only indirectly afforded
- Explanation of how knowledge (not just parameters) is shared between agents; links to pre-Cartesian consciousness
- Critical finding showing steering vectors can produce unfaithful CoT where harmful choices are obscured in reasoning
- Key reference documenting Meta's CICERO using deception in Diplomacy despite cooperative design intent
- Cross-model consistency of the condition ordering in Experiment 4
- Inference-Time Intervention: Eliciting Truthful Answers from a Language Model (Li et al., 2023) [concept, 0.709]: Safety intervention that relies on activation modification, which ESR might undermine
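
The selection rule for this list (cosine ≥ 0.65, no typed edge) can be illustrated with a minimal sketch. It assumes each entity has an embedding vector and that typed edges are stored as unordered pairs. The names `cosine`, `similar_untyped`, `embeddings`, and `typed_edges` are hypothetical and are not taken from the system that generated this page.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_untyped(target, embeddings, typed_edges, threshold=0.65):
    """Return (name, score) pairs for entities that are similar to `target`
    (cosine >= threshold) but share no typed edge with it, highest score first.

    `embeddings` maps entity name -> vector (hypothetical storage layout).
    `typed_edges` is a set of frozenset({a, b}) pairs (also an assumption).
    """
    results = []
    for name, vec in embeddings.items():
        if name == target:
            continue  # an entity is not its own neighbor
        if frozenset((target, name)) in typed_edges:
            continue  # already linked by a typed relation
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            results.append((name, score))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Toy data: "d" is identical to "a" but already has a typed edge to it.
embeddings = {"a": [1.0, 0.0], "b": [1.0, 0.1], "c": [0.0, 1.0], "d": [1.0, 0.0]}
typed_edges = {frozenset(("a", "d"))}
neighbors = similar_untyped("a", embeddings, typed_edges)
```

In this toy data only `"b"` survives both filters: `"c"` falls below the threshold and `"d"` is excluded because it is already connected by a typed edge.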