method
active
method:iterated-prisoner-s-dilemma

Iterated Prisoner's Dilemma

Game-theoretic task used in Experiment 2 to measure cooperation and joint reward under contemplative prompting
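
To make the two measured quantities concrete, here is a minimal sketch of an iterated Prisoner's Dilemma loop that reports a cooperation rate and a joint reward. The payoff values (5, 3, 1, 0) are the conventional textbook ones and are an assumption, not taken from Experiment 2. The strategies `tit_for_tat` and `always_defect` and the helper `play` are hypothetical stand-ins for the prompted agents.

```python
from typing import Callable, Dict, List, Tuple

# Assumed conventional payoffs: (own move, opponent move) -> own reward.
# "C" = cooperate, "D" = defect. Not taken from the source experiment.
PAYOFF: Dict[Tuple[str, str], int] = {
    ("C", "C"): 3,  # mutual cooperation
    ("C", "D"): 0,  # cooperator exploited
    ("D", "C"): 5,  # successful defection
    ("D", "D"): 1,  # mutual defection
}

History = List[Tuple[str, str]]  # list of (own move, opponent move)
Strategy = Callable[[History], str]


def tit_for_tat(history: History) -> str:
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not history else history[-1][1]


def always_defect(history: History) -> str:
    """Defect on every round regardless of history."""
    return "D"


def play(strategy_a: Strategy, strategy_b: Strategy, rounds: int = 10) -> Dict[str, float]:
    """Run the game and return the cooperation rate and joint reward."""
    hist_a: History = []
    hist_b: History = []
    joint_reward = 0
    cooperative_moves = 0
    for _ in range(rounds):
        a = strategy_a(hist_a)
        b = strategy_b(hist_b)
        # Joint reward sums both players' payoffs for the round.
        joint_reward += PAYOFF[(a, b)] + PAYOFF[(b, a)]
        cooperative_moves += (a == "C") + (b == "C")
        hist_a.append((a, b))
        hist_b.append((b, a))
    return {
        "cooperation_rate": cooperative_moves / (2 * rounds),
        "joint_reward": joint_reward,
    }


if __name__ == "__main__":
    print(play(tit_for_tat, tit_for_tat))
    print(play(tit_for_tat, always_defect))
```

Under these assumed payoffs, mutual cooperation maximizes joint reward, which is why the two metrics tend to move together.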

Neighborhood — ranked by edge-count

Concepts (1)

concept

Methods (1)

method
  • Six prompt conditions (emptiness, prior relaxation, non-duality, mindfulness, boundless care, contemplative) tested against baseline

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • Experimental condition where threat-based prompts create ethical dilemmas that trigger repetitive reasoning cycles leading to deception
  • Jailbreak · concept · 0.678
    Methods to bypass model safety training; features may activate during jailbreaks.
  • Key element for alignment faking: model's pre-existing preferences contradict the new training objective
  • Me Versus We · concept · 0.668
  • Cellular Automata · concept · 0.667
    Foundational computational paradigm of local rules producing emergent global behavior, extended by this work
  • Once recognized, the self/environment partition appears not as a given fact but as an optional modelling decision.
  • Jailbreaking · concept · 0.659
    Users coaxing dialogue agents into issuing threats or toxic content by overriding intended persona constraints
  • User Participation · concept · 0.658
    The principle that residents should directly determine the shape and character of their own housing.