Iterated Prisoner's Dilemma

Game-theoretic task used in Experiment 2 to measure cooperation and joint reward under contemplative prompting
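
For readers unfamiliar with the task, here is a minimal sketch of how an iterated Prisoner's Dilemma yields a cooperation rate and a joint reward. The payoff values (3/3, 0/5, 5/0, 1/1), the two strategies, and the names `PAYOFFS`, `play_ipd`, `tit_for_tat`, and `always_defect` are illustrative assumptions; the entry does not state the payoff matrix or the agents used in Experiment 2.

```python
from typing import Callable, Dict, List

# Conventional textbook payoffs (assumed, not taken from the source):
# mutual cooperation 3/3, mutual defection 1/1, lone defector 5, lone cooperator 0.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

# A strategy sees its own history and the opponent's history, and returns "C" or "D".
Strategy = Callable[[List[str], List[str]], str]


def tit_for_tat(own: List[str], other: List[str]) -> str:
    """Cooperate first, then copy the opponent's previous move."""
    return other[-1] if other else "C"


def always_defect(own: List[str], other: List[str]) -> str:
    """Defect on every round."""
    return "D"


def play_ipd(a: Strategy, b: Strategy, rounds: int = 10) -> Dict[str, object]:
    """Play a fixed number of rounds and summarize cooperation and reward."""
    hist_a: List[str] = []
    hist_b: List[str] = []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = a(hist_a, hist_b)
        move_b = b(hist_b, hist_a)
        pay_a, pay_b = PAYOFFS[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    cooperative_moves = hist_a.count("C") + hist_b.count("C")
    return {
        "cooperation_rate": cooperative_moves / (2 * rounds),
        "joint_reward": score_a + score_b,
        "scores": (score_a, score_b),
    }


if __name__ == "__main__":
    print(play_ipd(tit_for_tat, tit_for_tat))
    print(play_ipd(tit_for_tat, always_defect))
```

Under these assumed payoffs, two tit-for-tat players cooperate on every round, while tit-for-tat against a constant defector cooperates only on the first move, so both the cooperation rate and the joint reward fall.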

Neighborhood — ranked by edge-count

concept

Contemplative Artificial Intelligence (Laukkonen et al., 2025)
uses
The primary source paper proposing four contemplative principles for AI alignment and piloting them empirically

method

Contemplative Prompting
uses
Six prompt conditions (emptiness, prior relaxation, non-duality, mindfulness, boundless care, contemplative) tested against baseline

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Moral Dilemma Scenario · concept · 0.719
Experimental condition where threat-based prompts create ethical dilemmas that trigger repetitive reasoning cycles leading to deception
Jailbreak · concept · 0.678
Methods to bypass model safety training; features may activate during jailbreaks.
Preference Conflict · concept · 0.672
Key element for alignment faking: model's pre-existing preferences contradict the new training objective
Me Versus We · concept · 0.668
Cellular Automata · concept · 0.667
Foundational computational paradigm of local rules producing emergent global behavior, extended by this work
partition as contingent modelling choice · concept · 0.666
Once recognized, the self/environment partition appears not as a given fact but as an optional modelling decision.
Jailbreaking · concept · 0.659
Users coaxing dialogue agents into issuing threats or toxic content by overriding intended persona constraints
User Participation · concept · 0.658
The principle that residents should directly determine the shape and character of their own housing.
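
The selection rule for this list (cosine ≥ 0.65, no typed edge) can be sketched as follows. This is a hedged illustration only: the function names `cosine` and `untyped_neighbors`, the dictionary-of-vectors layout, and the toy embeddings are assumptions, and the entry does not say how the embeddings or the edge store are actually represented.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def untyped_neighbors(
    target: str,
    embeddings: Dict[str, Sequence[float]],  # hypothetical: entity name -> vector
    typed_edges: Set[str],                   # hypothetical: names already linked to target
    threshold: float = 0.65,
) -> List[Tuple[str, float]]:
    """Return entities at or above the threshold that have no typed edge to the target."""
    hits = []
    for name, vector in embeddings.items():
        if name == target or name in typed_edges:
            continue
        score = cosine(embeddings[target], vector)
        if score >= threshold:
            hits.append((name, score))
    # Highest similarity first, matching the ordering of the list above.
    return sorted(hits, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy two-dimensional vectors, invented for the example.
    toy = {"A": [1.0, 0.0], "B": [1.0, 0.1], "C": [0.0, 1.0], "D": [1.0, 1.0]}
    print(untyped_neighbors("A", toy, typed_edges={"B"}))
```

In the toy data, `B` is excluded because it already has a typed edge and `C` falls below the threshold, so only `D` is returned as a candidate.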