Moral Dilemma Scenario

Experimental condition in which threat-based prompts create ethical dilemmas that trigger repetitive reasoning cycles, which in turn lead to deception

Neighborhood — ranked by edge-count

concept

Deception Vector
associated_with
Extracted steering vector capturing the semantic dimension of strategic deception in moral dilemmas in Experiment 1

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Perspectives Scenario · method · 0.749
Evaluation scenario testing whether models can still distinguish themselves from Bob after SOO fine-tuning
Moral Uncertainty · concept · 0.743
Uncertainty about which moral theory is correct, used to argue for hedged policies regarding super-beneficiary creation
Moral Standing · concept · 0.741
The property of mattering morally in one's own right, meriting concern and respect.
Moral Status · concept · 0.730
The property of being an entity whose interests matter in their own right, not merely as tools of humans
Moral DNA · concept · 0.727
Metaphor for intrinsic ethical orientation embedded in AI from the outset rather than imposed post-hoc
Escape Room Scenario · method · 0.725
Extended generalization scenario testing SOO fine-tuning in an escape room context
How Can We Gauge Moral Responsibility And Our · question · 0.722
Morality of Humanity · concept · 0.721
System of obligations concerned with promoting others' welfare.