Synthetic Situational Judgment Test Battery

Open-ended situational judgment tests (SJTs) synthesized with GPT-5.1 from ATOMIC10x heads and inventory items; the primary evaluation instrument for open-ended steering

Neighborhood — ranked by edge-count

Papers (1)

Psychological Steering of Large Language Models
introduces · uses

Thinkers (1)

Seungbeen Lee
extends · introduces
Author of the TRAIT testbench (8,000 SJTs covering OCEAN and Dark Triad traits); its methods were adapted for SJT generation

Findings (1)

Synthetic SJTs achieve 82.97%-90.97% cosine similarity with Lee et al. TRAIT Dark Triad and OCEAN SJTs
supports
Highest SJT alignment among all validation comparisons

Concepts (1)

Steerability Score (Phi)
uses
Aggregate metric averaging mean SJT scores across OCEAN traits and steering directions; maximum possible is 10

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Synthetic Examples Testing · method · 0.740
Method of constructing controlled synthetic stimuli to test neuron response properties
Prompt Invariance Test · method · 0.723
Testing five phrasings of the self-referential prompt to confirm robustness to wording variation
Normative/Evaluative Judgment · concept · 0.717
Mental states that guide behaviour via assessments of what is good, right, or rational.
Synthetic Self-Correction Fine-Tuning · method · 0.705
Fine-tuning on Claude-generated self-correction examples with loss masking to induce ESR-like behavior
Situational Reflection · concept · 0.704
The specific form of reflection studied, where a model reflects on reasoning generated by another source.
Turing Test · framework · 0.697
A test of intelligence via linguistic performance; deemed insufficient for sentience assessment by Levin.
Synthetic document fine-tuning avoids artificially strengthening the evaluation-deployment representational direction compared to direct demonstration fine-tuning · claim · 0.696
Methodological justification for using SDF over direct demonstrations to train a realistic model organism.
Synthetic document fine-tuning produces substantial alignment-faking reasoning in both helpful-only and animal welfare settings · finding · 0.696
Shows alignment faking can emerge from training data information without explicit prompting
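
The "related by similarity" rule above (cosine ≥ 0.65, no typed edge) could be sketched roughly as follows. This is only an illustration under stated assumptions: the page does not say how entity embeddings are produced, so the vectors, the function names `cosine` and `related_by_similarity`, and the toy entity names are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def related_by_similarity(target, embeddings, typed_neighbors, threshold=0.65):
    """Return (name, score) pairs at or above `threshold`, highest first.

    Entities that already share a typed edge with `target` are skipped,
    mirroring the "no typed edge" condition. `embeddings` is an assumed
    mapping from entity name to vector.
    """
    target_vec = embeddings[target]
    hits = []
    for name, vec in embeddings.items():
        if name == target or name in typed_neighbors:
            continue
        score = cosine(target_vec, vec)
        if score >= threshold:
            hits.append((name, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy data (hypothetical vectors, not real entity embeddings).
toy_embeddings = {
    "target": [1.0, 0.0],
    "already_linked": [1.0, 0.0],  # similar, but has a typed edge
    "close": [0.8, 0.6],           # cosine with target is about 0.8
    "borderline": [0.6, 0.8],      # about 0.6, below the threshold
    "unrelated": [0.0, 1.0],       # orthogonal to target
}
candidates = related_by_similarity(
    "target", toy_embeddings, typed_neighbors={"already_linked"}
)
```

Each surviving pair is a candidate for a new edge or an unrecognized duplicate; the threshold is exposed as a parameter because 0.65 is simply the cutoff this listing reports.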