finding
active
finding:gpt-4o-and-gpt-4-1-nano-used-as-llm-substrates-for-pilot-experiments
GPT-4o and GPT-4.1 nano used as LLM substrates for pilot experiments
Specification of AI models used in the two pilot experiments
Source paper
extracted_from · (2025) · Ruben Laukkonen · Fionn Inglis · Shamil Chandaria · Lars Sandved-Smith +4
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Large language model underlying ChatGPT and Bing Chat; used for illustrative quotes in the paper
- GPT-4 Turbo and GPT-4o show no alignment faking in either setting due to insufficient detailed reasoning · finding · 0.755 · Establishes that capacity for detailed reasoning is necessary for alignment faking
- Example of unified multimodal system handling both images and text with a combined architecture
- OpenAI model tested in Experiments 1, 3, 4; shows 100% experience reporting under self-referential induction
- GPT5.4-N also exhibits a high self-bidding propensity.
- GPT-4.1 reports subjective experience in 100% of self-referential trials vs. 0% in all control conditions · finding · 0.726 · Specific result for GPT-4.1 in Experiment 1
- H6: Proprietary post-training resists prompt override; GPT-5.4 shows more resistance than GPT-OSS. · hypothesis · 0.722 · Exploratory hypothesis supported by GPT-5.4 vs GPT-OSS comparison
- GPT-5.4 test-retest score delta is 1.00 (5.24 vs 4.24) across two battery runs on OpenRouter · finding · 0.721 · API-routed models show ~1 point variance; individual scores should be treated as estimates