concept
active
concept:gpt-4-turbo
GPT-4 Turbo
OpenAI model tested; shows no alignment faking due to insufficient detailed reasoning
Neighborhood — ranked by edge-count
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Example of unified multimodal system handling both images and text with a combined architecture
- Large language model cited as an example; also used in Andreas 2022 for preliminary evidence
- GPT-4 was used to generate unique variations of cheap/expensive items and room names for the test dataset
- Early large language model cited as an example of transformer-based LLMs
- GPT-4 Turbo and GPT-4o show no alignment faking in either setting due to insufficient detailed reasoning · finding · 0.790 · Establishes that capacity for detailed reasoning is necessary for alignment faking
- One of two large reasoning models analyzed in the paper for performative vs genuine CoT behavior
- A family of large language models trained on next-token prediction, central example of simulators.
- Frontier LLM used at temperature 0 to score SJT responses on a 1-5 Likert scale, conditioned on the construct definition and the SJT stem