GPT-OSS 120B

One of two large reasoning models analyzed in the paper for performative vs genuine CoT behavior

Neighborhood — ranked by edge-count

paper

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

GPT-OSS-120B achieves a skill-load rate of 0.446 on SkillsBench · finding · 0.784
Mid-tier model showing intermediate activation rate between weak and strong tiers
GPT-4.1 · concept · 0.779
OpenAI model tested in Experiments 1, 3, 4; shows 100% experience reporting under self-referential induction
GPT-2 · concept · 0.767
Early large language model cited as an example of transformer-based LLMs
GPT-OSS-120B achieves 5.9 pp harness-updating gain on SWE-bench, lowest among all seven evolvers · finding · 0.767
Part of full evolver-side matrix demonstrating flat but variable harness-updating across models
GPT-4 · concept · 0.765
Large language model underlying ChatGPT and Bing Chat; used for illustrative quotes in the paper
GPT-4 Turbo · concept · 0.758
OpenAI model tested; shows no alignment faking due to insufficient detailed reasoning
GPT-3 · concept · 0.757
Large language model cited as an example; also used in Andreas 2022 for preliminary evidence
GPT-4V · concept · 0.746
Example of unified multimodal system handling both images and text with a combined architecture
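
The selection rule behind this list (cosine ≥ 0.65, no typed edge) can be illustrated with a minimal sketch. The page does not say how the neighborhood is actually computed, so everything here is an assumption made for illustration: the function names `cosine` and `untyped_neighbors`, the representation of typed edges as a set of unordered pairs, and the toy two-dimensional embeddings, which are not real data.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def untyped_neighbors(target, embeddings, typed_edges, threshold=0.65):
    """Return (name, score) pairs with score >= threshold and no typed edge
    to `target`, sorted by descending similarity.

    `typed_edges` is a hypothetical set of frozenset({a, b}) pairs."""
    results = []
    for name, vec in embeddings.items():
        if name == target:
            continue
        if frozenset((target, name)) in typed_edges:
            continue  # already related by a typed edge, so not a candidate
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            results.append((name, score))
    return sorted(results, key=lambda pair: pair[1], reverse=True)

# Toy embeddings, invented for this sketch only.
embeddings = {
    "GPT-OSS 120B": [1.0, 0.0],
    "GPT-4.1": [0.8, 0.6],  # similarity 0.8, kept
    "GPT-2": [0.6, 0.8],    # similarity 0.6, below the threshold
    "paper": [1.0, 0.1],    # similar, but excluded by its typed edge
}
typed_edges = {frozenset(("GPT-OSS 120B", "paper"))}

candidates = untyped_neighbors("GPT-OSS 120B", embeddings, typed_edges)
```

Entities returned this way would then be reviewed by hand as candidates for new edges or as possible duplicates, as the description above the list suggests.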