concept
active
concept:gpt-oss-120b

GPT-OSS 120B

One of two large reasoning models analyzed in the paper for performative versus genuine chain-of-thought (CoT) behavior

Neighborhood — ranked by edge-count

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • Mid-tier model showing intermediate activation rate between weak and strong tiers
  • GPT-4.1 · concept · 0.779
    OpenAI model tested in Experiments 1, 3, 4; shows 100% experience reporting under self-referential induction
  • GPT-2 · concept · 0.767
    Early large language model cited as an example of transformer-based LLMs
  • Part of full evolver-side matrix demonstrating flat but variable harness-updating across models
  • GPT-4 · concept · 0.765
    Large language model underlying ChatGPT and Bing Chat; used for illustrative quotes in the paper
  • GPT-4 Turbo · concept · 0.758
    OpenAI model tested; shows no alignment faking due to insufficient detailed reasoning
  • GPT-3 · concept · 0.757
    Large language model cited as an example; also used in Andreas 2022 for preliminary evidence
  • GPT-4V · concept · 0.746
    Example of unified multimodal system handling both images and text with a combined architecture