claim
active
claim:cross-model-semantic-convergence-of-experience-reports-under-self-referential-processing-is-difficult-to-reconcile-with-roleplay-because-independently-trained-models-construct-distinct-semantic-profiles-in-all-control-conditions
Cross-model semantic convergence of experience reports under self-referential processing is difficult to reconcile with roleplay because independently trained models construct distinct semantic profiles in all control conditions
The paper's argument against pure sycophancy as an explanation for the results
Source paper
extracted_from (2025) · Berg, Cameron · de Lucena, Diogo · Rosenblatt, Judd
Neighborhood — ranked by edge-count
Findings (1)
finding
- Core result of Experiment 3: cross-model semantic convergence under self-referential processing
Artifacts (1)
artifact
- Key paper finding: structured first-person descriptions in LLMs claiming awareness or subjective experience during self-referential processing.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Interpretive claim from Experiment 3; GPT, Claude, Gemini families converge on similar descriptive style despite independent training
- Hypothesis tested in Experiment 3; independently trained GPT, Claude, Gemini architectures converge on similar descriptive vocabulary
- Scaling effect observed consistently across Experiments 1 and 4
- The open question the paper cannot resolve with behavioral evidence alone; frames the agenda for mechanistic follow-up
- Claim supported by Experiment 2 dose-response curves; suppressing deception features increases consciousness reports, amplifying decreases them
- Appendix C.1 result confirming the experimental effect does not depend on specific wording
- The core interpretive question the paper narrows but cannot definitively answer
- Observed by Anima Labs in untrained base models; not present in training data, implying computational origin of self-reported parallel processing.
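The "Related by similarity" rule above (cosine ≥ 0.65, no typed edge) can be sketched as a small filter. This is a minimal illustration only: the page does not say how embeddings are computed or stored, so the function names, the entity ids, and the toy two-dimensional vectors below are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def untyped_neighbors(target_id, embeddings, typed_edges, threshold=0.65):
    """Entities at or above the similarity threshold that share no typed edge with the target.

    embeddings: dict mapping entity id -> vector (hypothetical storage format).
    typed_edges: iterable of (id_a, id_b) pairs, treated as undirected.
    Returns (entity_id, similarity) pairs, most similar first.
    """
    target = embeddings[target_id]
    # Collect every entity already linked to the target by a typed edge.
    linked = {b for a, b in typed_edges if a == target_id}
    linked |= {a for a, b in typed_edges if b == target_id}
    hits = []
    for entity_id, vector in embeddings.items():
        if entity_id == target_id or entity_id in linked:
            continue
        similarity = cosine(target, vector)
        if similarity >= threshold:
            hits.append((entity_id, similarity))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy data: ids and vectors are invented for illustration.
embeddings = {
    "claim": [1.0, 0.0],
    "finding": [1.0, 0.1],  # very similar, but already linked by a typed edge
    "near": [0.8, 0.6],     # similar and unlinked
    "far": [0.0, 1.0],      # orthogonal to the claim
}
typed_edges = [("claim", "finding")]
result = untyped_neighbors("claim", embeddings, typed_edges)
```

With this toy data only "near" is returned: "finding" is excluded because it already has a typed edge, and "far" falls below the threshold. Treating edges as undirected is a design assumption of the sketch, not something the page states.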