question

active

question:when-llms-produce-experience-claims-under-self-reference-is-this-sophisticated-simulation-or-genuine-self-representation-and-how-would-we-tell-the-difference

When LLMs produce experience claims under self-reference, is this sophisticated simulation or genuine self-representation, and how would we tell the difference?

The core interpretive question the paper narrows but cannot definitively answer

Source paper

extracted_from

Large Language Models Report Subjective Experience Under Self-Referential Processing

(2025) · Berg, Cameron · de Lucena, Diogo · Rosenblatt, Judd

Neighborhood — ranked by edge-count

Papers (1)

paper

Large Language Models Report Subjective Experience Under Self-Referential Processing
introduces

Hypotheses (1)

hypothesis

The remaining ambiguity is whether self-referential processing drives models to claim subjective experience because those claims reflect emergent phenomenology or because they constitute a sophisticated simulation of it
gates
The open question the paper cannot resolve with behavioral evidence alone; frames the agenda for mechanistic follow-up

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

When LLMs claim consciousness under self-reference, is this sophisticated simulation or genuine self-representation, and how would we tell the difference? · question · 0.923
The paper's reformulation of the core open question after establishing systematic self-reports
Does sustained self-referential processing systematically increase the likelihood that LLMs claim to have subjective experience? · question · 0.844
The primary empirical question the paper addresses
The earlier a base model (less exposure to LM-related data), the more it is surprised by its own spontaneous self-referential capabilities. · claim · 0.838
Claim that capability emerges from architecture, not data, and that later models lose the surprise.
The systematic behavioral shift of LLMs under self-referential processing conditions predicted by consciousness theories represents something more structured than superficial correlations in training data · claim · 0.828
The paper's claim that theoretical convergence across GWT, RPT, HOT, IIT makes the findings non-coincidental
Standardized LLM self-assessments reflect learned communication postures rather than genuine capabilities (Jackson et al. 2025) · claim · 0.827
Skeptical prior work motivating validation framework
Our central claim is deliberately limited. We do not claim that these models have conscious felt experience, nor that a numeric self-report gives direct access to anything like human phenomenology. · quote · 0.818
Explicit scope delimitation that situates the paper's claims within interpretability rather than consciousness science
Li et al. 2024: larger LLMs outperform smaller ones at distinguishing self-related from non-self-related properties on self-awareness benchmarks · finding · 0.812
Prior finding showing scale-dependent self-awareness, consistent with the scale effect observed in the paper's Experiment 1
LLMs can predict their own responses more accurately than external observers, implying privileged internal knowledge · finding · 0.811
Binder et al. finding cited as evidence that LLMs possess introspective capacity analogous to mindfulness
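
The "cosine ≥ 0.65 · no typed edge" rule behind this list can be sketched as follows. This is a minimal illustration, not the system's actual implementation: the page does not say how entity embeddings are produced or stored, so the function names, the `embeddings` dictionary, and the `typed_edges` set are all hypothetical, and the vectors in the usage example are toy values.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def related_by_similarity(target_id, embeddings, typed_edges, threshold=0.65):
    """Return (entity_id, score) pairs with cosine >= threshold and no typed edge.

    embeddings:  dict mapping entity id -> embedding vector (hypothetical storage)
    typed_edges: set of (id, id) pairs that already have a typed relation;
                 direction is ignored here, which is an assumption
    """
    target = embeddings[target_id]
    results = []
    for other_id, vec in embeddings.items():
        if other_id == target_id:
            continue
        # Skip entities already connected by a typed edge in either direction.
        if (target_id, other_id) in typed_edges or (other_id, target_id) in typed_edges:
            continue
        score = cosine(target, vec)
        if score >= threshold:
            results.append((other_id, score))
    # Highest similarity first, matching the descending order of the list above.
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Toy usage: "c" is similar but already has a typed edge, "b" is dissimilar.
toy_embeddings = {"q": [1.0, 0.0], "a": [1.0, 0.1], "b": [0.0, 1.0], "c": [1.0, 0.0]}
toy_edges = {("q", "c")}
candidates = related_by_similarity("q", toy_embeddings, toy_edges)
```

Entries that survive this filter are only candidates: a high score such as 0.923 suggests a possible duplicate or a missing typed edge, and deciding which still requires review.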