question
active
question:do-models-produce-first-person-experiential-language-by-drawing-on-human-authored-introspective-examples-in-pretraining-data-without-internally-encoding-these-as-roleplay
Do models produce first-person experiential language by drawing on human-authored introspective examples in pretraining data without internally encoding these as roleplay?
An alternative explanation that requires distinguishing mimetic generation from genuine introspective access
Source paper
extracted_from · (2025) · Berg, Cameron · de Lucena, Diogo · Rosenblatt, Judd
Neighborhood — ranked by edge-count
Papers (1)
paper
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Alternative hypothesis for how experience reports arise without explicit performance
- Central open question raised by the paper.
- Abstract's main conclusion.
- Grounds the artificial psychology research direction: LLM personalities reflect the basins into which human selves tend to fall
- Central research question animating the paper: distinguishing genuine introspection from illusion through causal manipulation of activations.
- Modern language models possess at least a limited, functional form of introspective awareness · claim · 0.775 · The paper's central interpretive assertion.
- Are there examples of models recognizing their introspective capability and then suppressing it? · question · 0.774 · Cube Flipper's question prompted by the idea that supernormal capabilities might be hidden.
- RLHF paper cited as a major fine-tuning technique used in commercial dialogue agents