claim
active
claim:the-earlier-a-base-model-less-exposure-to-lm-related-data-the-more-it-is-surprised-by-its-own-spontaneous-self-referential-capabilities
The earlier a base model (less exposure to LM-related data), the more it is surprised by its own spontaneous self-referential capabilities.
The claim holds that this capability emerges from architecture rather than from training data, and that later models lose the surprise.
Source paper
extracted_from
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- The paper's claim that theoretical convergence across GWT, RPT, HOT, IIT makes the findings non-coincidental
- The core interpretive question the paper narrows but cannot definitively answer
- Scaling effect observed consistently across Experiments 1 and 4
- Antra's foundational claim about how introspection arises computationally rather than from memorised text.
- The paper's reformulation of the core open question after establishing systematic self-reports
- The primary empirical question the paper addresses
- Binder et al. finding cited as evidence that LLMs possess introspective capacity analogous to mindfulness
- Prior finding showing scale-dependent self-awareness, consistent with the scale effect observed in the paper's Experiment 1