quote
active
quote:our-findings-demonstrate-that-llms-can-compute-meaningful-functions-over-perturbations-to-their-internal-states-establishing-introspection-as-a-real-but-layer-dependent-phenomenon-that-merits-further-investigation
"Our findings demonstrate that LLMs can compute meaningful functions over perturbations to their internal states, establishing introspection as a real but layer-dependent phenomenon that merits further investigation."
Central thesis statement of the paper
Source paper
extracted_from · (2025) · Ely Hahami · I. N. Sinha · Lavik Jain · Josh Kaplan +1
Neighborhood — ranked by edge-count
Claims (1)
claim
- Primary positive claim of the paper, grounded in strength comparison and localization results
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Primary research hypothesis driving the entire study; operationalized via three criteria.
- Core claim directly challenged by prior work denying introspection; forms foundation for Koan Battery introspection studies.
- Qualified positive claim from spatio permutation analysis where two cases satisfy all three criteria.
- The paper's claim that theoretical convergence across GWT, RPT, HOT, IIT makes the findings non-coincidental
- Core summary of Janus' position on autoregressive recurrence enabling introspection.
- Forward-looking claim suggesting the methodological framework is relevant for future AI systems beyond current LLMs.
- Binder et al. finding cited as evidence that LLMs possess introspective capacity analogous to mindfulness
- Can large language models introspect, that is, accurately detect perturbations to their own internal states? · question · 0.815 · Central research question of the paper
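
The "related by similarity" rule stated above (cosine ≥ 0.65, no typed edge) can be illustrated with a minimal sketch. It assumes each entity has an embedding vector and that typed relations are stored as pairs of entity ids. The function names, the data layout, and the toy vectors are hypothetical and are not taken from the system that produced this page.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_untyped(target_id, embeddings, typed_edges, threshold=0.65):
    """Return (entity_id, score) pairs with cosine >= threshold and no typed edge.

    embeddings  : dict mapping entity id -> list of floats (hypothetical layout)
    typed_edges : set of (id, id) pairs that already carry a typed relation
    """
    target = embeddings[target_id]
    hits = []
    for entity_id, vector in embeddings.items():
        if entity_id == target_id:
            continue
        # Skip entities that already have a typed relation in either direction.
        if (target_id, entity_id) in typed_edges or (entity_id, target_id) in typed_edges:
            continue
        score = cosine(target, vector)
        if score >= threshold:
            hits.append((entity_id, score))
    # Highest similarity first.
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy data: "a" is near-identical to "q" but already linked by a typed edge.
toy_embeddings = {"q": [1.0, 0.0], "a": [1.0, 0.0], "b": [0.0, 1.0],
                  "c": [1.0, 1.0], "d": [1.0, 0.1]}
toy_edges = {("q", "a")}
neighbors = similar_untyped("q", toy_embeddings, toy_edges)
```

In this toy case "a" is dropped because it already has a typed edge and "b" falls below the threshold, so only "d" and "c" remain as candidates for new edges or duplicate checks.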