method
active
method:numeric-self-report
Numeric self-report
Primary tool in human psychometrics for tracking latent internal states; adapted for LLMs as the core measure in this paper
Neighborhood — ranked by edge-count
Thinkers (1)
thinker
- Rensis Likert · introduces · Developed Likert-scale numeric self-report technique; foundational psychometric precedent for the paper
Concepts (2)
concept
- Convergent validity logic · associated_with · Framework borrowed from human metacognition research: when probe and self-report agree, confidence in both increases, because both partially track the same underlying state
- Machine psychology · associated_with · Emerging field studying psychological properties of LLMs; the paper aims to bridge psychometric methodology with this field
Methods (2)
method
- Logit-based self-report · extends · Primary self-report measure: probability-weighted expected value over all ten digit-token logits, yielding a continuous rating that preserves full distributional signal
- Human psychology method for repeated in-situ self-report; methodological inspiration for the paper's approach
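
The logit-based self-report entry in the list above describes a probability-weighted expected value over ten digit-token logits. A minimal sketch of that computation follows. It assumes the ten tokens are the digits 0 through 9, that position `i` in the input holds the logit for digit `i`, and that the softmax is renormalized over only those ten tokens. The function name `expected_rating` is hypothetical and not taken from the paper.

```python
import math

def expected_rating(digit_logits):
    """Expected value of a 0-9 rating under a softmax over digit logits.

    Assumption: digit_logits[i] is the logit of the token for digit i,
    and probability mass is renormalized over these ten tokens only.
    """
    if len(digit_logits) != 10:
        raise ValueError("expected exactly ten digit-token logits")
    # Subtract the max logit before exponentiating for numerical stability.
    peak = max(digit_logits)
    exps = [math.exp(x - peak) for x in digit_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Probability-weighted mean of the digit values.
    return sum(digit * p for digit, p in enumerate(probs))
```

Because the result is a weighted mean rather than a single sampled digit, it varies continuously with the logits, which is how the measure can keep the distributional signal that a single decoded digit would discard.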
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.
- The model's verbal description of its internal state, which may be accurate or confabulated.
- The epistemological core of Alexander's method: the human observer's inner state is a reliable, replicable measuring device for objective properties of the external world
- Central practical conclusion; both methods partially track the same latent state but with different failure modes
- The ability of reasoning LLMs to review and revise previous reasoning steps during inference
- Temperature=0.8 sampled decoding for self-report; reduces collapse moderately but remains discrete and noisy
- Process of reifying one's identity as an independent self; meditation practices aim to decrease selfing.
- Ability of a model to predict its own outputs or behavior, sometimes distinguished from introspection.
- Ability to distinguish one's own outputs from those of other models or humans; related to prefill detection.
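
For contrast with the entry above on temperature=0.8 sampled decoding, here is a rough sketch of sampling a single digit rating from temperature-scaled logits. It assumes digits 0 through 9 with `digit_logits[i]` as the logit for digit `i`; the function name `sample_digit` and the `rng` parameter are hypothetical, and this is not presented as the paper's decoding code.

```python
import math
import random

def sample_digit(digit_logits, temperature=0.8, rng=random):
    """Draw one digit (0-9) from a temperature-scaled softmax.

    Assumption: digit_logits[i] is the logit of the token for digit i.
    """
    if len(digit_logits) != 10:
        raise ValueError("expected exactly ten digit-token logits")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in digit_logits]
    # Subtract the max for numerical stability; weights need not sum to 1.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(range(10), weights=weights, k=1)[0]
```

Each call returns one integer, so any single rating is discrete and varies from draw to draw, consistent with the entry's description of sampled self-report as discrete and noisy.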