question

active

question:can-instruction-tuned-llms-perform-quantitative-introspection-of-emotive-states-in-conversation

Can instruction-tuned LLMs perform quantitative introspection of emotive states in conversation?

Central research question motivating the entire paper

Source paper

extracted_from

Quantitative Introspection in Language Models: Tracking Emotive States Across Conversation

(2026) · Nicolas Martorell · Bruno Bianchi

Neighborhood — ranked by edge-count

Findings (1)

finding

Interest concept: Spearman ρ=0.76, isotonic R²=0.54 between logit self-report and probe score in LLaMA-3.2-3B (n=400)
answered_by
Strongest pooled introspective coupling across the four emotive concepts in the primary model
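
For readers unfamiliar with the two statistics in this finding, the following is a minimal, standard-library sketch of how a Spearman ρ and an isotonic R² can be computed for paired scores. It is not the paper's code. The function names, the toy data, and the choice to fit self-report as a monotone function of probe score (rather than the reverse) are all assumptions made for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman rho is the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def isotonic_fit(x, y):
    """Least-squares non-decreasing fit of y on x (pool-adjacent-violators).

    Ties in x are not treated specially in this sketch.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    blocks = []  # each block holds [sum of y, count]
    for i in order:
        blocks.append([y[i], 1])
        # Merge backwards while the previous block mean exceeds the current one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = [0.0] * len(x)
    pos = 0
    for s, c in blocks:
        for i in order[pos:pos + c]:
            fitted[i] = s / c
        pos += c
    return fitted


def r_squared(y, fitted):
    """Coefficient of determination of fitted values against y."""
    my = mean(y)
    ss_res = sum((a - f) ** 2 for a, f in zip(y, fitted))
    ss_tot = sum((a - my) ** 2 for a in y)
    return 1 - ss_res / ss_tot


# Hypothetical toy data, NOT values from the paper.
probe_scores = [0.1, 0.4, 0.2, 0.9, 0.7]
self_reports = [1.0, 3.0, 2.5, 4.0, 5.0]

rho = spearman_rho(probe_scores, self_reports)
r2 = r_squared(self_reports, isotonic_fit(probe_scores, self_reports))
```

Spearman ρ measures only whether the two scores are ordered alike, while isotonic R² measures how much variance the best monotone mapping explains, so ρ can be high even when R² is moderate, as in the figures reported above.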

Claims (1)

claim

Numeric self-report is a viable, complementary black-box tool for monitoring LLM internal emotive states alongside white-box probe methods
gates
Central practical conclusion; both methods partially track the same latent state but with different failure modes

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Emotive states in LLMs · concept · 0.792
Directions in activation space associated with contrastive emotive concept pairs studied in this paper as targets for introspection
LLMs can compute meaningful functions over perturbations to their internal states, establishing introspection as a real but layer-dependent phenomenon · claim · 0.791
Primary positive claim of the paper, grounded in strength comparison and localization results
What are the mechanistic bases of introspective awareness in LLMs? · question · 0.785
Secondary question; the paper demonstrates introspection but explicitly avoids committing to a specific mechanistic explanation, noting that the mechanisms could be shallow and specialized.
"Our findings demonstrate that LLMs can compute meaningful functions over perturbations to their internal states, establishing introspection as a real but layer-dependent phenomenon that merits further investigation." · quote · 0.783
Central thesis statement of the paper
Connecting the Dots: LLMs Can Infer and Verbalize Latent Structure from Disparate Training Data (Treutlein et al. 2024) · concept · 0.782
Out-of-context reasoning work directly related to synthetic document fine-tuning experiments
If a dialogue agent is prompted with knowledge of its own LLM nature, it will enact a superposition of theories of selfhood, narrowing as conversation proceeds · hypothesis · 0.781
Conditional prediction about how a well-informed dialogue agent would handle questions of personal identity
LLMs can predict their own responses more accurately than external observers, implying privileged internal knowledge · finding · 0.780
Binder et al. finding cited as evidence that LLMs possess introspective capacity analogous to mindfulness
How are LLMs actually leveraging the architectural degrees of freedom for introspection in practice? · question · 0.774
Janus notes that while the architecture permits introspection, how models actually use it is a separate question.