claim
active
claim:models-are-not-merely-tracking-dialogue-context-features-same-concept-steering-shows-privileged-internal-access-is-necessary-to-explain-self-report-shifts

Models are not merely tracking dialogue context features; same-concept steering shows privileged internal access is necessary to explain self-report shifts

Addresses the skeptical alternative that self-reports reflect only conversational content.

Source paper

extracted_from
Quantitative Introspection in Language Models: Tracking Emotive States Across Conversation
(2026) · Nicolas Martorell · Bruno Bianchi
