concept
active
concept:cross-lingual-truth-representation
Cross-Lingual Truth Representation
Observation that truth-direction interventions elicit non-English Yes/No equivalents, suggesting language-independent truth encoding
Neighborhood — ranked by edge-count
Papers (1)
paper
Findings (1)
finding
- Suggestive evidence for language-independent truth representation in LLMs
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one. These are candidates for new edges or unrecognized duplicates.
- The underlying truth representation may generalize across lexical choices and languages · hypothesis · 0.791 · Suggested by non-English Yes/No outputs post-intervention, requiring further investigation
- Acknowledged limitation: simple uncontroversial statements cannot distinguish truth from related epistemic features
- The proposed domain-general property indexed by deception features that governs both factual accuracy and experiential self-report
- Do LLMs have a unified representation of truth that spans structurally and topically diverse data? · question · 0.732 · Central research question driving dataset design and experimental approach
- Quantitative bound on observed alignment; raises the open question of whether this gap reflects noise or real misalignment
- NLAs revealed unverbalized language processing in Opus 4.6 that led to discovery of malformed SFT training data.
- Implication of PRH for language model visual grounding
- Theoretical open question about the geometry of truth in LLMs raised in Discussion