concept
active
concept:input-truth
Input-truth
Correctness of input statements to an LLM, as opposed to output-truth (correctness of model-generated outputs).
Neighborhood — ranked by edge-count
Papers (1)
paper
Concepts (1)
concept
- Output-truth (associated_with): The correctness of a model's generated outputs, distinct from the correctness of statements provided as input.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Assumption that DNN layers preserve input information by being injective; key condition for Theorem 1
- A hypothesized direction in LLM activation space that encodes the truth or falsehood of factual statements
- Input from environment that the agent models and predicts.
- Qualitative transition in generative model structure from Bayesian model reduction; emergence of understanding
- The paper's operationalization of truthfulness as simple, unambiguous propositional statements that can be labeled true or false
- The mechanism by which each step's effect is evaluated against the life of the whole, guiding the unfolding.
- A set of evaluation criteria for AI assistants.
- Suggestive evidence for language-independent truth representation in LLMs
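
The selection rule for this list (cosine similarity of at least 0.65 and no typed edge to the current entity) can be illustrated with a minimal sketch. This is not the page's actual implementation: the function names, the entity identifiers other than input-truth and output-truth, and the two-dimensional toy embeddings are all hypothetical, chosen only to show the filter.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_without_edge(target, embeddings, typed_edges, threshold=0.65):
    """Return (entity, score) pairs with cosine >= threshold and no typed edge
    to `target`, sorted by score, highest first."""
    # Entities already connected to the target by any typed edge are excluded.
    linked = {b if a == target else a for a, b in typed_edges if target in (a, b)}
    results = []
    for name, vec in embeddings.items():
        if name == target or name in linked:
            continue
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            results.append((name, score))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Toy data (hypothetical): real embeddings would have many more dimensions.
embeddings = {
    "input-truth": [1.0, 0.0],
    "output-truth": [0.9, 0.1],         # similar, but already has a typed edge
    "truth-direction": [0.8, 0.6],      # similar, no typed edge
    "evaluation-criteria": [0.0, 1.0],  # below the threshold
}
typed_edges = {("input-truth", "output-truth")}  # the associated_with relation

candidates = similar_without_edge("input-truth", embeddings, typed_edges)
```

In this toy setup only truth-direction is returned: output-truth is excluded because it already has a typed edge, and evaluation-criteria falls below the threshold.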