quote
active
Given a true propositional input (e.g., 'Paris is the capital of France'), ablating along any basis vector of this cone disrupts the model's ability to generate a truthful response.
Load-bearing illustration of what a concept cone for truth means operationally
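The quote above does not say how the ablation is carried out. A minimal sketch, assuming "ablating along a basis vector" means projecting that direction out of a hidden activation (a common form of directional ablation), could look like the following. The function names, the toy activation, and the two-vector `cone_basis` are hypothetical and chosen only for illustration; they are not taken from the source paper.

```python
import math


def ablate_direction(activation, direction):
    """Remove the component of `activation` that lies along `direction`.

    Assumption: ablation is the projection x' = x - (x . r_hat) * r_hat,
    where r_hat is `direction` normalized to unit length.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0:
        raise ValueError("direction must be non-zero")
    unit = [d / norm for d in direction]
    coeff = sum(a * u for a, u in zip(activation, unit))
    return [a - coeff * u for a, u in zip(activation, unit)]


def ablate_each_basis_vector(activation, cone_basis):
    """Ablate along each basis vector separately, returning one result per vector."""
    return [ablate_direction(activation, b) for b in cone_basis]


# Toy 3-dimensional example; the numbers are made up for illustration only.
activation = [1.0, 2.0, 3.0]
cone_basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
ablated = ablate_each_basis_vector(activation, cone_basis)
```

After ablation, each result has no remaining component along the direction that was removed. Under the claim in the quote, applying any one of these ablations to the model's activations would be expected to degrade truthful responses; the sketch only shows the vector operation, not the model intervention itself.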
Source paper
extracted_from · (2025) · Kevin Shengyang Yu · Vaidehi Bulusu · Oscar Yasunaga · Clayton Lau +4
Neighborhood — ranked by edge-count
Papers (1)
paper
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Future direction hypothesis for giving semantic meaning to individual axes
- Motivating hypothesis for Section 5's investigation of prompt template effects
- Load-bearing interpretive claim about the layer-specificity of Burger et al.'s finding
- Suggestive evidence for language-independent truth representation in LLMs
- Future work direction identified in conclusion for enabling reliable truth assessment methods
- Explanation of how knowledge (not just parameters) is shared between agents; links to pre-Cartesian consciousness
- Quantitative relationship between concept frequency and feature presence
- Interpretive synthesis of DIM and cone intervention successes