finding
active
finding:qwen-2-5-14b-mean-kl-divergence-on-alpaca-prompts-after-truth-direction-ablation-is-0-038

Qwen-2.5-14B mean KL divergence on Alpaca prompts after truth-direction ablation is 0.038

Experiment 3 result: ablating the truth direction in Qwen-2.5-14B causes minimal behavioral drift on Alpaca prompts (mean KL divergence of 0.038).
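For readers unfamiliar with the metric, the following is a minimal sketch of how a mean KL divergence between a baseline model and an ablated model could be computed. This entry does not state the paper's exact procedure, so several details here are assumptions: KL is taken as KL(baseline || ablated) in nats, one next-token distribution is compared per prompt, and the logits are toy values rather than model outputs. The function names (`softmax`, `kl_divergence`, `mean_kl`) are hypothetical and exist only for illustration.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two distributions over the same vocabulary.

    eps guards against log(0); it is an illustrative choice, not a value
    taken from the source paper.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def mean_kl(baseline_logits, ablated_logits):
    """Average KL(baseline || ablated) over prompts.

    Each argument is a list with one logit vector per prompt
    (assumption: one next-token distribution is compared per prompt).
    """
    kls = [
        kl_divergence(softmax(b), softmax(a))
        for b, a in zip(baseline_logits, ablated_logits)
    ]
    return sum(kls) / len(kls)


# Toy logits for two prompts over a three-token vocabulary (made-up values).
baseline = [[2.0, 1.0, 0.1], [0.5, 0.5, 3.0]]
ablated = [[1.9, 1.1, 0.1], [0.5, 0.6, 2.9]]

print(mean_kl(baseline, ablated))
```

A value near zero means the ablated model's output distribution stays close to the baseline's on these prompts, which is how a small mean KL such as 0.038 is read as minimal drift.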

Source paper

extracted_from
From Directions to Cones: Exploring Multidimensional Representations of Propositional Facts in LLMs
(2025) · Kevin Shengyang Yu · Vaidehi Bulusu · Oscar Yasunaga · Clayton Lau +4
