claim
active
claim:llms-linearly-represent-truth-relevant-information-beyond-the-plausibility-of-text-as-evidenced-by-probes-trained-on-likely-performing-poorly-on-anti-correlated-datasets

LLMs linearly represent truth-relevant information beyond the plausibility of text, as evidenced by probes trained on the likely dataset performing poorly on datasets where truth and text plausibility are anti-correlated

Establishes that the observed linear structure is not merely a representation of text probability
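
The logic of the claim can be illustrated with a minimal synthetic sketch. It does not reproduce any reported result. It assumes, purely for illustration, that activations contain two orthogonal directions, one for truth and one for text plausibility; the names truth_dir, plaus_dir, make_acts, and fit_mean_diff_probe are hypothetical, and a simple difference-of-means probe stands in for whatever probe the underlying work used. Under these assumptions, a probe fit on plausibility labels alone scores far below chance on truth labels when the two are anti-correlated, while a probe fit on truth labels does not.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, N = 16, 400

# Assumed for illustration: truth and plausibility occupy orthogonal directions.
truth_dir = np.zeros(DIM); truth_dir[0] = 1.0
plaus_dir = np.zeros(DIM); plaus_dir[1] = 1.0

def make_acts(truth, plaus, noise=0.3):
    """Synthetic activations: signed truth and plausibility features plus noise.
    A feature value of 0.5 means that feature is absent (contributes zero)."""
    signal = np.outer(2 * truth - 1, truth_dir) + np.outer(2 * plaus - 1, plaus_dir)
    return signal + noise * rng.standard_normal((len(plaus), DIM))

def fit_mean_diff_probe(X, y):
    """Linear probe whose direction is the difference of class means."""
    w = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
    b = -w @ X.mean(axis=0)
    return w, b

def accuracy(probe, X, y):
    w, b = probe
    return float(((X @ w + b > 0).astype(int) == y).mean())

# "likely"-style data: nonfactual text, so only plausibility varies.
plaus_l = rng.integers(0, 2, N)
X_likely = make_acts(np.full(N, 0.5), plaus_l)

# Anti-correlated data: true statements are implausible and vice versa.
truth_a = rng.integers(0, 2, N)
X_anti = make_acts(truth_a, 1 - truth_a)

# Reference data: truth and plausibility vary independently.
truth_i, plaus_i = rng.integers(0, 2, N), rng.integers(0, 2, N)
X_indep = make_acts(truth_i, plaus_i)

plaus_probe = fit_mean_diff_probe(X_likely, plaus_l)
truth_probe = fit_mean_diff_probe(X_indep, truth_i)

acc_plaus_on_likely = accuracy(plaus_probe, X_likely, plaus_l)
acc_plaus_on_anti = accuracy(plaus_probe, X_anti, truth_a)
acc_truth_on_anti = accuracy(truth_probe, X_anti, truth_a)

print("plausibility probe on likely-style data:", acc_plaus_on_likely)
print("plausibility probe on anti-correlated truth labels:", acc_plaus_on_anti)
print("truth probe on anti-correlated truth labels:", acc_truth_on_anti)
```

If the linear structure found by truth probes were only a readout of text probability, the plausibility probe and the truth probe would behave alike on the anti-correlated data; in this sketch they diverge because the two features are separate by construction.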

Neighborhood — ranked by edge-count

Findings (5)

finding

Questions (4)

question

Datasets (1)

dataset
  • Nonfactual text where the final token is either the most likely or the 100th most likely completion according to LLaMA-13B; used to distinguish truth from text probability

Methods (1)

method

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.