finding
active
finding:under-ask-correct-probes-trained-on-arithmetic-tasks-a1-a3-generalize-almost-perfectly-to-factual-tasks-f0-f2-auroc-1-0-whereas-under-no-prompt-this-generalization-is-largely-absent

Under ask-correct, probes trained on arithmetic tasks A1-A3 generalize almost perfectly to factual tasks F0-F2 (AUROC ~1.0), whereas under no-prompt this generalization is largely absent.

Explicit instruction framing (the ask-correct prompt) is the key factor enabling this cross-task generalization.
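To make the measurement concrete, the following is a minimal sketch of a cross-task probe transfer test: fit a linear probe on one task's activations, score a different task's activations with it, and report AUROC. Everything in it is an assumption for illustration. The activations are synthetic Gaussian vectors rather than model hidden states, the difference-of-means probe is a stand-in and not necessarily the probe used in the source paper, and the names `make_task`, `fit_mean_diff_probe`, `transfer_auroc`, `arithmetic`, `factual_shared`, and `factual_other` are hypothetical. It reproduces no result from the paper.

```python
import random


def auroc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def make_task(n, direction, strength, rng):
    """Synthetic 'activations': unit Gaussian noise shifted along +/- direction by label."""
    X, y = [], []
    for i in range(n):
        label = i % 2  # alternate false (0) and true (1) statements
        sign = 1.0 if label == 1 else -1.0
        X.append([rng.gauss(0.0, 1.0) + sign * strength * d for d in direction])
        y.append(label)
    return X, y


def fit_mean_diff_probe(X, y):
    """Probe direction = mean of true-class vectors minus mean of false-class vectors."""
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = [x for x, t in zip(X, y) if t == 0]
    dim = len(X[0])
    return [
        sum(x[j] for x in pos) / len(pos) - sum(x[j] for x in neg) / len(neg)
        for j in range(dim)
    ]


def transfer_auroc(train, test):
    """Fit the probe on the train task, then score the test task with it."""
    w = fit_mean_diff_probe(*train)
    X_test, y_test = test
    scores = [sum(a * b for a, b in zip(x, w)) for x in X_test]
    return auroc(scores, y_test)


rng = random.Random(0)
dim = 8
e0 = [1.0] + [0.0] * (dim - 1)       # hypothetical truth direction of the train task
e1 = [0.0, 1.0] + [0.0] * (dim - 2)  # a different, orthogonal direction

arithmetic = make_task(200, e0, 3.0, rng)      # stand-in for the training task
factual_shared = make_task(200, e0, 3.0, rng)  # test task encoding truth along e0
factual_other = make_task(200, e1, 3.0, rng)   # test task encoding truth along e1

shared = transfer_auroc(arithmetic, factual_shared)
other = transfer_auroc(arithmetic, factual_other)
print(f"transfer AUROC, shared direction: {shared:.3f}")
print(f"transfer AUROC, different direction: {other:.3f}")
```

In this toy setup, transfer AUROC is high only when the two tasks encode truth along the same direction. That is one way to read a gap like the one between the ask-correct and no-prompt conditions, though the sketch itself says nothing about why a prompt would align the directions.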

Source paper

extracted_from
Testing the Limits of Truth Directions in LLMs
(2026) · Angelos Poulis · Mark Crovella · Evimaria Terzi
