claim
active
claim:pure-factual-recall-tasks-f0-f2-show-robust-auroc-performance-across-all-instruction-template-variations
Pure factual-recall tasks F0-F2 show robust AUROC performance across all instruction template variations.
Contrasts with harder tasks that are sensitive to prompt variations.
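As a minimal sketch of what "robust AUROC across template variations" could mean operationally, the snippet below computes AUROC for probe scores under each instruction template and reports the spread between the best and worst template. The helper names (`auroc`, `auroc_by_template`, `auroc_spread`), the template names, and the toy scores are all hypothetical illustrations, not data or code from the source paper.

```python
def auroc(scores, labels):
    """AUROC as the probability that a random positive outscores a random negative.

    Ties count as half a win. labels use 1 for true statements, 0 for false ones.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auroc_by_template(scores_by_template, labels):
    """Compute one AUROC per instruction template (same statements, same labels)."""
    return {name: auroc(scores, labels) for name, scores in scores_by_template.items()}


def auroc_spread(per_template):
    """Gap between the best and worst template; a small gap suggests robustness."""
    values = list(per_template.values())
    return max(values) - min(values)


# Toy, made-up probe scores for six statements under two hypothetical templates.
labels = [1, 1, 1, 0, 0, 0]
toy_scores = {
    "template_a": [0.9, 0.8, 0.7, 0.3, 0.2, 0.1],
    "template_b": [0.9, 0.6, 0.4, 0.5, 0.2, 0.1],
}
per_template = auroc_by_template(toy_scores, labels)
print(per_template, auroc_spread(per_template))
```

Under this reading, a task would count as robust when the spread stays small across every template tried; the threshold for "small" is a judgment call that the claim itself does not specify.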
Source paper
extracted_from · (2026) · Angelos Poulis · Mark Crovella · Evimaria Terzi
Neighborhood — ranked by edge-count
Claims (1)
claim
- Truth directions fail to generalize to harder tasks (F3-F5) regardless of prompt template because activations remain highly entangled. · associated_with · Establishes task difficulty as a hard limit that instructions cannot overcome.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Establishes F3-F5 as a hard generalization boundary that instructions cannot overcome.
- Core empirical finding about layer-dependent truth direction emergence across task types.
- Key improvement in cross-task generalization enabled by explicit instruction framing.
- Within-family factual generalization (F0-F2) is consistently strong across all models and prompt settings. · finding · 0.780 · Establishes a reliable baseline for factual truth direction universality within simple factual recall.
- Demonstrates that early-layer probes capture sentence polarity rather than truth.
- Shows instruction effects extend to harder factual tasks.
- Generalization evidence that truth probes are not invariant to model instructions.
- From the cross-task generalization heatmaps in Appendix B.3.3.
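
The "related by similarity" listing above (cosine ≥ 0.65, no typed edge) can be sketched as a simple filter over entity embeddings. This is only an illustration under assumptions: the function names, the entity ids, the two-dimensional toy vectors, and the representation of typed edges as unordered pairs are all hypothetical, and the actual system may compute its neighborhood differently.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_untyped(target, embeddings, typed_edges, threshold=0.65):
    """Entities at or above the cosine threshold that share no typed edge with target.

    typed_edges is a set of frozenset({id_a, id_b}) pairs (direction ignored here).
    Results are sorted from most to least similar.
    """
    hits = []
    for name, vec in embeddings.items():
        if name == target or frozenset((target, name)) in typed_edges:
            continue
        sim = cosine(embeddings[target], vec)
        if sim >= threshold:
            hits.append((name, sim))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy, made-up embeddings; "c" is similar but already linked by a typed edge.
embeddings = {
    "this_claim": [1.0, 0.0],
    "a": [1.0, 0.5],
    "b": [0.0, 1.0],
    "c": [1.0, 0.1],
}
typed_edges = {frozenset(("this_claim", "c"))}
result = similar_untyped("this_claim", embeddings, typed_edges)
print(result)
```

Excluding already-typed pairs is what makes the output useful as a review queue: every hit is either a missing edge or a possible duplicate.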