claim
active
claim:the-ask-arith-prompt-shows-weaker-generalization-to-factual-tasks-compared-to-other-explicit-prompts-suggesting-a-specialized-arithmetic-prompt-does-not-create-a-unified-truth-direction-across-task-families
The ask-arith prompt generalizes to factual tasks more weakly than the other explicit prompts do, suggesting that a specialized arithmetic prompt does not create a unified truth direction across task families.
From the cross-task generalization heatmaps in Appendix B.3.3.
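To make concrete what a cross-task generalization heatmap measures, here is a minimal sketch on synthetic data. Everything in it is an assumption made for illustration: the task names, the 16-dimensional "activations", the 60-degree angle between the two truth directions, and the difference-of-means probe (the source does not state which probe the paper uses). A probe is fit on one task family and scored on every family; when the families' truth directions are only partly aligned, the off-diagonal cells come out weaker than the diagonal.

```python
import numpy as np

def make_task(direction, n, rng, sep=2.0):
    """Synthetic 'activations': unit Gaussian noise plus a label-dependent
    shift of +/- sep along `direction` (hypothetical stand-in for real data)."""
    y = rng.integers(0, 2, size=n)            # 1 = true statement, 0 = false
    signs = 2.0 * y - 1.0
    X = rng.normal(size=(n, direction.size)) + sep * signs[:, None] * direction
    return X, y

def fit_probe(X, y):
    """Difference-of-means probe: unit direction plus a midpoint threshold."""
    mu1, mu0 = X[y == 1].mean(axis=0), X[y == 0].mean(axis=0)
    d = mu1 - mu0
    d = d / np.linalg.norm(d)
    b = 0.5 * (mu1 + mu0) @ d
    return d, b

def accuracy(probe, X, y):
    """Fraction of examples whose projection falls on the correct side."""
    d, b = probe
    return float(np.mean((X @ d > b) == (y == 1)))

def cross_task_matrix(tasks):
    """acc[i, j] = accuracy of a probe trained on task i, tested on task j."""
    names = list(tasks)
    acc = np.zeros((len(names), len(names)))
    for i, src in enumerate(names):
        probe = fit_probe(*tasks[src]["train"])
        for j, dst in enumerate(names):
            acc[i, j] = accuracy(probe, *tasks[dst]["test"])
    return names, acc

rng = np.random.default_rng(0)
dim = 16

# Assumed geometry: the two families' truth directions are 60 degrees apart.
arith_dir = np.zeros(dim)
arith_dir[0] = 1.0
theta = np.deg2rad(60.0)
factual_dir = np.zeros(dim)
factual_dir[0], factual_dir[1] = np.cos(theta), np.sin(theta)

tasks = {
    "arith": {"train": make_task(arith_dir, 400, rng),
              "test": make_task(arith_dir, 400, rng)},
    "factual": {"train": make_task(factual_dir, 400, rng),
                "test": make_task(factual_dir, 400, rng)},
}

names, acc = cross_task_matrix(tasks)
print(names)
print(np.round(acc, 3))
```

Rows are the training task and columns the test task, so the claim above corresponds to a weak arith-to-factual cell relative to the same cell under other prompts. Comparing prompts would mean rebuilding this matrix once per prompt from real model activations, which this sketch does not attempt.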
Source paper
extracted_from (2026) · Angelos Poulis · Mark Crovella · Evimaria Terzi
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Finding that explicit correctness framing partially aligns truth directions across task families.
- Key improvement in cross-task generalization enabled by explicit instruction framing.
- Shows the passive vs. active divide is more important than the specific wording of instructions.
- Establishes task difficulty as a hard limit that instructions cannot overcome.
- Establishes F3-F5 as a hard generalization boundary that instructions cannot overcome.
- Specific question motivating the cross-template generalization experiment in Section 5.2.
- Shows that explicit instructions delay the emergence of truth directions in arithmetic tasks.
- Control experiment ruling out token-count as the cause of truth geometry shifts.