question
active
question:what-operation-introduces-the-difficulty-boundary-between-f3-and-f4
What operation introduces the difficulty boundary between F3 and F4?
Specific sub-question investigated in Appendix B.4 by creating intermediate task variants.
Source paper
extracted_from (2026) · Angelos Poulis · Mark Crovella · Evimaria Terzi
Neighborhood — ranked by edge-count
Claims (1)
claim
- Identified as the exact computational operation that breaks truth direction generalization.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Geometric evidence for convergence to stable truth directions only for simpler tasks.
- Core empirical finding about layer-dependent truth direction emergence across task types.
- Establishes generalizability of the core difficulty-boundary finding across model families.
- A controlled six-level hierarchy of factual tasks increasing in complexity from simple city-location recall to double-counting constraints.
- Summary of the decomposition of S.
- Task difficulty as the key variable distinguishing the two modes of CoT identified in the paper.
- The paper identifies task difficulty as a key moderator: easy MMLU questions show performative CoT, while hard GPQA-Diamond questions show genuine reasoning.
- Demonstrates that early-layer probes capture sentence polarity rather than truth.