framework
active
framework:factual-task-hierarchy-f0-f5
Factual task hierarchy (F0–F5)
A controlled six-level hierarchy of factual tasks increasing in complexity from simple city-location recall to double-counting constraints.
Neighborhood — ranked by edge-count
Papers (1)
paper
Concepts (1)
concept
- The paper's specific operationalization of task difficulty, which enables controlled experiments.
Datasets (6)
dataset
- Factual statements of the form 'The city of X is in Y', adopted from Marks & Tegmark (2024); 1,594 examples.
- Negated variants of F0, e.g., 'The city of Boston is not in Australia'; 1,594 examples.
- Conjunctions of two independent factual city statements; 1,594 examples.
- Counting constraints over a list of 2 cities from the same country; 2,000 examples.
- Extended counting task over N=5 cities; 2,000 examples.
- Double counting task over 6 cities in two countries; 2,000 examples.
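The statement forms quoted in the first three dataset descriptions can be sketched in a few lines. This is only an illustrative sketch, not the datasets' actual generation code: the function names, the `GOLD` lookup table, and the wording of the conjunction (two affirmative statements joined by "and") are assumptions. Only the affirmative and negated templates come from the descriptions above. The counting tasks are left out because their prompt format is not given here.

```python
# Toy gold facts, for illustration only (not drawn from the datasets).
GOLD = {"Boston": "United States", "Paris": "France"}


def affirmative(city, country):
    """Statement in the quoted form 'The city of X is in Y'."""
    return f"The city of {city} is in {country}"


def negated(city, country):
    """Negated variant, as in 'The city of Boston is not in Australia'."""
    return f"The city of {city} is not in {country}"


def conjunction(a, b):
    """Assumed wording: two affirmative statements joined by 'and'."""
    second = affirmative(*b)
    return f"{affirmative(*a)} and {second[0].lower()}{second[1:]}"


def is_true(city, country):
    """True when the toy table places the city in that country.

    Cities missing from GOLD are treated as false, a simplification.
    """
    return GOLD.get(city) == country


def negated_is_true(city, country):
    """A negated statement is true exactly when the affirmative one is false."""
    return not is_true(city, country)


def conjunction_is_true(a, b):
    """A conjunction is true only when both component statements are true."""
    return is_true(*a) and is_true(*b)
```

For example, `negated("Boston", "Australia")` reproduces the quoted negated statement, and `negated_is_true("Boston", "Australia")` labels it true under the toy table.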
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Three synthetic arithmetic datasets of increasing complexity requiring 1, 2, or 3 operations to verify correctness.
- Within-family factual generalization (F0-F2) is consistently strong across all models and prompt settings. (finding, similarity 0.772) Establishes a reliable baseline for factual truth direction universality within simple factual recall.
- Establishes F3-F5 as a hard generalization boundary that instructions cannot overcome.
- Contrasts with harder tasks that are sensitive to prompt variations.
- Core empirical finding about layer-dependent truth direction emergence across task types.
- Task where the input is a sequence w, x, y, z and the label is (w = x) = (y = z); used to test relational reasoning in developmental/cognitive psychology.
- Finding that explicit correctness framing partially aligns truth directions across task families.
- Establishes task difficulty as a hard limit that instructions cannot overcome.
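The labeling rule in the related entry on sequences w, x, y, z can be written out directly. In this minimal sketch the function name `relational_label` is hypothetical. The rule itself, (w = x) = (y = z), is taken from that entry: the label is true when the two pairs agree on whether their members match.

```python
def relational_label(w, x, y, z):
    """Return True when 'w equals x' and 'y equals z' have the same truth value."""
    return (w == x) == (y == z)


# Both pairs match, so the label is True.
both_same = relational_label("a", "a", "b", "b")

# Neither pair matches, so the label is also True.
both_different = relational_label("a", "b", "c", "d")

# One pair matches and the other does not, so the label is False.
mixed = relational_label("a", "a", "b", "c")
```

The label depends only on the relation between the two pairs, not on the identity of any single item, which is why the task is used to probe relational reasoning.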