concept

active

concept:task-difficulty-operationalized-as-the-number-of-discrete-operations-required-to-verify-correctness-of-the-input

Task difficulty operationalized as the number of discrete operations required to verify correctness of the input.

The paper's specific operationalization of task difficulty, which makes controlled experiments possible.
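
As a rough illustration of this operationalization, the sketch below counts the arithmetic operations a checker has to perform before it can accept or reject a claimed result. It is a minimal sketch under stated assumptions, not the paper's datasets or code: the `Statement` type, the `verify` function, left-to-right evaluation, and the choice to count only arithmetic steps (not the final comparison) are all hypothetical.

```python
import operator
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical operator table; the source does not say which operations are used.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


@dataclass
class Statement:
    """A claimed arithmetic fact, e.g. 3 + 4 * 2 = 14 (evaluated left to right)."""
    operands: List[int]    # n + 1 operands
    operators: List[str]   # n operators, applied left to right
    claimed: int           # the result whose correctness is being verified


def verify(stmt: Statement) -> Tuple[bool, int]:
    """Return (is_correct, n_operations).

    n_operations is the assumed difficulty measure: the number of
    discrete arithmetic steps needed before the claim can be checked.
    """
    value = stmt.operands[0]
    steps = 0
    for symbol, operand in zip(stmt.operators, stmt.operands[1:]):
        value = OPS[symbol](value, operand)
        steps += 1
    return value == stmt.claimed, steps


# Three illustrative statements needing 1, 2, and 3 operations to verify.
one_op = Statement([3, 4], ["+"], 7)
two_ops = Statement([3, 4, 2], ["+", "*"], 14)
three_ops = Statement([3, 4, 2, 5], ["+", "*", "-"], 10)  # deliberately wrong claim

for stmt in (one_op, two_ops, three_ops):
    print(verify(stmt))
```

Under these assumptions, difficulty is a property of the verification procedure rather than of the surface form of the input, which is what allows it to be varied in controlled steps.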

Neighborhood — ranked by edge-count

Frameworks (2)

framework

Factual task hierarchy (F0–F5)
implements
A controlled six-level hierarchy of factual tasks increasing in complexity from simple city-location recall to double-counting constraints.
Arithmetic task hierarchy (A1–A3)
implements
Three synthetic arithmetic datasets of increasing complexity requiring 1, 2, or 3 operations to verify correctness.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Task Difficulty · concept · 0.784
The paper identifies task difficulty as a key moderator: easy MMLU questions show performative CoT, hard GPQA-Diamond questions show genuine reasoning
Task difficulty moderates whether CoT is performative or genuine: easy recall questions show performative CoT, difficult multihop questions show genuine reasoning · claim · 0.748
Task difficulty as the key variable distinguishing the two modes of CoT identified in the paper
Task balancing is still an open problem in multi-task learning. · claim · 0.734
Motivation for the proposed method.
"It is certainly very hard, and perhaps impossible, for mere humans to anticipate and rule out in advance all the disastrous ways the machine could choose to achieve a specified objective." · quote · 0.730
Russell's statement opening Section 2 articulating the core motivation for the Contemplative AI approach
task generalization · concept · 0.729
The ability to generalize across tasks; lacking in latent methods.
For a given task, the number of all sequences which work is tiny by comparison with the huge number of all possible sequences; less than a trillionth of all 6 × 10^23 possible sequences actually work well enough. · claim · 0.729
A combinatorial argument that good sequences are astronomically rare, emphasizing the difficulty of discovery.
What operation introduces the difficulty boundary between F3 and F4? · question · 0.725
Specific sub-question investigated in Appendix B.4 by creating intermediate task variants.
Single-process, non-interruptible task switching at command boundaries is sufficient for responsive single-user systems; avoids complexity of multiprocess synchronization. · hypothesis · 0.724
Design hypothesis that coarse-grained task switching (at commands only) eliminates need for protection mechanisms while maintaining usability.