method
active
method:strength-comparison-task

Strength Comparison Task

A novel task that asks which of two sentences received the stronger injection, using a matched-pairs design to control for positional bias.

Neighborhood — ranked by edge-count

Concepts (1)

concept
  • global logit shift
    associated_with
    The methodological confound identified by this paper: injection biases the model toward 'YES' on any binary question, regardless of content

Methods (1)

method
  • Experimental design where injection strengths are swapped between sentences in two parts of each trial to cancel positional preferences
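
The swap described above can be illustrated with a minimal Python sketch. All names here (`TrialPart`, `make_matched_pair`, `score_trial`) are hypothetical, and the scoring rule (a trial counts only if the stronger sentence is chosen in both parts) is an assumption made for illustration, not something the source specifies.

```python
from dataclasses import dataclass


@dataclass
class TrialPart:
    # Hypothetical container: injection strength per sentence position.
    strength_first: float   # strength applied to the sentence in position 1
    strength_second: float  # strength applied to the sentence in position 2


def make_matched_pair(strong: float, weak: float) -> tuple[TrialPart, TrialPart]:
    """Build both parts of one trial, swapping the strengths across positions."""
    return TrialPart(strong, weak), TrialPart(weak, strong)


def score_trial(parts, answers) -> bool:
    """Return True only if the stronger sentence is picked in both parts.

    `answers` holds the chosen position (1 or 2) for each part.
    Assumed scoring rule, used here only to show how the swap
    cancels a pure positional preference.
    """
    results = []
    for part, answer in zip(parts, answers):
        target = 1 if part.strength_first > part.strength_second else 2
        results.append(answer == target)
    return all(results)


pair = make_matched_pair(strong=4.0, weak=1.0)

# A responder that always picks position 1 is right in one part and wrong
# in the other, so it cannot pass the trial on positional bias alone.
always_first = score_trial(pair, answers=(1, 1))

# A responder that tracks injection strength passes both parts.
tracks_strength = score_trial(pair, answers=(1, 2))
```

Under this assumed rule, a fixed preference for either position fails every trial, so above-chance accuracy has to come from sensitivity to the injection strength itself.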

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

  • Task Difficulty · concept · 0.792
    The paper identifies task difficulty as a key moderator: easy MMLU questions show performative CoT, hard GPQA-Diamond questions show genuine reasoning
  • Task balancing · concept · 0.787
    The problem of ensuring all tasks in MTL perform well, avoiding dominance by some tasks.
  • Experimental protocol asking observers to compare two systems A and B for degree of life; used to establish objectivity through inter-observer convergence
  • The more general, daily-use version of the mirror-of-self test: asking which of A or B induces greater feeling of wholeness in the observer
  • The problematic possibility of digital minds with superhumanly strong preferences requiring interpersonal utility comparison frameworks
  • Parameters controlling the influence of conditioning signals in the generative process.
  • Task weight · concept · 0.756
    Coefficient weighting each task loss in the MTL objective.
  • Spearman ρ measuring rank-order agreement between logit-based self-report and probe score; the paper's primary monotonic association metric