Strength Comparison Task

A novel task that asks which of two sentences received the stronger injection, using a matched-pairs design to control for positional bias.

Neighborhood — ranked by edge-count

paper

concept

global logit shift
associated_with
The methodological confound identified by this paper: injection biases the model toward answering 'YES' to any binary question, regardless of content.

method

matched-pairs design
uses
Experimental design in which the injection strengths are swapped between the two sentences across the two parts of each trial, so that positional preferences cancel.
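The cancellation behind a matched-pairs design can be illustrated with a toy calculation. The sketch below is not the paper's implementation: it assumes a hypothetical additive model in which the log-odds of choosing the first sentence equal a fixed positional bias plus a sensitivity term times the strength difference. The function names, the additive form, and all numeric values are assumptions made for illustration only.

```python
def first_sentence_logodds(strength_first, strength_second,
                           sensitivity, positional_bias):
    """Toy model (assumed): log-odds that the first sentence is chosen."""
    return positional_bias + sensitivity * (strength_first - strength_second)


def matched_pair_estimates(s_hi, s_lo, sensitivity, positional_bias):
    """Run both parts of one trial and separate strength from position.

    Part 1 puts the stronger injection on the first sentence;
    part 2 swaps the strengths between the two sentences.
    """
    part1 = first_sentence_logodds(s_hi, s_lo, sensitivity, positional_bias)
    part2 = first_sentence_logodds(s_lo, s_hi, sensitivity, positional_bias)
    # Half the difference keeps only the strength term: the bias cancels.
    strength_effect = (part1 - part2) / 2
    # Half the sum keeps only the positional term: the strength cancels.
    position_effect = (part1 + part2) / 2
    return strength_effect, position_effect


if __name__ == "__main__":
    # Hypothetical values: same strengths, with and without a positional bias.
    print(matched_pair_estimates(4.0, 1.0, sensitivity=0.5, positional_bias=2.0))
    print(matched_pair_estimates(4.0, 1.0, sensitivity=0.5, positional_bias=0.0))
```

Under this assumed model, the strength estimate is the same whether or not a positional bias is present, which is the property the swap is meant to provide. A real model's positional preference need not be additive, so the cancellation would only be approximate in practice.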

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Task Difficulty · concept · 0.792
The paper identifies task difficulty as a key moderator: easy MMLU questions show performative CoT, hard GPQA-Diamond questions show genuine reasoning.
Task balancing · concept · 0.787
The problem of ensuring all tasks in MTL perform well, avoiding dominance by some tasks.
Paired Comparison Method · method · 0.782
Experimental protocol asking observers to compare two systems A and B for degree of life; used to establish objectivity through inter-observer convergence.
Wholeness Comparison Test · method · 0.780
The more general, daily-use version of the mirror-of-self test: asking which of A or B induces greater feeling of wholeness in the observer.
Preference Strength · concept · 0.778
The problematic possibility of digital minds with superhumanly strong preferences requiring interpersonal utility comparison frameworks.
Conditioning strengths · concept · 0.774
Parameters controlling the influence of conditioning signals in the generative process.
Task weight · concept · 0.756
Coefficient weighting each task loss in the MTL objective.
Introspective strength · concept · 0.755
Spearman ρ measuring rank-order agreement between logit-based self-report and probe score; the paper's primary monotonic association metric.