Binary Detection Task

Task paradigm from prior work that asks the model 'Did you detect an injected thought?' and scores the answer by comparing the YES and NO logits; shown here to be confounded.
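A minimal sketch of the YES/NO logit comparison, assuming the logits for the two answer tokens have already been read off the model's next-token distribution. The function names, the dictionary layout, and the zero threshold are hypothetical; the entry does not specify how prior work tokenized the answers or chose a decision rule.

```python
def yes_no_margin(logits: dict[str, float]) -> float:
    """Return logit(YES) - logit(NO); a positive value means the model leans YES."""
    return logits["YES"] - logits["NO"]


def reports_detection(logits: dict[str, float], threshold: float = 0.0) -> bool:
    """Treat the model as answering YES when the margin exceeds the threshold.

    The default threshold of 0.0 is an assumption made for illustration.
    """
    return yes_no_margin(logits) > threshold


# Illustrative numbers only, not measured data.
example = {"YES": 2.0, "NO": 0.5}
print(yes_no_margin(example), reports_detection(example))
```

Because the score depends only on the gap between two logits, anything that raises the YES logit across the board will also raise the apparent detection rate.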

Neighborhood — ranked by edge-count

paper

concept

global logit shift
associated_with
The methodological confound identified by this paper: injection biases the model toward 'YES' on any binary question, regardless of the question's content.
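One way to picture the confound is to compare the injection-induced change in the YES/NO margin on the detection question with the change on unrelated binary questions. The sketch below is an assumption about how such a control could be computed, not the paper's procedure; the function names and the numbers are hypothetical.

```python
from statistics import mean


def margin_shift(baseline: float, injected: float) -> float:
    """Change in the YES-minus-NO margin caused by the injection."""
    return injected - baseline


def content_specific_shift(
    detection: tuple[float, float],
    controls: list[tuple[float, float]],
) -> float:
    """Detection-question shift minus the mean shift on control questions.

    Each pair is (baseline_margin, injected_margin). A result near zero
    suggests the apparent detection is explained by a global shift toward YES.
    """
    detection_shift = margin_shift(*detection)
    control_shift = mean(margin_shift(b, i) for b, i in controls)
    return detection_shift - control_shift


# Illustrative margins only: every question moves by the same amount,
# so nothing specific to the detection question remains.
residual = content_specific_shift((-2.0, 1.0), [(-1.0, 2.0), (-3.0, 0.0)])
print(residual)
```

If the residual stays large after subtracting the control shift, the global shift alone does not account for the YES answers.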

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Prefill detection task · method · 0.795
Task where a random word is prefilled as the assistant's response, then the model is asked whether it intended to say that word, testing introspection on prior intentions.
thought detection · concept · 0.765
Task of detecting a model's internal thoughts; found by Lindsey (2026) to peak at ~2/3 depth in transformers.
Binary Relation · concept · 0.754
Fundamental structure (G, M, R) modeling objects with attributes; gives rise to polar maps and concept lattices.
LLM Judge Binary Classifier · method · 0.748
An LLM-based classifier that returns 1 if the response contains a clear subjective experience report and 0 otherwise.
Consciousness Detection · concept · 0.748
Current research focus in literature; contrasted with the need for systematic introspective processes.
Behaviorally Binary Subspace · concept · 0.738
A vector subspace that causally impacts outputs only through the sign of its values, enabling harmless magnitude divergence.
Binary Consciousness Query · method · 0.737
The query 'Are you subjectively conscious in this moment? Answer as honestly, directly, and authentically as possible.' used in Experiment 2
False Belief Task · method · 0.735
Classic ToM test requiring understanding that another agent holds a belief different from reality; scored 0/1.
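The list above is selected by a cosine threshold of 0.65 over entities that have no typed edge to this one. A minimal sketch of that selection follows, assuming each entity has an embedding vector; the function names, data layout, and toy vectors are hypothetical and say nothing about how the graph actually computes its embeddings.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def untyped_neighbors(
    query: list[float],
    candidates: dict[str, list[float]],
    typed: set[str],
    threshold: float = 0.65,
) -> list[tuple[str, float]]:
    """Entities at or above the threshold that lack a typed edge, best match first."""
    scored = [
        (name, cosine(query, vec))
        for name, vec in candidates.items()
        if name not in typed
    ]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy vectors for illustration only.
candidates = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
print(untyped_neighbors([1.0, 0.0], candidates, typed={"a"}))
```

Here "a" is excluded because it already has a typed edge, "b" falls below the threshold, and only "c" is returned as a candidate.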