Sentence Localization Task

Novel task that asks the model which of 10 sentences received the injection; the injection is cycled through all 10 positions to average out positional bias.
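The cycling procedure described above can be sketched as a small scoring loop. This is a minimal illustration, not the paper's implementation: `localization_accuracy` and `guess_position` are hypothetical names, and `guess_position` stands in for whatever code applies the injection at a given sentence and parses the model's answer.

```python
from collections import Counter
from typing import Callable, Sequence

NUM_SENTENCES = 10  # from the task description: 10 candidate sentences


def localization_accuracy(
    sentence_sets: Sequence[Sequence[str]],
    guess_position: Callable[[Sequence[str], int], int],
) -> tuple[float, dict[int, float]]:
    """Cycle the injected position through every slot and score the guesses.

    guess_position(sentences, injected_idx) is a hypothetical stand-in for
    running the model with the injection at injected_idx and returning the
    index of the sentence it names.
    """
    hits: Counter = Counter()
    for sentences in sentence_sets:
        assert len(sentences) == NUM_SENTENCES
        # Every position is injected once per sentence set, so a guesser
        # that favors one position gains nothing on average.
        for injected_idx in range(NUM_SENTENCES):
            if guess_position(sentences, injected_idx) == injected_idx:
                hits[injected_idx] += 1

    n_sets = len(sentence_sets)
    per_position = {i: hits[i] / n_sets for i in range(NUM_SENTENCES)}
    # Unweighted mean over positions: each position contributes equally.
    overall = sum(per_position.values()) / NUM_SENTENCES
    return overall, per_position
```

Under this scheme a guesser that always names the same position is right in exactly one of the 10 cycles, so it lands at the 10% chance level regardless of which position it prefers.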

Neighborhood — ranked by edge-count

paper

concept

global logit shift
associated_with
The methodological confound identified by this paper: injection biases the model toward answering 'YES' to any binary question, regardless of its content.

artifact

llama-introspection-new (GitHub repository)
about
Open-sourced code implementing all experiments described in the paper

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Knowledge Localization · concept · 0.788
Technique for identifying where specific knowledge is stored in neural network layers via interventions.
task generalization · concept · 0.784
The ability to generalize across tasks; lacking in latent methods.
Tactile Localization · concept · 0.781
Chinn et al. showed that tactile target experience promotes earlier mirror self-recognition in infants; noted as a future extension.
Grammar as Positional Encoding for Language · hypothesis · 0.749
Hypothesis that in language tasks, the abstract structure encoded in positional encodings corresponds to grammatical structure.
Hinting Task · method · 0.741
One of four ToM tasks analyzed; requires inferring speaker intent from indirect hints; scored 0/1.
Task Difficulty · concept · 0.741
The paper identifies task difficulty as a key moderator: easy MMLU questions show performative CoT, hard GPQA-Diamond questions show genuine reasoning.
Sentence localization accuracy reaches 88% at layer 2, α=5 vs. 10% chance in 10-way classification · finding · 0.740
Highest localization accuracy achieved, showing strong partial introspection for early-layer injections.
What is the analogue of spatial positional encodings for higher order tasks such as language? · question · 0.722
Open question raised in Discussion about extending TEM-t principles beyond spatial navigation.