method
active
method:distinguishing-thoughts-from-text-task
Distinguishing thoughts from text task
Task where the model must simultaneously identify an injected thought and transcribe a text sentence.
Neighborhood — ranked by edge-count
Papers (1)
paper
Concepts (1)
concept
- Concept Injection · implements · Technique of injecting activation patterns associated with specific concepts into a model's internal states to test whether self-reports reflect ground truth.
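A minimal sketch of what "injecting an activation pattern" could look like, assuming the injection is done by adding a scaled concept vector to a hidden-state vector. That additive form, the function name `inject_concept`, the `strength` parameter, and the toy vectors are all illustrative assumptions, not details stated on this page.

```python
def inject_concept(hidden_state, concept_vector, strength=1.0):
    """Return a copy of hidden_state shifted along concept_vector.

    Assumption: injection is modeled as simple vector addition,
    hidden + strength * concept. The page itself does not specify this.
    """
    if len(hidden_state) != len(concept_vector):
        raise ValueError("hidden_state and concept_vector must have equal length")
    return [h + strength * c for h, c in zip(hidden_state, concept_vector)]


# Toy 3-dimensional example (hypothetical values).
hidden = [0.5, -1.0, 2.0]
concept = [1.0, 0.0, -1.0]
steered = inject_concept(hidden, concept, strength=2.0)
```

Because the injected vector is known to the experimenter, a model's self-report about the "thought" can be compared against it as ground truth.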
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Speculation about the mechanistic basis of the distinguishing thoughts from text experiment.
- Experimental paradigm where the model is told about the possibility of thought injection and asked to report detection and identification.
- Task of detecting a model's internal thoughts; found by Lindsey (2026) to peak at ~2/3 depth in transformers.
- Core assertion extending William James: thoughts are not passive but active agents that facilitate their own transformation and remapping in cognitive systems.
- William James aphorism cited by Levin to support the idea that thought forms possess minimal agency rather than being purely passive data.
- If a text attempts to stand alone, it will almost certainly attract commentary or interference. · hypothesis · 0.750 · Predicts the inevitability of dialogic intrusion upon any statement.
- Thought detection peaks at ~2/3 layer depth; intention checking peaks at ~1/2 layer depth. · finding · 0.731 · Lindsey (2026) differential layer performance explained by Janus's path combinatorics: different tasks use different path distributions.
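The selection rule for this list (cosine ≥ 0.65, no typed edge) can be sketched roughly as follows. The function names, the dictionary-of-embeddings layout, the set of typed edge pairs, and the toy vectors are hypothetical; only the 0.65 threshold and the "no typed edge" condition come from the page.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_untyped(target_id, embeddings, typed_edges, threshold=0.65):
    """List (entity_id, score) pairs at or above threshold that have no
    typed edge to target_id, sorted by descending similarity."""
    target = embeddings[target_id]
    hits = []
    for other_id, vec in embeddings.items():
        if other_id == target_id:
            continue
        # Skip entities already connected by a typed relation, in either direction.
        if (target_id, other_id) in typed_edges or (other_id, target_id) in typed_edges:
            continue
        score = cosine(target, vec)
        if score >= threshold:
            hits.append((other_id, score))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy data (hypothetical entity ids and 2-d embeddings).
embeddings = {
    "task": [1.0, 0.0],
    "near": [0.8, 0.6],    # similar, no typed edge: should be listed
    "far": [0.0, 1.0],     # orthogonal: below threshold
    "linked": [1.0, 0.1],  # similar, but already has a typed edge
}
typed_edges = {("task", "linked")}
neighbors = similar_untyped("task", embeddings, typed_edges)
```

Excluding typed edges is what makes the result useful for curation: anything returned is either a missing relation or a possible duplicate.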