Intentional control task

Task in which the model is instructed to write a sentence while either thinking or not thinking about a given word; the strength of that word's internal representation is then measured.
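
The comparison this task makes can be sketched as follows. This is only an illustration under stated assumptions: the entry does not say how representation strength is computed, so the sketch assumes it is the mean cosine similarity between per-token activations and a concept vector for the word. The function names and the toy activations are hypothetical, made-up numbers, not results.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def representation_strength(activations, concept_vector):
    """Assumed metric: mean cosine similarity of per-token activations to the concept vector."""
    return sum(cosine(a, concept_vector) for a in activations) / len(activations)


# Hypothetical toy data: one concept direction and activations from two prompts.
concept = [1.0, 0.0, 0.0]
think_acts = [[0.9, 0.1, 0.0], [0.7, 0.2, 0.1]]       # "think about the word"
dont_think_acts = [[0.2, 0.9, 0.1], [0.1, 0.3, 0.9]]  # "do not think about the word"

think_score = representation_strength(think_acts, concept)
dont_score = representation_strength(dont_think_acts, concept)
gap = think_score - dont_score  # a positive gap would suggest instructed modulation
```

Under this assumed metric, the quantity of interest is the gap between the two conditions rather than either score on its own.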

Neighborhood — ranked by edge-count

paper

concept

Concept Injection
implements
Technique of injecting activation patterns associated with specific concepts into a model's internal states to test whether self-reports reflect ground truth.
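
A minimal sketch of the injection step, assuming it amounts to adding a scaled concept vector to a hidden state. The entry does not specify the mechanism, so `inject_concept`, the `strength` parameter, and the toy vectors are all hypothetical.

```python
def inject_concept(hidden_state, concept_vector, strength=4.0):
    """Return hidden_state + strength * concept_vector (assumed additive injection)."""
    if len(hidden_state) != len(concept_vector):
        raise ValueError("hidden state and concept vector must have the same length")
    return [h + strength * c for h, c in zip(hidden_state, concept_vector)]


# Hypothetical toy vectors standing in for one position's activations.
hidden = [0.5, -1.0, 2.0]
concept = [1.0, 0.0, -0.5]
steered = inject_concept(hidden, concept, strength=2.0)
```

Because the injected concept is known in advance, a model's self-report can then be checked against it, which is the ground-truth comparison the description refers to.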

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.
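
The selection rule described above (cosine at or above 0.65, no typed edge to this entity) can be sketched as below. The embedding values, the `typed_edges` mapping, and the function names are hypothetical; how this page actually computes or stores them is not stated.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def untyped_neighbors(target, embeddings, typed_edges, threshold=0.65):
    """Entities with cosine >= threshold and no typed edge to target, highest score first."""
    linked = typed_edges.get(target, set())
    hits = []
    for name, vec in embeddings.items():
        if name == target or name in linked:
            continue  # skip the entity itself and anything already linked by a typed edge
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            hits.append((name, round(score, 3)))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical 2-d embeddings; real entity embeddings would be much larger.
embeddings = {
    "Intentional control task": [1.0, 0.0],
    "Concept Injection": [0.9, 0.1],  # similar, but already has a typed edge
    "control": [0.8, 0.6],
    "unrelated": [0.0, 1.0],
}
typed_edges = {"Intentional control task": {"Concept Injection"}}

result = untyped_neighbors("Intentional control task", embeddings, typed_edges)
```

Entities that survive the filter are only candidates: a high cosine score suggests a missing edge or a duplicate but does not establish either.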

Control task for causal evaluation · method · 0.834
Adaptation of Hewitt and Liang control tasks to CausalGym: next-token labels replaced with arbitrary tokens to measure method expressivity.
control · concept · 0.794
The act of directing a system's behavior; the objective of a regulator.
Intentional Control of Internal States · finding · 0.786
Models can modulate their internal representations when instructed or incentivized to 'think about' a concept; effect replicates across all tested models regardless of capability.
Intentional Action · concept · 0.782
Central explanatory target: behavior constrained by prior intentions and contextual constraints that emerge from cognitive reorganization.
Intentional Agency · concept · 0.768
Capacity to set and pursue goals via beliefs, desires, and intentions.
Control over one's environment · concept · 0.751
The ability of individuals and communities to shape, own, and modify their living spaces; a prerequisite for belonging.
Task balancing · concept · 0.745
The problem of ensuring all tasks in MTL perform well, avoiding dominance by some tasks.
Conceptual Control Condition · concept · 0.741
Control directly priming consciousness ideation without inducing self-reference; yields near-zero experience claims.