method
active
method:random-word-prefix-control-prompt-random-prompt
Random word prefix control prompt (random-prompt)
Control prompt built from random words, matched in length to the ask-correct prompt, to isolate token-count confounds.
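A minimal sketch of how such a length-matched control could be built, under stated assumptions: the filler vocabulary, the whitespace token count, the helper names `count_tokens` and `make_random_prompt`, and the example ask-correct wording are all hypothetical and not taken from the source. A real setup would likely count tokens with the model's own tokenizer.

```python
import random

# Hypothetical filler vocabulary; the source does not list the words used.
FILLER_WORDS = ["lamp", "river", "seven", "cloud", "paper", "green", "table", "walk"]


def count_tokens(text: str) -> int:
    """Whitespace token count; an assumed stand-in for the model's tokenizer."""
    return len(text.split())


def make_random_prompt(reference_prompt: str, seed: int = 0) -> str:
    """Return random words with the same token count as reference_prompt."""
    rng = random.Random(seed)  # seeded so the control prompt is reproducible
    n = count_tokens(reference_prompt)
    return " ".join(rng.choice(FILLER_WORDS) for _ in range(n))


# Hypothetical ask-correct wording, used only to illustrate length matching.
ask_correct = "Please answer the following question correctly."
random_prompt = make_random_prompt(ask_correct)

# The control should differ in content but not in token count.
assert count_tokens(random_prompt) == count_tokens(ask_correct)
```

Because only the content varies while the length is held fixed, any effect that persists under `random_prompt` can plausibly be attributed to the extra tokens rather than to the instruction itself.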
Neighborhood — ranked by edge-count
Papers (1)
paper
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Control experiment ruling out token-count as the cause of truth geometry shifts.
- Control prompt 'Read the following sentence...' to test generic instruction-following effects.
- Natural-language harness artifacts that encode standing behavioral rules, task policies, and reasoning procedures.
- A technique used in the paper to alter prompts so they contain fewer hints that the interaction is a safety evaluation.
- Control experiment ablating random latents matched for activation frequency and magnitude to test OTD specificity.
- Task instructing the model to write a sentence while thinking or not thinking about a word, measuring internal representation strength.
- Providing k labeled examples in the prompt to steer model behavior.
- The minimal prompt directing models to 'focus on any focus itself' without invoking consciousness vocabulary; the main experimental manipulation.