method
active
method:random-vector-baseline
Random vector baseline
Baseline method that samples a random vector as the feature direction, for comparison with learned methods
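A minimal sketch of what such a baseline might look like, assuming the direction is drawn from an isotropic Gaussian and rescaled to a chosen norm. The function names `random_direction` and `inject`, the dimensionality, and the norm value are hypothetical and are not taken from the underlying work.

```python
import numpy as np

def random_direction(dim, norm=1.0, seed=None):
    """Sample a random direction in R^dim, rescaled to the given L2 norm.

    Assumption: an isotropic Gaussian sample, normalized, gives a direction
    uniformly distributed on the sphere. The source does not specify the
    sampling distribution.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    return norm * v / np.linalg.norm(v)

def inject(activations, direction):
    """Add the direction to every activation vector (broadcast over rows)."""
    return activations + direction

# Hypothetical usage: a 512-dimensional residual stream and an injection norm of 8.
baseline = random_direction(512, norm=8.0, seed=0)
steered = inject(np.zeros((4, 512)), baseline)
```

Fixing the seed keeps the baseline reproducible across runs, which matters when the same random direction is compared against a learned one.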
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Random vectors require a larger norm to trigger detection (8 vs. 2) and elicit awareness at lower rates (9/100); negated vectors are comparably effective, but the model's identification is confabulated.
- The set of mutually orthogonal unit vectors that span the concept cone, each independently causally mediating target behavior
- Baseline model stitching trained in a single behavioral direction without CL auxiliary loss, used for comparison with CLMAS.
- Layer-40 activations with the component explained by compressed Gemini embeddings subtracted, isolating information not driven by surface text content
- Control using objectively-NO factual questions under identical injection to measure global logit shift vs. genuine detection signal
- Control condition with steering disabled to confirm self-correction is induced by steering, not spontaneous
- Computed directional vector in activation space representing a specific concept, used for injection experiments
- Controls for variance by sampling random directions from top-k PC spaces matching each emotion probe's explained variance, and subtracting median persistence of 20 matched directions
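The cosine ≥ 0.65 criterion used to collect the entities above can be sketched as follows, assuming each entity has an embedding vector. The function name `similar_entities` and the toy embeddings are hypothetical; how the embeddings are actually produced is not stated here.

```python
import numpy as np

def similar_entities(query, embeddings, threshold=0.65):
    """Return indices of rows whose cosine similarity to `query` meets the
    threshold, sorted from most to least similar, plus all similarities."""
    q = query / np.linalg.norm(query)
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = e @ q
    idx = np.flatnonzero(sims >= threshold)
    return idx[np.argsort(-sims[idx])], sims

# Toy example with 2-dimensional embeddings (illustrative values only).
query = np.array([1.0, 0.0])
candidates = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
neighbors, sims = similar_entities(query, candidates)
```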