finding
active
finding:linear-steering-on-llama-3-1-8b-for-the-days-of-week-task-cuts-across-the-behavior-manifold-producing-noisy-off-target-effects-where-predicted-tokens-are-not-even-days-of-the-week

Linear steering on Llama-3.1 8B for the days-of-week task cuts across the behavior manifold, producing noisy off-target effects where predicted tokens are not even days of the week.

Empirical result demonstrating a failure mode of linear steering when the concept geometry is cyclic.
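The geometric intuition behind this finding can be sketched with a toy model. The snippet below is an illustration only, not the experiment on Llama-3.1 8B: it assumes the seven days sit evenly spaced on a unit circle in 2D (a stand-in for a cyclic concept geometry; the source does not describe the actual representation), and the helper names `day_point`, `linear_path`, and `arc_path` are hypothetical. It compares a straight-line move between two days with a move along the circle.

```python
import math

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def day_angle(i):
    # Assumption: days are evenly spaced around a circle.
    return 2 * math.pi * i / len(DAYS)


def day_point(i):
    # Toy 2D "representation" of day i on the unit circle.
    a = day_angle(i)
    return (math.cos(a), math.sin(a))


def linear_path(p, q, steps):
    # Straight-line interpolation, the toy analogue of linear steering.
    path = []
    for k in range(steps):
        t = k / (steps - 1)
        path.append(((1 - t) * p[0] + t * q[0], (1 - t) * p[1] + t * q[1]))
    return path


def arc_path(i, j, steps):
    # Interpolate the angle instead, so every point stays on the circle.
    a0, a1 = day_angle(i), day_angle(j)
    path = []
    for k in range(steps):
        t = k / (steps - 1)
        a = (1 - t) * a0 + t * a1
        path.append((math.cos(a), math.sin(a)))
    return path


def radii(path):
    # Distance of each point from the origin; 1.0 means "on the circle".
    return [math.hypot(x, y) for x, y in path]


# Move from Mon (index 0) to Thu (index 3) in 9 steps.
lin_radii = radii(linear_path(day_point(0), day_point(3), 9))
arc_radii = radii(arc_path(0, 3, 9))

print("min radius, straight line:", round(min(lin_radii), 3))
print("min radius, along circle: ", round(min(arc_radii), 3))
```

In this toy setting the straight-line path passes through the interior of the circle, far from any day's point, while the angular path stays on the circle throughout. That is the sense in which a linear move "cuts across" a cyclic manifold; whether the model's off-target tokens arise in exactly this way is not something the sketch can show.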
