finding
active
finding:steering-along-m-h-yields-behavioral-trajectories-that-follow-m-y-producing-more-natural-outputs-than-linear-steering
Steering along M_h yields behavioral trajectories that follow M_y, producing more natural outputs than linear steering
Core empirical result demonstrating the superiority of manifold steering over linear steering
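As a toy illustration of the contrast this finding draws (not the paper's method or data), the sketch below places a seven-step cyclic concept such as days of the week on a unit circle, which stands in for the representation manifold M_h. The circle embedding, the 2-D space, and the helper names `day_point`, `linear_steer`, and `manifold_steer` are all assumptions made for illustration. Under those assumptions, a straight-line interpolation between two days cuts through the interior of the circle, while interpolating the angle keeps every intermediate point on it.

```python
import math

def day_point(k, n=7):
    """Toy representation (assumed): step k of an n-step cycle on the unit circle."""
    theta = 2 * math.pi * k / n
    return (math.cos(theta), math.sin(theta))

def linear_steer(a, b, t):
    """Straight-line (Euclidean) interpolation between two points."""
    return tuple((1 - t) * x + t * y for x, y in zip(a, b))

def manifold_steer(k_a, k_b, t, n=7):
    """Move along the circle by interpolating the angle, not the coordinates."""
    theta = 2 * math.pi * ((1 - t) * k_a + t * k_b) / n
    return (math.cos(theta), math.sin(theta))

def radius(p):
    """Distance from the origin; equals 1 exactly when p lies on the toy manifold."""
    return math.hypot(*p)

monday, thursday = 0, 3
mid_linear = linear_steer(day_point(monday), day_point(thursday), 0.5)
mid_manifold = manifold_steer(monday, thursday, 0.5)

print(f"linear midpoint radius:   {radius(mid_linear):.3f}")
print(f"manifold midpoint radius: {radius(mid_manifold):.3f}")
```

In this toy setting the linear midpoint has a radius well below 1, so it corresponds to no point on the cycle, whereas the manifold midpoint stays at radius 1. This only illustrates the geometric intuition; it says nothing about how the paper estimates M_h or measures behavior on M_y.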
Source paper
extracted_from · (2026) · Daniel Wurgaft · Can Rager · Matthew Kowal · Vasudev Shyam +12
Neighborhood — ranked by edge-count
Claims (3)
claim
- The paper's core causal assertion: geometry is not incidental but mechanistically linked to behavior
- Attribution of linear steering's failure to its Euclidean assumption
- Central empirical claim of the paper, demonstrated across tasks and modalities
Hypotheses (1)
hypothesis
- We hypothesize that interventions that respect the geometry of activation space will yield behaviors close to those the model exhibits naturally · associated_with · supports · The core testable hypothesis driving the experimental design
Questions (1)
question
- The motivating research question of the paper
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Empirical demonstration on Llama-3.1-8B that steering along representation manifold aligns outputs with behavior manifold, whereas linear steering does not.
- The paper's critique of the standard linear steering baseline, supported by the days-of-week demo.
- Core empirical claim comparing steering approaches on cyclic concepts.
- The central thesis of the paper, motivating the shift from linear to geometry-aware manifold steering.
- Central empirical result showing causal coupling between representation and behavior geometry across multiple substrates and modalities.
- Observation that 100% accuracy on specific concept-layer-strength combinations suggests concept-specific detectability
- Extension of manifold steering validation to video world models and physical dynamics tasks, demonstrating cross-modal generality