claim
active
claim:linear-steering-cuts-through-off-manifold-regions-and-hence-produces-unnatural-outputs
Linear steering cuts through off-manifold regions and hence produces unnatural outputs.
Attributes the failure of linear steering to its implicit Euclidean geometry assumption.
Source paper
extracted_from (2026) · Daniel Wurgaft · Can Rager · Matthew Kowal · Vasudev Shyam +12
Neighborhood — ranked by edge-count
Papers (1)
paper
Findings (2)
finding
- Central empirical result showing causal coupling between representation and behavior geometry across multiple substrates and modalities.
- Core empirical result demonstrating the superiority of manifold steering over linear steering.
Communities (3)
community
- Explores geometry of activation/behavior manifolds to enable selective, non-destructive concept interventions.
- Concepts encoded as curved manifolds and circular structures in LLM activation spaces.
- Geometric approach to model control that respects learned concept structure, contrasting with linear steering that produces off-manifold artifacts.
Concepts (1)
concept
- Euclidean Geometry Assumption · contradicts · The implicit assumption of linear steering methods, which the paper argues is inappropriate for neural activation spaces.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Core empirical claim comparing steering approaches on cyclic concepts.
- Empirical demonstration on Llama-3.1-8B that steering along representation manifold aligns outputs with behavior manifold, whereas linear steering does not.
- The paper's critique of the standard linear steering baseline, supported by the days-of-week demo.
- The central thesis of the paper, motivating the shift from linear to geometry-aware manifold steering.
- The research gap that motivates manifold steering as an alternative to conventional linear approaches.
- Empirical result demonstrating the failure mode of linear steering when concept geometry is cyclic.
- Typical approach that adds a scaled steering vector to representations; the paper argues this is mismatched with actual representation geometry.
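
The contrast between the two steering styles described above can be illustrated with a toy sketch. It is not the paper's setup: it assumes a cyclic concept is represented as a point on the unit circle in 2D, and the helper names (`point_on_circle`, `linear_steer`, `manifold_steer`) are hypothetical. Linear steering adds a scaled vector to the representation, while the manifold variant here simply moves along the circle.

```python
import math

def point_on_circle(angle):
    """Toy representation of a cyclic concept: a point on the unit circle (assumption)."""
    return (math.cos(angle), math.sin(angle))

def linear_steer(h, v, alpha):
    """Linear steering: add a scaled steering vector, h + alpha * v."""
    return tuple(hi + alpha * vi for hi, vi in zip(h, v))

def manifold_steer(angle, target_angle, t):
    """Toy manifold steering: move a fraction t of the way along the circle itself."""
    return point_on_circle(angle + t * (target_angle - angle))

def norm(p):
    """Euclidean distance from the origin; 1.0 means the point lies on the circle."""
    return math.hypot(*p)

# Two opposite points on the cycle (hypothetical source and target concepts).
start_angle, target_angle = 0.0, math.pi
h = point_on_circle(start_angle)
target = point_on_circle(target_angle)

# Steering vector taken as the difference between target and source representations.
v = tuple(ti - hi for hi, ti in zip(h, target))

halfway_linear = linear_steer(h, v, 0.5)
halfway_manifold = manifold_steer(start_angle, target_angle, 0.5)

linear_radius = norm(halfway_linear)
manifold_radius = norm(halfway_manifold)
print(linear_radius, manifold_radius)
```

In this toy case the halfway point of the linear path sits at roughly the origin, far from any point on the circle, whereas the path along the circle keeps radius 1 throughout. That is the sense in which a straight-line intervention can pass through off-manifold regions when the concept geometry is curved.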