community
active
leiden_hybrid_concepts
label: haiku
community:leiden_hybrid_concepts-run4-c7-c3
Manifold-aware steering for language models
A geometric approach to model control that respects learned concept structure, in contrast to linear steering, which produces off-manifold artifacts.
8 members.
Drawn from 6 sources
The papers/notes whose extracted claims & findings make up this cluster.
- 2026-05-15_manifold-overlap-papers-economy-strategy.md (3 members)
- 2026-05-14_phil-trans-A-goodfire-aboutblank-impact.md (1 member)
- The World Inside Neural Networks (1 member)
- Steering Along Manifolds to Control Neural Networks (1 member)
- Manifold Steering Reveals the Shared Geometry of Neural Network Representation and Behavior (1 member)
- Steering Along Manifolds to Control Neural Networks (1 member)
Bridges (2)
Other communities that share members with this one — cross-cutting threads or papers that sit at the seam between two themes.
Claims (6)
- Curved manifolds often represent concepts better than linear directions. Proposes that nonlinear geometric structure is superior to linear feature spaces for capturing semantic content.
- Linear steering cuts through off-manifold regions and hence produces unnatural outputs. Attribution of failure to Euclidean assumption.
- Bicycle for the Soul combines manifold-aware steering with transformative-experience-shaped objectives.
- Manifold-aware steering is genuinely new IP that frontier labs cannot ship as easily as assumed.
- Manifold-aware steering is non-trivial IP requiring geometric analysis, not a system-prompt implementation.
- Manifold-respecting steering produces smooth, natural behavioral trajectories, while linear steering teleports between non-adjacent concepts.
Findings (2)
- Linear steering produces noisy off-target effects; manifold steering cleanly shifts probability mass between sequential concepts. Core empirical claim comparing steering approaches on cyclic concepts.
- Manifold steering produces clean probability shifts along natural behavior structure; linear steering cuts across the manifold and produces noisy off-target effects. Empirical demonstration on Llama-3.1-8B that steering along the representation manifold aligns outputs with the behavior manifold, whereas linear steering does not.
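
The contrast drawn in the claims and findings above can be illustrated with a toy sketch. It is not the method from any of the listed sources and does not touch Llama-3.1-8B activations. It assumes a cyclic concept (for example, 12 evenly spaced items) laid out on a unit circle in 2D, and compares a straight-line path between two concepts with a path that follows the circle. All function names (`concept_points`, `linear_path`, `arc_path`, `off_manifold_distance`, `nearest_concepts`) are hypothetical and exist only for this illustration.

```python
import numpy as np


def concept_points(n_concepts: int, radius: float = 1.0) -> np.ndarray:
    """Place n_concepts evenly on a circle (a toy 'concept manifold')."""
    angles = 2 * np.pi * np.arange(n_concepts) / n_concepts
    return radius * np.stack([np.cos(angles), np.sin(angles)], axis=1)


def linear_path(start: np.ndarray, end: np.ndarray, steps: int) -> np.ndarray:
    """Straight-line interpolation: the toy analogue of linear steering."""
    t = np.linspace(0.0, 1.0, steps)[:, None]
    return (1 - t) * start + t * end


def arc_path(start: np.ndarray, end: np.ndarray, steps: int) -> np.ndarray:
    """Interpolation along the circle: the toy analogue of manifold steering."""
    radius = np.linalg.norm(start)
    a0 = np.arctan2(start[1], start[0])
    a1 = np.arctan2(end[1], end[0])
    # Shortest signed angular difference, wrapped into [-pi, pi).
    delta = (a1 - a0 + np.pi) % (2 * np.pi) - np.pi
    angles = a0 + np.linspace(0.0, 1.0, steps) * delta
    return radius * np.stack([np.cos(angles), np.sin(angles)], axis=1)


def off_manifold_distance(path: np.ndarray, radius: float = 1.0) -> np.ndarray:
    """Distance of each path point from the circle of the given radius."""
    return np.abs(np.linalg.norm(path, axis=1) - radius)


def nearest_concepts(path: np.ndarray, concepts: np.ndarray) -> np.ndarray:
    """Index of the closest concept point for each point on the path."""
    dists = np.linalg.norm(path[:, None, :] - concepts[None, :, :], axis=2)
    return dists.argmin(axis=1)


if __name__ == "__main__":
    concepts = concept_points(12)
    start, end = concepts[0], concepts[4]  # two concepts 120 degrees apart

    chord = linear_path(start, end, steps=5)
    arc = arc_path(start, end, steps=5)

    print("max off-manifold distance (linear):", off_manifold_distance(chord).max())
    print("max off-manifold distance (arc):   ", off_manifold_distance(arc).max())
    print("concepts visited along arc:", nearest_concepts(arc, concepts))
```

In this toy setting the chord's midpoint sits at half the circle's radius, far from every concept point, while the arc stays on the circle and passes through concepts 0, 1, 2, 3, 4 in order. That mirrors, in a much simplified form, the claim that linear steering cuts through off-manifold regions whereas manifold-respecting steering moves through sequential concepts.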