community
active
corpus: papers
community:neural-steering-methods
Neural Steering Methods
Cluster of 16 typed entities.
Concepts (13)
- Attention probes for belief decoding
- Chain-of-Thought Reasoning
- Circular Representations
- Data Attribution
- Distractor-Triggered Compliance
- Goodfire AI research collective
- Llama-3.1 8B
- Manifold steering for neural network control
- Mechanistic Interpretability
- OLMo 2
- Performative chain-of-thought
- Probe-based data attribution for alignment
- Self-correcting search with interpretability feedback