community
active
corpus: papers
community:neural-steering-methods

Neural Steering Methods

Cluster of 16 typed entities.

Concepts (13)

  • Attention probes for belief decoding
  • Chain-of-Thought Reasoning
  • Circular Representations
  • Data Attribution
  • Distractor-Triggered Compliance
  • Goodfire AI research collective
  • Llama-3.1 8B
  • Manifold steering for neural network control
  • Mechanistic Interpretability
  • OLMo 2
  • Performative chain-of-thought
  • Probe-based data attribution for alignment
  • Self-correcting search with interpretability feedback

Papers (3)

  • Probe-Based Data Attribution: Surfacing and Mitigating Undesirable Behaviors in LLM Post-Training
  • Reasoning Theater: Probing for Performative Chain-of-Thought
  • Using Self-Correcting Search to Accelerate Materials Discovery