venue
active
venue:transformer-circuits-thread
Transformer Circuits Thread
Anthropic's mechanistic interpretability research blog where this paper was published.
Neighborhood — ranked by edge-count
Papers (1)
paper
Concepts (4)
concept
- Foundational mechanistic interpretability paper on transformer circuit analysis
- Related work demonstrating LLM introspective capabilities, with a scale-dependent pattern that parallels ESR
- Key paper on scaling SAE-based interpretability to frontier models, cited as precedent
- Towards Monosemanticity: Decomposing Language Models with Dictionary Learning (Bricken et al., 2023) | cites | Foundational SAE mechanistic interpretability paper
Artifacts (1)
artifact
- Key paper finding that LLMs produce structured first-person descriptions claiming awareness or subjective experience during self-referential processing.
Events (1)
event
- Transformer Circuits paper identifying emotion-concept representations influencing safety behaviors; key related work published April 2026