concept
active
concept:scaling-monosemanticity-extracting-interpretable-features-from-claude-3-sonnet-templeton-et-al-2024

Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet (Templeton et al., 2024)

Key paper on scaling sparse-autoencoder (SAE) interpretability to frontier models, cited as precedent

Neighborhood — ranked by edge-count

Venues (1)

venue

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.