question
active
question:does-das-scale-with-large-foundation-models
Does DAS scale with large foundation models?
Practical scalability question addressed in Appendix D.
Source paper
extracted_from (2023) · Atticus Geiger · Zhengxuan Wu · Christopher Potts · Thomas Icard +1
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Motivating question for the paper, addressed by scaling SAEs to Claude 3 Sonnet.
- Large pretrained models used as backbones across tasks; their universality motivates the convergence hypothesis
- Bigger models are more likely to converge to a shared representation than smaller models · hypothesis · 0.728 · Selective pressure toward convergence via model capacity
- Survey of representation engineering methods cited as related work
- Interpretive claim connecting scale to abstraction level in LLM representations
- Scaling laws for dictionary learning are unknown and needed to assess feasibility on frontier models
- Clamping feature activations causally alters model behavior in interpretable ways.