concept
active
concept:gemmascope-saes
GemmaScope SAEs
Sparse autoencoders (SAEs) trained on pretrained Gemma-2 models, used for steering in Gemma-family experiments
Neighborhood — ranked by edge-count
Thinkers (1)
thinker
- Tom Lieberum · studies · Lead author of the GemmaScope paper, providing the SAEs used for Gemma-2 models
Datasets (1)
dataset
- Gemma-2-27B-it · associated_with · 27B-parameter LLM used in SOO fine-tuning experiments
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Backbone model used in E3 robustness overlay.
- Medium Gemma model tested, showing near-zero ESR
- Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2 (Lieberum et al., 2024) · concept · 0.714 · Paper introducing GemmaScope SAEs used for Gemma-2 model experiments
- Only model where MDS injections largely failed; excluded from main analyses
- Smallest Gemma model tested, showing near-zero ESR
- Unifying framework for inspecting hidden representations of language models via representation interventions
- 12B Gemma model tested; used for openness linearity visualization (Figure 6)
- Extension of mechanistic interpretability findings to the metacognitive domain