concept
active
concept:gemma-3-1b-it · gemma-3-1b-it
Only model where MDS injections largely failed; excluded from main analyses
Neighborhood — ranked by edge-count
Papers (1)
Concepts (5)
- Gemma-2-2B-it · related_to · Smallest Gemma model tested, showing near-zero ESR
- Gemma-2-9B-it · related_to · Medium Gemma model tested, showing near-zero ESR
- Gemma-3-4B-it · related_to · Backbone model used in E3 robustness overlay
- gemma-3-12b-it · related_to · 12B Gemma model tested; used for openness linearity visualization (Figure 6)
- gemma-3-27b-it · related_to · 27B Gemma model quantized to 4-bit NF4; tested in OCEAN benchmarks
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Gemma-3-4B-it shows three-stage layer trajectory and S(ℓ) peak despite scale differences in dr and ρd · finding · 0.723 · E3 backbone generalization finding for Gemma; validates pattern across diverse architectures
- SAEs trained on pretrained Gemma-2 models used for steering in Gemma family experiments
- Paper describing Gemma 2 model family used in this study
- Weaker but still significant introspective coupling in Gemma model; consistent with lower probe quality
- Identified exception to overall MDS effectiveness; reason remains unexplained as a limitation
- Gemma-2-27B-it deceptive response rate reduced from 100% to 9.36% ± 7.09% after SOO fine-tuning · finding · 0.695 · Primary result showing SOO fine-tuning significantly reduces deception in Gemma-2-27B
- Weaker cross-family probe; explains weaker introspection in Gemma
- Model-specific difference in persona susceptibility