concept
active
concept:llama3-1-8b
LLaMA3.1-8B
One of four LLMs selected for representation analysis; embedding dimension D=4096; used as demonstration model in scatter plots.
Neighborhood — ranked by edge-count
Concepts (5)
- Llama-3.1-8B-Instruct · related_to · Primary qualitative demonstration model and one of 14 LLMs benchmarked.
- Meta-Llama-3.1-8B-Instruct · related_to · Backbone model used in E3 geometry analysis.
- LLaMA 3.3 70B · related_to · The model used in Experiment 2 for SAE feature steering experiments via the Goodfire API.
- LLaMA3.1-70B · related_to · One of four LLMs selected; larger model with D=8192 embedding dimension; analyzed across proportionally aligned layers.
- Llama 3.1 405B · related_to · Large open-weight model showing a compliance gap in the helpful-only setting.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Language model family used in cross-modal alignment experiments across multiple sizes
- 3B Llama model tested; used for injection stride visualization
- Primary model of interest showing substantial ESR; largest model tested in the study
- Goodfire blog post describing SAEs used for Llama models in this study
- LLaMA-3.1-8B: Sbmax = -1.896 ± 0.211, AUSN = -2.119 ± 0.198, peak layer ℓ* = 10 (median) · finding · 0.804 · Seed-pooled geometry-only statistics (per-dev z units).
- Smallest Llama model tested; benchmarked across all injection methods
- Llama-3.3-70B exhibits internal consistency-checking mechanisms that operate during inference · claim · 0.786 · Central interpretive claim of the paper, supported by causal ablation and activation evidence.
- Model-specific difference in persona susceptibility
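
The "related by similarity" rule described above (cosine ≥ 0.65, no typed edge) can be illustrated with a minimal sketch. This is not the system that produced this page: the function names, the toy embeddings, and the entity names `entity-a`, `entity-b`, and `entity-c` are all hypothetical, and the only value taken from the page is the 0.65 threshold.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_without_edge(target, embeddings, typed_edges, threshold=0.65):
    """Return (entity, score) pairs whose cosine similarity to `target` is at
    least `threshold` and that have no typed edge to it, highest score first."""
    linked = typed_edges.get(target, set())
    hits = []
    for name, vec in embeddings.items():
        if name == target or name in linked:
            continue  # skip the entity itself and anything already linked
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            hits.append((name, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy data (hypothetical): three-dimensional stand-ins for real embeddings.
embeddings = {
    "llama3-1-8b": [1.0, 0.0, 0.0],
    "entity-a": [0.9, 0.1, 0.0],  # close to the target, no typed edge
    "entity-b": [0.0, 1.0, 0.0],  # orthogonal to the target
    "entity-c": [1.0, 0.0, 0.0],  # identical, but already linked by a typed edge
}
typed_edges = {"llama3-1-8b": {"entity-c"}}

result = similar_without_edge("llama3-1-8b", embeddings, typed_edges)
```

Here only `entity-a` is returned: `entity-b` falls below the threshold, and `entity-c` is excluded because it already has a typed edge, which is what separates this list from the typed neighborhood.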