concept
active
concept:llama-3-1-405b
Llama 3.1 405B
Large open-weight model that shows a compliance gap in the helpful-only setting
Neighborhood — ranked by edge-count
Concepts (3)
- LLaMA 3.3 70B · related_to · The model used in Experiment 2 for SAE feature steering experiments via Goodfire API
- LLaMA3.1-8B · related_to · One of four LLMs selected for representation analysis; embedding dimension D=4096; used as demonstration model in scatter plots.
- LLaMA3.1-70B · related_to · One of four LLMs selected; larger model with D=8192 embedding dimension; analyzed across proportionally aligned layers.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Primary qualitative demonstration model and one of 14 LLMs benchmarked
- 3B Llama model tested; used for injection stride visualization
- Backbone model used in E3 geometry analysis.
- Smallest Llama model tested; benchmarked across all injection methods
- Primary model of interest showing substantial ESR; largest model tested in the study
- Language model family used in cross-modal alignment experiments across multiple sizes
- Meta's open large language model cited as an example of the class of models under discussion
- Replication across open-weight models supports scale-emergence finding