Meta-Llama-3.1-8B-Instruct

Backbone model used in E3 geometry analysis.

Neighborhood — ranked by edge-count

paper

method

E3: Layer-wise Geometric Trajectory Analysis
uses
Quantitative study correlating layer-wise anchoring geometry (S_max, AUS_N) with behavioral thresholds θ50

concept

Llama-3.1-8B-Instruct
related_to
Primary qualitative demonstration model and one of 14 LLMs benchmarked
Llama-3.3-70B-Instruct
related_to
Primary model of interest showing substantial ESR; largest model tested in the study
Llama-3.2-3B-Instruct
related_to
3B Llama model tested; used for injection stride visualization
LLaMA3.1-8B
related_to
One of four LLMs selected for representation analysis; embedding dimension D=4096; used as demonstration model in scatter plots.
Llama-3.2-1B-Instruct
related_to
Smallest Llama model tested; benchmarked across all injection methods

finding

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

Olmo-3.1-32B-Instruct · concept · 0.827
32B OLMo model quantized to 4-bit NF4; tested in OCEAN benchmarks
LLaMA3.1-70B · concept · 0.820
One of four LLMs selected; larger model with D=8192 embedding dimension; analyzed across proportionally aligned layers.
Llama 3.1 405B · concept · 0.816
Large open-weight model showing compliance gap in helpful-only setting
LLaMA 3.3 70B · concept · 0.806
The model used in Experiment 2 for SAE feature steering experiments via Goodfire API
Olmo-3-7B-Instruct · concept · 0.799
7B OLMo model tested; used for layerwise steering visualization (Figure 4)
Meta-prompting increases Llama-3.3-70B multi-attempt rate 4.3× (from 7.4% to 31.7%) · finding · 0.789
Demonstrates ESR can be deliberately enhanced through prompting in the largest model
Llama 3.3 70B is the most likely to take on a non-Assistant persona when steered, with even split between human and nonhuman portrayals · finding · 0.777
Model-specific difference in persona susceptibility
LLaMA / LLaMA2 / LLaMA3 · concept · 0.775
Language model family used in cross-modal alignment experiments across multiple sizes
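
The "cosine ≥ 0.65 · no typed edge" list above can be read as a simple filter: score every other entity against this one by cosine similarity, drop anything already connected by a typed edge, and keep what clears the threshold. The sketch below illustrates that reading only. The function names, the `(source, target)` edge tuples, and the two-dimensional toy vectors are all hypothetical; the page does not say how its embeddings or scores are actually produced.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def untyped_neighbors(target, embeddings, typed_edges, threshold=0.65):
    """Return (name, score) pairs with score >= threshold and no typed edge to target.

    embeddings:  dict mapping entity name -> vector (hypothetical layout)
    typed_edges: iterable of (source, target) name pairs (hypothetical layout)
    """
    # Entities already linked to the target in either direction are excluded.
    linked = {b for a, b in typed_edges if a == target}
    linked |= {a for a, b in typed_edges if b == target}

    results = []
    for name, vec in embeddings.items():
        if name == target or name in linked:
            continue
        score = cosine(embeddings[target], vec)
        if score >= threshold:
            results.append((name, score))

    # Highest similarity first, matching the ordering of the list on this page.
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results


# Toy data: made-up vectors, not real model embeddings.
toy_embeddings = {
    "target": [1.0, 0.0],
    "near": [1.0, 0.1],        # very similar, no typed edge
    "diagonal": [1.0, 1.0],    # cosine about 0.707, no typed edge
    "linked": [1.0, 0.05],     # similar, but already has a typed edge
    "orthogonal": [0.0, 1.0],  # cosine 0, below threshold
}
toy_edges = [("target", "linked")]

candidates = untyped_neighbors("target", toy_embeddings, toy_edges)
for name, score in candidates:
    print(f"{name} · {score:.3f}")
```

Entries surfaced this way are only candidates: a high score with no typed edge may indicate a missing relation or a duplicate entry under a different spelling, and either case still needs manual review.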