claim
active
claim:we-cannot-isolate-whether-esr-reflects-scale-architecture-or-training-procedures-in-llama-3-3-70b
We cannot isolate whether ESR reflects scale, architecture, or training procedures in Llama-3.3-70B
Epistemic limitation claim acknowledging confounds in the cross-model comparison
Source paper
extracted_from · (2026) · Alex McKenzie · Keenan Pepper · Stijn Servaes · Martin Leitgab +5
Neighborhood — ranked by edge-count
Papers (1)
paper
Findings (2)
finding
- Cross-judge validation of the primary ESR finding across OpenAI, Alibaba, Anthropic, and Google judge models
- Establishes potential Llama-family specificity or scale specificity of ESR phenomenon
Questions (1)
question
- Central unresolved question about the mechanism behind ESR's apparent size-dependence
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Characterizes the narrow operating window in which ESR can manifest
- Illustrative finding that ESR mitigates but does not fully eliminate steering influence
- Meta-prompt ESR enhancement effects scale with model size across Llama and Gemma families · finding · 0.766 · Suggests underlying self-monitoring circuits must be present for meta-prompting to enhance them
- Llama-3.3-70B exhibits internal consistency-checking mechanisms that operate during inference · claim · 0.764 · Central interpretive claim of the paper supported by causal ablation and activation evidence
- Primary model of interest showing substantial ESR; largest model tested in the study
- Striking cross-domain generalization result supporting the claim that larger models represent abstract truth
- Causal interpretation of the ablation experiment results
- Model-specific difference in persona susceptibility