finding
active
ESR exhibits a non-monotonic relationship with boost level, peaking around -0.3σ below threshold in Llama-3.3-70B
Characterizes the narrow operating window in which ESR can manifest
Source paper
extracted_from (2026) · Alex McKenzie · Keenan Pepper · Stijn Servaes · Martin Leitgab +5
Neighborhood — ranked by edge-count
Papers (1)
paper
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Cross-judge validation of the primary ESR finding across OpenAI, Alibaba, Anthropic, and Google judge models
- LLaMA E3 geometry summary: S_max = −1.896 ± 0.211, AUS_N = −2.119 ± 0.198, peak layer ℓ* = 10 [IQR 0.384] · finding · 0.780 · Seed-pooled geometry statistics for LLaMA in E3, providing quantitative basis for geometry-to-behavior correlate
- Meta-prompt ESR enhancement effects scale with model size across Llama and Gemma families · finding · 0.775 · Suggests underlying self-monitoring circuits must be present for meta-prompting to enhance them
- We cannot isolate whether ESR reflects scale, architecture, or training procedures in Llama-3.3-70B · claim · 0.775 · Epistemic limitation claim acknowledging confounds in the cross-model comparison
- Prior finding from related work that aligns with ESR being strongest in the largest model tested
- Multi-attempt improvement rate peaks at 83% around -1.0σ below threshold in Llama-3.3-70B · finding · 0.768 · Shows slightly weaker steering allows more successful corrections, characterizing optimal ESR conditions
- LLaMA-3.1-8B: Sbmax = -1.896 ± 0.211, AUSN = -2.119 ± 0.198, peak layer ℓ* = 10 (median) · finding · 0.767 · Seed-pooled geometry-only statistics (per-dev z units).
- Random latent ablation produces slight increase in ESR rate (3.8% to 4.2%), not statistically significant · finding · 0.755 · Control result confirming OTD ablation effect is specific to those latents, not a general ablation artifact
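
A list like the one above, filtered at cosine ≥ 0.65 and restricted to entities with no typed edge, could be produced along the following lines. This is only a minimal sketch: the embedding vectors, the function names `cosine` and `similar_untyped`, and the `typed_ids` set are all hypothetical, and nothing here is taken from the actual pipeline behind this page.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_untyped(query_vec, candidates, typed_ids, threshold=0.65):
    """Return (entity_id, score) pairs with score >= threshold, highest first.

    candidates: dict mapping entity id -> embedding vector (hypothetical data).
    typed_ids:  ids that already have a typed edge to the query entity;
                these are skipped, mirroring the "no typed edge" filter.
    """
    hits = []
    for entity_id, vec in candidates.items():
        if entity_id in typed_ids:
            continue
        score = cosine(query_vec, vec)
        if score >= threshold:
            hits.append((entity_id, score))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

Entries that survive the filter are the ones worth reviewing by hand, either to add a typed relation or to merge as duplicates.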