finding
active
Systematic layer 20-28 degradation in S(ℓ) to S ≈ −2.40 by layer 27 on LLaMA
Validates representational drift theory: later layers specialize for next-token prediction, increasing dr
Source paper
extracted_from · (2025) · Edward Yi Chang · Kaya, Zeyneb N. · Ethan Chang
Neighborhood — ranked by edge-count
Concepts (1)
concept
- representational drift · associated_with · supports · Accumulation of mismatch in later layers causing S degradation.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- One of the most promising cases; approximately corresponds to the 2/3 layer of LLaMA3.1-8B.
- LLaMA-3.1-8B: Ŝ_max = −1.896 ± 0.211, AUS_N = −2.119 ± 0.198, peak layer ℓ* = 10 (median) · finding · 0.798 · Seed-pooled geometry-only statistics (per-dev z units).
- Math and code tasks show strongest mid-layer anchoring on LLaMA (S ≈ −1.65 at layers 8-12) · finding · 0.796 · Task-specific E3 finding showing compositional reasoning requires deeper processing
- Third promising case from temporal permutation analysis.
- Core E3 finding validating S as a predictor of anchoring effectiveness
- Connects this study's results to Schrimpf et al. 2021 and Caucheteux et al. 2022/2023 findings on brain-LLM alignment.
- LLaMA E3 geometry summary: S_max = −1.896 ± 0.211, AUS_N = −2.119 ± 0.198, peak layer ℓ* = 10 [IQR 0.384] · finding · 0.780 · Seed-pooled geometry statistics for LLaMA in E3, providing quantitative basis for geometry-to-behavior correlate
- Empirical demonstration that MDVP produces divergent representations in a real LLM