finding
active
At layer 12 (the layer analyzed by Burger et al. 2024), tP and tG explain similar fractions of truth-related variance (~0.33 each).
Shows that Burger et al.'s layer choice corresponds to a transitional phase, not a universal property.
Source paper
extracted_from · (2026) · Angelos Poulis · Mark Crovella · Evimaria Terzi
Neighborhood — ranked by edge-count
Claims (1)
claim
- Reinterpretation of Burger et al.'s finding as layer-specific rather than universal.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Variance decomposition showing the disentanglement of polarity from truth across model depth.
- Supported by the geometric transition visible in cosine similarity heatmaps for F0-F3.
- Truth-related directions reliably emerge at 60–75% of normalized layer depth in Qwen and Gemma models · finding · 0.771 · Experiment 1 finding localizing where truth can be causally mediated
- Methodological critique of prior work that fixed a single layer for truth probing.
- The middle layer residual stream features are causally implicated in multi-step reasoning. · claim · 0.760 · Features for Kobe Bryant, California, Lakers participate in computing the capital answer.
- Demonstrates that early-layer probes capture sentence polarity rather than truth.
- Argues against the single-layer analysis approach of prior work.
- Third promising case from temporal permutation analysis.