intention checking peaks at ~1/2 depth in transformers

Lindsey (2026) found that intention checking accuracy peaks around half the network depth.

Source paper

extracted_from

Janus Information Flow Transformers 2025

Neighborhood — ranked by edge-count

Claims (1)

claim

Different introspective tasks may preferentially use different path distributions in the transformer.
supports
Interpretive claim connecting exponential path combinatorics to Lindsey's layer-dependent findings.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

thought detection peaks at ~2/3 depth in transformers · finding · 0.857
Lindsey (2026) found that thought detection accuracy is highest around two-thirds of the network depth.
Thought detection peaks at ~2/3 layer depth; intention checking peaks at ~1/2 layer depth. · finding · 0.847
The differential layer performance reported by Lindsey (2026) is explained by Janus's path combinatorics: different tasks use different path distributions.
The last layer of the transformer has the largest projection magnitude on the reflection direction, likely because it directly controls generation of reflection keywords · claim · 0.742
Interpretive claim from attention head attribution analysis in appendix
Truth-related directions reliably emerge at 60–75% of normalized layer depth in Qwen and Gemma models · finding · 0.735
Experiment 1 finding localizing where truth can be causally mediated
Triggered Reflection with 'Alternatively' achieves accuracy .684 on gsm8k_adv for Gemma3-4B-IT · finding · 0.733
Highest single-instruction accuracy result in the paper.
Redundant information paths create interference patterns, so transformers likely experience memory and cognition as interferometric and continuous. · claim · 0.733
Janus's claim linking path redundancy to interferometric phenomenology.
Prefill detection effect peaks at an earlier layer (slightly over halfway through) in Opus 4.1 than the injected-thoughts detection peak · finding · 0.732
The optimal layer for the prefill introspection differs from the optimal layer for detecting injected thoughts.
Two-layer attention-only transformers implement much more complex algorithms via composition of attention heads, detectable directly from weights · claim · 0.730
Core claim for two-layer models; composition creates qualitatively more powerful in-context learning
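
The "Related by similarity" list above is described as entities with cosine ≥ 0.65 and no typed edge to this one. A minimal sketch of that selection rule follows, assuming each entity has an embedding vector; the function names, the dictionary layout, and the `typed_edges` set are hypothetical and not taken from the system that produced this page.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def related_by_similarity(target, candidates, typed_edges, threshold=0.65):
    """Return (name, score) pairs, highest score first.

    target      -- embedding of the current entity (hypothetical input)
    candidates  -- dict mapping entity name to its embedding (hypothetical layout)
    typed_edges -- set of entity names that already have a typed relation
    threshold   -- minimum cosine similarity; 0.65 is the value shown on the page
    """
    scored = [
        (name, cosine(target, vec))
        for name, vec in candidates.items()
        if name not in typed_edges  # keep only entities without a typed edge
    ]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

Entities that already carry a typed relation are excluded before thresholding, which matches the page's description of the list as candidates for new edges or unrecognized duplicates.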