claim
active
claim:the-last-layer-of-the-transformer-has-the-largest-projection-magnitude-on-the-reflection-direction-likely-because-it-directly-controls-generation-of-reflection-keywords
The last layer of the transformer has the largest projection magnitude on the reflection direction, likely because it directly controls generation of reflection keywords
Interpretive claim from attention head attribution analysis in appendix
Source paper
extracted_from (2025) · Ge Yan · Chung-En Sun · Tsui-Wei Weng
Neighborhood — ranked by edge-count
Findings (1)
finding
- Attribution finding suggesting the last layer directly controls reflection keyword generation
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Reflection-inducing directions emerge more clearly in higher layers (ℓ>5) for both models and datasets · finding · 0.793 · Empirical observation about which network layers encode reflection-relevant information.
- Feature extraction method computing cosine similarity of hidden representations with reflection direction across all layers
- Interpretive claim connecting exponential path combinatorics to Lindsey's layer-dependent findings.
- Supported by the geometric transition visible in cosine similarity heatmaps for F0-F3.
- Core claim of ReflCtrl that a single direction captures and controls reflection
- Claim formalizing the Anima Labs idea that transformers are effectively recurrent due to K/V stream.
- Interpretive claim about the locus of reflection in transformer architecture.
- Argues against the single-layer analysis approach of prior work.
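
The claim on this card compares layers by their projection magnitude on the reflection direction. As a rough illustration of what such a comparison could look like, the sketch below computes a mean absolute scalar projection per layer. Everything here is an assumption for illustration: the function name `projection_magnitude_per_layer`, the array shapes, the use of one direction per layer, and the synthetic data are hypothetical and are not taken from the source paper.

```python
import numpy as np


def projection_magnitude_per_layer(hidden, directions):
    """Mean absolute scalar projection of hidden states onto a direction, per layer.

    hidden:     array of shape (layers, tokens, dim), hypothetical hidden states
    directions: array of shape (layers, dim), one assumed direction per layer
    """
    # Normalize each layer's direction to unit length.
    unit = directions / np.linalg.norm(directions, axis=-1, keepdims=True)
    # Scalar projection of every token's hidden state onto its layer's unit direction.
    proj = np.einsum("ltd,ld->lt", hidden, unit)
    # Average the absolute projection over tokens to get one value per layer.
    return np.abs(proj).mean(axis=1)


# Synthetic data only: random hidden states with a strong component injected
# along the last layer's direction, to mimic the pattern the claim describes.
rng = np.random.default_rng(0)
layers, tokens, dim = 4, 16, 8
directions = rng.normal(size=(layers, dim))
hidden = rng.normal(size=(layers, tokens, dim))
unit_last = directions[-1] / np.linalg.norm(directions[-1])
hidden[-1] += 10.0 * unit_last

magnitudes = projection_magnitude_per_layer(hidden, directions)
top_layer = int(np.argmax(magnitudes))
print(magnitudes, top_layer)
```

With real activations, `hidden` would come from the model's per-layer hidden states, and the layer with the largest value in `magnitudes` is the one the claim refers to.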