finding
active
finding:in-the-analyzed-two-layer-model-second-layer-attention-head-terms-dominate-the-loss-reduction-compared-to-first-layer-terms-and-the-direct-path
In the analyzed two-layer model, second-layer attention head terms dominate the loss reduction compared to first-layer terms and the direct path
Result from term importance analysis breaking down loss contribution by layer
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- Result of term importance analysis ablation experiment; justifies focusing on individual head terms
- Empirical observation from examining expanded OV/QK matrices; approximately 10 out of 12 heads show significant copying
- Finding from term importance analysis; allows focus on individual head terms rather than their compositions
- Result from applying the Frobenius norm composition measurement to all attention head pairs in the two-layer model
- Striking mechanistic finding that injection creates universally detectable perturbation in residual stream immediately downstream
- Concrete example from examining expanded QK/OV matrices showing how specific programming language structure is encoded in attention weights
- Core claim for two-layer models; composition creates qualitatively more powerful in-context learning
- Mathematical equivalence enabling independent analysis of each attention head