quote
active
We revealed the one-layer attention-only model to be a compressed Chinese room, and we're left with a giant pile of cards.
A vivid characterization of the limits of understanding after converting the model to skip-trigram form: no algorithmic mystery remains, but the sheer scale prevents holistic comprehension.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; they are candidates for new edges or unrecognized duplicates.
- Empirical observation from examining expanded OV/QK matrices; approximately 10 out of 12 heads show significant copying
- Concrete example from examining expanded QK/OV matrices showing how specific programming language structure is encoded in attention weights
- Result from applying the Frobenius norm composition measurement to all attention head pairs in the two-layer model
- Result from term importance analysis breaking down loss contribution by layer
- Quantitative result from eigenvalue analysis of expanded OV matrices; confirmed by qualitative inspection
- Exploratory interpretation of Chinese model performance under contemplative prompt
- Finding from term importance analysis; allows focus on individual head terms rather than their compositions
- Core claim for one-layer models; the skip-trigram tables can be accessed without running the model