claim
active
claim:key-query-and-value-vectors-are-intermediary-byproducts-w-ov-and-w-qk-are-the-fundamental-low-rank-matrices-describing-attention-head-behavior
Key, query, and value vectors are intermediary byproducts; W_OV and W_QK are the fundamental low-rank matrices describing attention head behavior
Reframing observation: the canonical K/Q/V decomposition is computationally convenient but not the most interpretable representation
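The reframing can be checked numerically. The sketch below is illustrative only and is not taken from the source: it assumes a single attention head with no biases, no positional rotation, and no causal mask, and all names (`W_Q`, `W_K`, `W_V`, `W_O`, `d_model`, `d_head`, `M`) and sizes are hypothetical. It shows that the head's attention pattern and output depend on the weights only through the products W_QK = W_Q^T W_K and W_OV = W_O W_V, that both products have rank at most `d_head`, and that the individual query and key vectors can change under a re-basing of the head dimension while W_QK stays the same.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_head, n_tokens = 16, 4, 5  # hypothetical toy sizes

# Per-head weights (assumed convention: each maps column vectors).
W_Q = rng.normal(size=(d_head, d_model))
W_K = rng.normal(size=(d_head, d_model))
W_V = rng.normal(size=(d_head, d_model))
W_O = rng.normal(size=(d_model, d_head))

# Combined matrices: d_model x d_model, but rank <= d_head.
W_QK = W_Q.T @ W_K
W_OV = W_O @ W_V

x = rng.normal(size=(n_tokens, d_model))  # one residual vector per row


def softmax(z):
    # Row-wise softmax with the usual max-subtraction for stability.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


# Standard route: explicit query, key, and value vectors.
q, k, v = x @ W_Q.T, x @ W_K.T, x @ W_V.T
scores_qkv = q @ k.T
pattern = softmax(scores_qkv / np.sqrt(d_head))
out_qkv = pattern @ v @ W_O.T

# Same computation expressed only through W_QK and W_OV.
scores_combined = x @ W_QK @ x.T
out_combined = pattern @ x @ W_OV.T

# Re-base the head dimension with a well-conditioned invertible M:
# the query/key vectors change, but the product W_QK does not.
M = np.eye(d_head) + 0.1 * rng.normal(size=(d_head, d_head))
W_Q2 = M @ W_Q
W_K2 = np.linalg.inv(M).T @ W_K
W_QK2 = W_Q2.T @ W_K2

print(np.allclose(scores_qkv, scores_combined))
print(np.allclose(out_qkv, out_combined))
print(np.linalg.matrix_rank(W_QK), np.linalg.matrix_rank(W_OV))
print(np.allclose(W_QK, W_QK2), np.allclose(W_Q, W_Q2))
```

Under these assumptions, W_QK alone determines where the head attends and W_OV alone determines what an attended token contributes to the output, which is the sense in which the separate Q, K, and V vectors are intermediate quantities.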
Neighborhood — ranked by edge-count
Claims (1)
claim
- Key decomposition enabling separate analysis of where attention goes and what it does
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one; candidates for new edges or unrecognized duplicates.
- What matrix decomposition or dimensionality reduction best summarizes the enormous low-rank OV and QK matrices? · question · 0.771 · Open methodological question about converting the 50k x 50k expanded matrices into human-graspable summaries
- A pair of query and key subcomponents distributed across attention heads performs previous-token behavior · finding · 0.770 · VPD recovers an attention algorithm for attending to the previous token, distributed across multiple heads.
- Proposed explanation for why single-turn reformulation improves performance: models' training distribution is concentrated on single-turn reasoning.
- Response to the 'attention as explanation' critique; the paper provides a typology of when attention is and isn't directly interpretable
- Finding from term importance analysis; allows focus on individual head terms rather than their compositions
- Janus's interpretive model for how attention mechanisms enable deliberate information flow and selective routing.
- Suggests that LLMs do not represent complement/MSV linguistic features in the way humans do, for whom these features are crucial to ToM development.
- Steering vectors from µ(0→2) slightly outperform µ(1→2) for instruction discovery across datasets and models · finding · 0.746 · Shows that contrasting No Reflection with Triggered Reflection provides a stronger signal than Intrinsic vs Triggered.