concept
active
concept:key-query-attention
Key-query attention
AI attention mechanism used in Transformers; also proposed for implementations of global workspace theory (GWT).
Neighborhood — ranked by edge-count
Papers (1)
paper
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- A predictive model representing and controlling attention; central to attention schema theory.
- Process using Q, K, V to compute a heat map over K and weighted sum of V.
- An Elephant action of answering a query.
- Key architectural modification restricting queries and keys to position encodings while values depend only on stimuli; extreme version of best-practice insight.
- A pair of query and key subcomponents distributed across attention heads performs previous-token behavior (finding, 0.727): VPD recovers an attention algorithm for attending to the previous token, distributed across multiple heads.
- The highly specific fit between a new building volume and the existing configuration, analogous to protein-ligand binding.
- Original transformer paper; foundational reference cited throughout for the architecture being analyzed.
- Reframing observation: the canonical K/Q/V decomposition is computationally convenient but not the most interpretable representation.
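
One neighbor above describes the mechanism as using Q, K, V to compute a heat map over K and a weighted sum of V. A minimal sketch of that computation follows, assuming the standard scaled dot-product form from the original Transformer paper (softmax over query-key scores, scaled by the square root of the key dimension). That form is not stated in this entry, and the names `softmax` and `attention` are illustrative only.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention on plain Python lists.

    queries: list of query vectors, each of length d_k
    keys:    list of key vectors, each of length d_k
    values:  list of value vectors, one per key
    Returns (weights, outputs): weights[i] is the heat map of query i
    over the keys, and outputs[i] is the matching weighted sum of values.
    """
    d_k = len(keys[0])  # assumed: every key and query has this length
    d_v = len(values[0])
    weights, outputs = [], []
    for q in queries:
        # Similarity of this query to each key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        w = softmax(scores)  # heat map over the keys
        # Weighted sum of the value vectors, one output component at a time.
        out = [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(d_v)]
        weights.append(w)
        outputs.append(out)
    return weights, outputs
```

In this sketch, a query that aligns strongly with one key puts nearly all of its weight on that key, so the output is close to that key's value vector.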