claim
active
claim:position-encodings-should-represent-location-in-a-learned-structure-inferred-on-the-fly-rather-than-fixed-sines-and-cosines
Position encodings should represent location in a learned structure inferred on the fly rather than fixed sines and cosines
Novel interpretive claim about position encodings inspired by the TEM-transformer correspondence.
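To make the contrast in this claim concrete, the following is a minimal sketch and not the paper's implementation. It compares a fixed sinusoidal encoding, which depends only on the token index, with a recurrent encoding that is updated by one transition matrix per action. The names `make_transitions` and `recurrent_encoding`, the four compass actions, and the hand-set rotation matrices are all assumptions made for illustration; in the claim's framing those transitions would be learned from action sequences rather than written by hand.

```python
import numpy as np

def sinusoidal_encoding(num_positions, dim):
    """Fixed sine/cosine encoding: a function of the index only (dim must be even)."""
    positions = np.arange(num_positions)[:, None]            # (T, 1)
    freqs = 1.0 / (10000 ** (np.arange(0, dim, 2) / dim))    # (dim/2,)
    angles = positions * freqs                                # (T, dim/2)
    enc = np.zeros((num_positions, dim))
    enc[:, 0::2] = np.sin(angles)
    enc[:, 1::2] = np.cos(angles)
    return enc

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def make_transitions(theta=0.5):
    """Hypothetical stand-in for learned weights: one 4x4 matrix per action.

    East/west rotate the first 2D plane, north/south rotate the second,
    so opposite actions undo each other.
    """
    def block(a, b):
        out = np.zeros((4, 4))
        out[:2, :2] = a
        out[2:, 2:] = b
        return out
    eye = np.eye(2)
    return {
        "east": block(rotation(theta), eye),
        "west": block(rotation(-theta), eye),
        "north": block(eye, rotation(theta)),
        "south": block(eye, rotation(-theta)),
    }

def recurrent_encoding(actions, transitions, e0):
    """Update the encoding once per action: e_{t+1} = W[a_t] @ e_t."""
    encodings = [e0]
    for a in actions:
        encodings.append(transitions[a] @ encodings[-1])
    return np.stack(encodings)

# A closed loop: the agent ends where it started.
actions = ["east", "north", "west", "south"]
e0 = np.array([1.0, 0.0, 1.0, 0.0])
structural = recurrent_encoding(actions, make_transitions(), e0)
fixed = sinusoidal_encoding(len(actions) + 1, 4)

print(np.allclose(structural[4], structural[0]))  # True: same location, same code
print(np.allclose(fixed[4], fixed[0]))            # False: index 4 differs from index 0
```

In this toy setting the recurrent encoding assigns the same code to the same location regardless of how many steps it took to get there, while the sinusoidal encoding only tracks how far along the sequence a token sits.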
Source paper
extracted_from · (2021) · James C. R. Whittington · Joseph W. Warren · Timothy E. J. Behrens
Neighborhood — ranked by edge-count
Papers (1)
paper
Concepts (1)
concept
- Position Encodings · extends · Mechanism for encoding sequence order in transformers; the paper argues these should reflect learned structural representations rather than fixed sines/cosines.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Forward-looking interpretive claim about the implications of recurrent position encodings for NLP research.
- Key modification to transformers proposed in this paper: position encodings generated by a recurrent network trained on action sequences.
- TEM-t learns band-cell-like position encoding representations resembling Krupic et al. band cells · finding · 0.787 · Empirical result showing TEM-t position encodings also recapitulate band cells, not just grid cells.
- Hypothesis that in language tasks, the abstract structure encoded in positional encodings corresponds to grammatical structure.
- What is the analogue of spatial positional encodings for higher order tasks such as language? · question · 0.771 · Open question raised in Discussion about extending TEM-t principles beyond spatial navigation.
- Neural Representations of Location Composed of Spatially Periodic Bands (Krupic et al., 2012) · concept · 0.764 · Discovery of band cells; TEM-t also recapitulates these representations.
- Learning to encode position for transformer with continuous dynamical model (Liu et al., 2020) · concept · 0.760 · Prior work on learned dynamic position encodings; cited alongside Wang et al. as precedent.
- Interpretive claim about what linear DAS results actually tell us