claim
active
claim:tem-memory-retrieval-is-mathematically-equivalent-to-transformer-self-attention-without-softmax
TEM memory retrieval is mathematically equivalent to transformer self-attention without softmax
Central theoretical claim: a single step of TEM attractor dynamics is equivalent to dot-product attention without the softmax, which makes TEM a special case of a transformer.
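The equivalence can be illustrated with a minimal sketch, assuming memories are stored as a Hebbian sum of outer products of value and key vectors and that retrieval is a single matrix-vector product with the query. The function names and toy vectors below are hypothetical, and the paper's exact formulation (scaling, nonlinearities, how keys and values are constructed) may differ. Under these assumptions, one retrieval step returns the same vector as dot-product attention with the softmax removed.

```python
# Illustrative sketch only: names and toy data are assumptions, not the paper's code.

def hebbian_store(keys, values):
    """Build a memory matrix M = sum_i outer(v_i, k_i) (Hebbian outer-product rule)."""
    d_v, d_k = len(values[0]), len(keys[0])
    memory = [[0.0] * d_k for _ in range(d_v)]
    for k, v in zip(keys, values):
        for r in range(d_v):
            for c in range(d_k):
                memory[r][c] += v[r] * k[c]
    return memory

def hebbian_retrieve(memory, query):
    """One retrieval step: M @ q."""
    return [sum(m * q for m, q in zip(row, query)) for row in memory]

def attention_without_softmax(query, keys, values):
    """Dot-product attention with raw scores: sum_i (q . k_i) * v_i."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    d_v = len(values[0])
    return [sum(s * v[r] for s, v in zip(scores, values)) for r in range(d_v)]

# Toy example: two stored (key, value) pairs and one query.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
query = [2.0, 3.0]

retrieved = hebbian_retrieve(hebbian_store(keys, values), query)
attended = attention_without_softmax(query, keys, values)
```

Both routes compute the same sum over stored pairs, so `retrieved` and `attended` agree. The only difference is whether the sum over memories is carried out before the query arrives (in the memory matrix) or after it (in the attention scores).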
Source paper
extracted_from · (2021) · James C. R. Whittington · Joseph W. Warren · Timothy E.J. Behrens
Neighborhood — ranked by edge-count
Papers (1)
paper
Findings (3)
finding
- Empirical result showing TEM-t recapitulates entorhinal grid cell representations with linear post-transition activation.
- TEM-t learns band-cell-like position encoding representations resembling Krupic et al. band cells · supports · Empirical result showing TEM-t position encodings also recapitulate band cells, not just grid cells.
- Empirical extension showing grid cell learning generalises to non-4-connected spatial environments.
Claims (2)
claim
- Key structural correspondence claim linking the neuroscience model's spatial representation to ML concept of position encoding.
- Methodological clarification distinguishing this paper's contribution from looser representational similarity claims.
Questions (1)
question
- Motivating question from introduction that the TEM-transformer equivalence helps answer affirmatively.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Theoretical claim linking the TEM-t architecture to the Teyler-Rudy hippocampal indexing theory.
- Empirical performance comparison showing TEM-t is a more efficient learner than the original TEM.
- Empirical computational efficiency result comparing TEM-t to the original TEM implementation.
- The transformer version directly analogous to TEM, introduced in this paper, offering dramatic performance improvements.
- TEM-t memory neurons show spatially-tuned firing resembling hippocampal place cells in each environment · finding · 0.755 · Empirical result demonstrating that the sparse softmax activation of memory neurons produces place-cell-like spatial tuning.
- Antra's foundational claim about how introspection arises computationally rather than from memorised text.
- Neuroscience model of hippocampal formation that the paper shows is mathematically equivalent to a transformer with recurrent position encodings.
- Claim that capability emerges from architecture, not data, and that later models lose the surprise.