finding
active
finding:tem-t-requires-less-time-per-gradient-step-than-tem
TEM-t requires less time per gradient step than TEM
Empirical computational efficiency result comparing TEM-t to the original TEM implementation.
Source paper
extracted_from · (2021) · James C. R. Whittington · Joseph W. Warren · Timothy E. J. Behrens
Neighborhood — ranked by edge-count
Frameworks (1)
framework
- TEM-Transformer (TEM-t) · supports · The transformer version directly analogous to TEM, introduced in this paper, offering dramatic performance improvements.
Related by similarity (8)
cosine ≥ 0.65 · no typed edge
Entities in the same semantic neighborhood but without a typed relation to this one: candidates for new edges or unrecognized duplicates.
- Empirical performance comparison showing TEM-t is a more efficient learner than the original TEM.
- TEM memory retrieval is mathematically equivalent to transformer self-attention without softmax · claim · 0.768 · Central theoretical claim: a single step of TEM attractor dynamics equals dot-product attention, making TEM a special case of a transformer.
- Empirical result showing TEM-t recapitulates entorhinal grid cell representations with linear post-transition activation.
- Theoretical claim linking the TEM-t architecture to the Teyler-Rudy hippocampal indexing theory.
- Empirical extension showing grid cell learning generalises to non-4-connected spatial environments.
- TEM-t learns band-cell-like position encoding representations resembling Krupic et al. band cells · finding · 0.717 · Empirical result showing TEM-t position encodings also recapitulate band cells, not just grid cells.
- Crisp conclusion from the 30-coin thought experiment, linking adaptation in buildings to evolution.
- TEM-t memory neurons show spatially-tuned firing resembling hippocampal place cells in each environment · finding · 0.703 · Empirical result demonstrating that the sparse softmax activation of memory neurons produces place-cell-like spatial tuning.