TEM-t requires less time per gradient step than TEM

Empirical computational efficiency result comparing TEM-t to the original TEM implementation.

Source paper

extracted_from

Relating transformers to models and neural representations of the hippocampal formation

(2021) · James C. R. Whittington · Joseph W. Warren · Timothy E.J. Behrens

Neighborhood — ranked by edge-count

Frameworks (1)

framework

TEM-Transformer (TEM-t)
supports
The transformer version directly analogous to TEM, introduced in this paper, offering dramatic performance improvements.

Related by similarity (8)

cosine ≥ 0.65 · no typed edge

Entities in the same semantic neighborhood but without a typed relation to this one — candidates for new edges or unrecognized duplicates.

TEM-t requires many fewer data samples than TEM to reach equivalent performance (sample efficiency improvement) · finding · 0.841
Empirical performance comparison showing TEM-t is a more efficient learner than the original TEM.
TEM memory retrieval is mathematically equivalent to transformer self-attention without softmax · claim · 0.768
Central theoretical claim: a single step of TEM attractor dynamics equals dot-product attention, making TEM a special case of a transformer.
TEM-t with linear activations learns grid-cell-like position encoding representations in 2D spatial environments · finding · 0.745
Empirical result showing TEM-t recapitulates entorhinal grid cell representations with linear post-transition activation.
TEM-t instantiates hippocampal indexing theory by using memory neurons to bind cortical representations across brain regions · claim · 0.741
Theoretical claim linking the TEM-t architecture to the Teyler-Rudy hippocampal indexing theory.
TEM-t learns grid cells in hexagonal 6-connected worlds · finding · 0.737
Empirical extension showing grid cell learning generalises to non-4-connected spatial environments.
TEM-t learns band-cell-like position encoding representations resembling Krupic et al. band cells · finding · 0.717
Empirical result showing TEM-t position encodings also recapitulate band cells, not just grid cells.
The step-by-step approach works. The all-or-nothing approach does not work. This is the secret of biological evolution. · quote · 0.704
Crisp conclusion from the 30-coin thought experiment, linking adaptation in buildings to evolution.
TEM-t memory neurons show spatially-tuned firing resembling hippocampal place cells in each environment · finding · 0.703
Empirical result demonstrating that the sparse softmax activation of memory neurons produces place-cell-like spatial tuning.