Timothy E.J. Behrens

orcid 0000-0003-0048-1177 openalex A5032723566 name_hash 592d8b7df58f70c55cc8ed71…

Authored papers (1)

Relating transformers to models and neural representations of the hippocampal formation (2021) ⓒ 59
Transformers equipped with recurrent position encodings spontaneously learn grid-cell, band-cell, and place-cell-like representations when trained on sequential spatial prediction tasks. These representations match those recorded empirically in rodent medial entorhinal cortex (Hafting et al., 2005) and hippocampus. The paper's central contribution is the TEM-Transformer (TEM-t), a modified transformer architecture derived by proving a formal mathematical equivalence between the Tolman-Eichenbaum Machine (TEM; Whittington et al., 2020) and standard transformer self-attention: TEM's Hopfield-network memory retrieval is shown to reduce to dot-product attention (without softmax scaling), while TEM's path-integration recurrence over the action-dependent weight matrices **W_a** maps exactly onto learned recurrent position encodings. Three architectural modifications are sufficient to recover biologically observed spatial tuning: restricting keys and queries to position encodings, restricting values to stimulus representations, and making position encodings recurrent and learned.

TEM-t reaches full training performance in under 20,000 gradient steps, whereas TEM requires up to 50,000, and it scales to substantially larger memory stores. Memory neurons in TEM-t, which implement the softmax step of attention, exhibit sparse spatial tuning that remaps randomly across environments, consistent with hippocampal place cell phenomenology.

The paper argues that this equivalence implies that (1) hippocampal indexing theory (Teyler & Rudy, 2007) is mechanistically instantiated by transformer self-attention, (2) learned recurrent position encodings that reflect task structure, rather than fixed sinusoidal encodings, represent a principled and potentially superior alternative for language and other cognitive domains, and (3) neocortical circuits performing language comprehension may implement transformer-like computations, with cortical memory neurons substituting for hippocampus.
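The mechanism summarized above (position encodings updated recurrently by an action matrix, keys and queries drawn from position encodings, values drawn from stimuli) can be illustrated with a toy sketch. This is not the paper's implementation: the ring environment, the fixed rotation matrix standing in for a learned **W_a**, the purely linear update, the inverse temperature `beta`, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def rotation(theta):
    """2x2 rotation matrix; a stand-in for a learned action matrix W_a (assumption)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def softmax(z):
    z = z - np.max(z)          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def retrieve(query, keys, values, beta=10.0):
    """Attention readout: keys and queries are position encodings, values are stimuli.

    beta is a hypothetical inverse temperature that sharpens the softmax.
    """
    weights = softmax(beta * (keys @ query))
    return weights @ values, weights

# Hypothetical environment: a ring of 4 locations and a single action ("step").
W_step = rotation(np.pi / 2)   # assumption: fixed here, learned in TEM-t
stimuli = np.eye(4)            # one-hot stimulus observed at each location

g = np.array([1.0, 0.0])       # initial position encoding
keys, values = [], []
for t in range(4):
    keys.append(g.copy())      # store the current position encoding as a key
    values.append(stimuli[t])  # store the observed stimulus as the value
    g = W_step @ g             # recurrent position update: g_next = W_a @ g

keys, values = np.array(keys), np.array(values)

# After four steps the agent is back at the start, so g matches the first key
# and attention should retrieve the stimulus stored there.
prediction, weights = retrieve(g, keys, values)
predicted_stimulus = int(np.argmax(prediction))
print(predicted_stimulus, np.round(weights, 4))
```

Because the position encoding is carried forward by the action matrix rather than by a time index, revisiting a location reproduces its key, which is what lets the attention step act as a location-addressed memory in this sketch.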


Affiliations (2)

Co-authors (4)

Recent mentions (1)