2021
paper:doi-10-48550-arxiv-2112-04035

Relating transformers to models and neural representations of the hippocampal formation

TL;DR

Transformers equipped with recurrent position encodings spontaneously learn grid cells, band cells, and place cell-like representations when trained on sequential spatial prediction tasks—representations that match those recorded empirically in rodent medial entorhinal cortex (Hafting et al., 2005) and hippocampus. The paper's central contribution is the TEM-Transformer (TEM-t), a modified transformer architecture derived by proving a formal mathematical equivalence between the Tolman-Eichenbaum Machine (TEM; Whittington et al., 2020) and standard transformer self-attention: TEM's Hopfield-network memory retrieval is shown to reduce to dot-product attention (without softmax scaling), while TEM's path-integration recurrence over action-dependent weight matrix **W_a** maps exactly onto learned recurrent position encodings. The three architectural modifications—restricting keys/queries to position encodings, restricting values to stimulus representations, and making position encodings recurrently learnable—are sufficient to recover biologically observed spatial tuning. TEM-t reaches full training performance in under 20,000 gradient steps whereas TEM requires up to 50,000, and scales to substantially larger memory stores. Memory neurons in TEM-t, implementing the softmax step of attention, exhibit sparse spatial tuning that remaps randomly across environments, consistent with hippocampal place cell phenomenology. The paper argues this equivalence implies that (1) hippocampal indexing theory (Teyler & Rudy, 2007) is mechanistically instantiated by transformer self-attention, (2) learned recurrent position encodings reflecting task structure—rather than fixed sinusoidal encodings—represent a principled and potentially superior alternative for language and other cognitive domains, and (3) neocortical circuits performing language comprehension may implement transformer-like computations with cortical memory neurons substituting for hippocampus.
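The three modifications can be made concrete with a small NumPy sketch of a single TEM-t-style prediction step. This is an illustration under stated assumptions, not the authors' implementation: the function name `temt_step`, the projection matrices `W_q`, `W_k`, `W_v`, the choice of ReLU for σ, the 1/sqrt(d) score scaling, and all dimensions are hypothetical choices made for the example.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def temt_step(e_t, W_a, past_e, past_x, W_q, W_k, W_v):
    """One illustrative TEM-t-style step (names and shapes are assumptions).

    e_t    : current position encoding, shape (d_e,)
    W_a    : weight matrix for the action just taken, shape (d_e, d_e)
    past_e : position encodings of stored memories, shape (T, d_e)
    past_x : stimulus representations of stored memories, shape (T, n_obs)
    """
    # Modification 3: recurrent, action-dependent position encoding.
    # sigma is taken to be ReLU here (an assumption for the sketch).
    e_next = np.maximum(e_t @ W_a, 0.0)
    # Modification 1: queries and keys come only from position encodings.
    q = e_next @ W_q
    K = past_e @ W_k
    # Modification 2: values come only from stimulus representations.
    V = past_x @ W_v
    # Softmax over query-key dot products: the "memory neuron" activity.
    weights = softmax(K @ q / np.sqrt(q.size))
    prediction = weights @ V
    return e_next, weights, prediction

rng = np.random.default_rng(0)
d_e, d_k, n_obs, T = 8, 8, 5, 6
W_actions = rng.normal(scale=0.5, size=(4, d_e, d_e))  # one matrix per action
W_q = rng.normal(size=(d_e, d_k))
W_k = rng.normal(size=(d_e, d_k))
W_v = np.eye(n_obs)  # identity value projection, for readability only

# Build a short history by running the same recurrence over random actions.
e = np.abs(rng.normal(size=d_e))
past_e, past_x = [], []
for _ in range(T):
    a = rng.integers(0, 4)
    e = np.maximum(e @ W_actions[a], 0.0)
    past_e.append(e)
    past_x.append(np.eye(n_obs)[rng.integers(0, n_obs)])
past_e, past_x = np.stack(past_e), np.stack(past_x)

e_next, weights, prediction = temt_step(
    e, W_actions[2], past_e, past_x, W_q, W_k, W_v
)
```

With one-hot stimuli and an identity value projection, `prediction` is a convex combination of previously seen observations, so it can be read as a distribution over which stimulus the model expects at the next location.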

What to take away

  1. A transformer with three specific modifications (keys/queries restricted to recurrent position encodings, values restricted to sensory stimuli, and a learnable action-dependent recurrent update e_{t+1} = σ(e_t W_a)) learns grid cells, band cells, and place-like representations matching empirically recorded hippocampal formation neurons.
  2. TEM-t reaches convergent zero-shot spatial prediction performance in fewer than 20,000 gradient steps, whereas the original TEM model requires up to 50,000 gradient steps, an improvement in sample efficiency of roughly 2.5× at those bounds.
  3. The Tolman-Eichenbaum Machine's Hopfield-network memory retrieval step reduces algebraically to transformer self-attention without softmax scaling: q_t M_t = Σ_τ [q_t · p_τ] p_τ. This establishes a formal mathematical, not merely representational, equivalence between TEM and transformers.
  4. TEM's path-integration recurrence g_{t+1} = σ(g_t W_a) is mathematically identical in form to TEM-t's recurrent position encoding update, meaning entorhinal grid-cell representations play the functional role of positional encodings in the transformer framework.
  5. Memory neurons in TEM-t, which compute the softmax over dot products between the current query and stored key vectors, display spatially tuned, place-cell-like firing that remaps randomly between environments, consistent with established hippocampal place cell phenomenology (O'Keefe & Dostrovsky, 1971).
  6. The paper replicates grid cells with both linear and ReLU post-transition activation functions (grid score threshold 0.3–0.5 used for classification), and also reproduces band cells (Krupic et al., 2012) as a distinct learned representation class.
  7. TEM-t architecturally instantiates hippocampal indexing theory (Teyler & Rudy, 2007): hippocampal memory neurons bind together factorised cortical representations from medial entorhinal cortex (g̃) and lateral entorhinal cortex (x̃), and any subset of those representations can reinstate the others via pattern completion.
  8. Extending TEM-t to triple conjunctions requires only n_c additional feature neurons per new brain region while the number of hippocampal memory neurons remains constant, in contrast to naive TEM, where the hippocampal neuron count scales multiplicatively with each additional cortical region.
  9. An open hypothesis is that positional encodings for language transformers should reflect learned grammatical structure inferred on the fly rather than fixed sinusoidal encodings, by analogy with how spatial structure is encoded via path integration in TEM-t.
  10. As a replicable methodology, the authors train on sequences drawn from multiple 4-connected 2D graph environments that share identical Euclidean structure but have randomly reassigned (non-unique, one-hot) sensory observations at each node. This isolates transition structure as the sole driver of learned representations and enables zero-shot transfer to novel environments.
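The algebraic identity in takeaway 3 can be checked numerically. The sketch below assumes a Hebbian memory built as a sum of outer products of stored patterns and a single linear retrieval step (no attractor iterations, no nonlinearity); the function names and dimensions are illustrative and do not come from the paper's code.

```python
import numpy as np

def hebbian_store(P):
    """Hebbian memory matrix M = sum_tau p_tau^T p_tau (rows of P are memories)."""
    return P.T @ P

def hopfield_retrieve(q, M):
    """Single linear retrieval step: q M."""
    return q @ M

def linear_attention(q, P):
    """Dot-product attention without softmax: sum_tau (q . p_tau) p_tau."""
    scores = P @ q        # one dot product per stored memory
    return scores @ P     # weighted sum of the stored memories

rng = np.random.default_rng(0)
T, d = 7, 16                      # number of memories, pattern dimension
P = rng.normal(size=(T, d))       # stored patterns p_tau
q = rng.normal(size=d)            # query q_t

M = hebbian_store(P)
hopfield_out = hopfield_retrieve(q, M)
attention_out = linear_attention(q, P)
same = np.allclose(hopfield_out, attention_out)
```

The two outputs agree because matrix multiplication is associative: q (PᵀP) = (q Pᵀ) P. Inserting a softmax over `scores` before the weighted sum turns the second form into standard self-attention, which is the step the summary attributes to TEM-t's memory neurons.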

Peer brief — for seminar discussion

Whittington et al. (ICLR 2022) ask whether the transformer architecture, developed with no neuroscientific motivation, is formally related to bespoke neuroscience models of the hippocampal formation, and whether this relationship explains why transformers with a small modification learn biological spatial representations. To answer this, they introduce the TEM-Transformer (TEM-t), built by proving a step-by-step algebraic equivalence between the Tolman-Eichenbaum Machine (TEM; Whittington et al., 2020, Cell) and standard self-attention: TEM's Hopfield attractor memory retrieval reduces to dot-product attention, and TEM's path-integration recurrence g_{t+1} = σ(g_t W_a) is structurally identical to a learned recurrent position encoding. Three architectural modifications to a standard causal transformer (restricting keys and queries to position encodings, restricting values to sensory stimuli, and making position encodings recurrently learnable via an action-dependent weight matrix) are sufficient for TEM-t to reproduce grid cells, band cells (Krupic et al., 2012), and place-cell-like representations on a sequential spatial prediction task across multiple 4-connected 2D graph environments with randomly assigned one-hot sensory observations.

The load-bearing finding is twofold. First, TEM-t is not merely behaviourally similar to TEM but is a formal mathematical reparameterisation of it, with path-integrated representations g playing the role of positional encodings and Hebbian conjunctive memories p playing the role of key-value pairs. Second, TEM-t achieves the same spatial generalisation as TEM in under 20,000 gradient steps versus up to 50,000 for TEM, a gain of roughly 2.5× in sample efficiency at those bounds, while also scaling to larger memory stores. The memory neurons of TEM-t, which compute the softmax over key-query dot products, exhibit spatially localised, randomly remapping activity consistent with hippocampal place cells (O'Keefe & Dostrovsky, 1971), and the model instantiates hippocampal indexing theory (Teyler & Rudy, 2007) by having memory neurons bind together factorised MEC (g̃) and LEC (x̃) representations.

An alternative approach would have been to use standard sinusoidal positional encodings and test whether grid-like representations still emerge; the paper's recurrent encoding is the key manipulated variable, and the contrast against fixed encodings is implicit rather than experimentally ablated. The implications extend beyond spatial cognition: because transformer representations predict language-area BOLD responses (Schrimpf et al., 2020) and patients with major hippocampal damage retain language comprehension (Elward & Vargha-Khadem, 2018), the paper proposes that neocortical circuits may implement TEM-t-like computations with cortical memory neurons replacing hippocampus, and that grammatical structure should function as the positional encoding analogue for language. That hypothesis is left unverified.

A critical reader would push back on the scope of the performance comparison: the authors explicitly acknowledge they used the original TEM codebase without optimisation for speed or sample efficiency, which makes the roughly 2.5× efficiency gain difficult to interpret as an intrinsic architectural advantage rather than an implementation artefact. The paper flags this limitation but still characterises the difference as 'stark.' Additionally, the spatial environments used are highly idealised (one-hot sensory observations with no inter-location correlations, 4-connected graphs), and it is unclear whether TEM-t's representations remain as interpretable or biologically faithful in richer, higher-dimensional sensory settings more representative of actual hippocampal inputs.
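For readers who want to reproduce the task setup discussed in the brief, the following is a minimal sketch of the data-generating process: several grid environments with the same 4-connected structure, each with its own random assignment of observations to nodes, explored by a random walk. The bounded (non-toroidal) grid, the uniform random-walk policy, the helper names, and all sizes are assumptions made for the illustration; the summary does not specify them.

```python
import numpy as np

# Four actions on a 4-connected grid: (row delta, column delta).
ACTIONS = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}

def make_environment(width, height, n_obs, rng):
    """Assign a random observation id to every node (ids may repeat)."""
    return rng.integers(0, n_obs, size=(height, width))

def random_walk(env, n_obs, n_steps, rng):
    """Sample (actions, one-hot observations) from a uniform random walk.

    Moves that would leave the grid are excluded (a bounded-grid assumption).
    """
    height, width = env.shape
    r, c = int(rng.integers(0, height)), int(rng.integers(0, width))
    actions, observations = [], []
    for _ in range(n_steps):
        valid = [a for a, (dr, dc) in ACTIONS.items()
                 if 0 <= r + dr < height and 0 <= c + dc < width]
        a = int(rng.choice(valid))
        dr, dc = ACTIONS[a]
        r, c = r + dr, c + dc
        actions.append(a)
        observations.append(int(env[r, c]))
    one_hot = np.eye(n_obs)[observations]   # shape (n_steps, n_obs)
    return np.array(actions), one_hot

rng = np.random.default_rng(0)
n_obs, n_steps = 10, 50
# Same transition structure, different sensory assignments per environment.
envs = [make_environment(5, 5, n_obs, rng) for _ in range(3)]
actions, obs = random_walk(envs[0], n_obs, n_steps, rng)
```

Because only the observation assignment changes between environments, anything a model carries over to a new environment must come from the shared transition structure, which is the property the summary credits for zero-shot transfer.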

Original abstract

Many deep neural network architectures loosely based on brain networks have recently been shown to replicate neural firing patterns observed in the brain. One of the most exciting and promising novel architectures, the Transformer neural network, was developed without the brain in mind. In this work, we show that transformers, when equipped with recurrent position encodings, replicate the precisely tuned spatial representations of the hippocampal formation; most notably place and grid cells. Furthermore, we show that this result is no surprise since it is closely related to current hippocampal models from neuroscience. We additionally show the transformer version offers dramatic performance gains over the neuroscience version. This work continues to bind computations of artificial and brain networks, offers a novel understanding of the hippocampal-cortical interaction, and suggests how wider cortical areas may perform complex tasks beyond current neuroscience models such as language comprehension.
