Relating transformers to models and neural representations of the hippocampal formation

By James C. R. Whittington, Joseph W. Warren, Timothy E.J. Behrens (University College London, University of Oxford)

DOI 10.48550/arxiv.2112.04035 arXiv 2112.04035 OpenAlex W4200634264

Abstract Structural Knowledge · Multiple Cortical Inputs to Hippocampus Extension · Novel Place Cell Metric (Connected Component Firing Mass Ratio) · Hippocampal Formation · TEM-Transformer (TEM-t) · Position-Only Keys/Queries, Stimulus-Only Values Factorization · Hopfield Network Memory Capacity Scaling · Recurrent Position Encodings · Place Cell Remapping · Zero-Shot Generalization

TL;DR

Transformers equipped with recurrent position encodings spontaneously learn grid cells, band cells, and place cell-like representations when trained on sequential spatial prediction tasks—representations that match those recorded empirically in rodent medial entorhinal cortex (Hafting et al., 2005) and hippocampus. The paper's central contribution is the TEM-Transformer (TEM-t), a modified transformer architecture derived by proving a formal mathematical equivalence between the Tolman-Eichenbaum Machine (TEM; Whittington et al., 2020) and standard transformer self-attention: TEM's Hopfield-network memory retrieval is shown to reduce to dot-product attention (without softmax scaling), while TEM's path-integration recurrence over action-dependent weight matrix **W_a** maps exactly onto learned recurrent position encodings. The three architectural modifications—restricting keys/queries to position encodings, restricting values to stimulus representations, and making position encodings recurrently learnable—are sufficient to recover biologically observed spatial tuning. TEM-t reaches full training performance in under 20,000 gradient steps whereas TEM requires up to 50,000, and scales to substantially larger memory stores. Memory neurons in TEM-t, implementing the softmax step of attention, exhibit sparse spatial tuning that remaps randomly across environments, consistent with hippocampal place cell phenomenology. The paper argues this equivalence implies that (1) hippocampal indexing theory (Teyler & Rudy, 2007) is mechanistically instantiated by transformer self-attention, (2) learned recurrent position encodings reflecting task structure—rather than fixed sinusoidal encodings—represent a principled and potentially superior alternative for language and other cognitive domains, and (3) neocortical circuits performing language comprehension may implement transformer-like computations with cortical memory neurons substituting for hippocampus.
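
The claimed reduction of Hopfield-style retrieval to dot-product attention without softmax can be checked numerically. The following is a minimal sketch, assuming the memory matrix is a Hebbian sum of outer products of the stored vectors; the function names, sizes, and random data are hypothetical and are not taken from the paper's code.

```python
import numpy as np

def hebbian_retrieval(q, P):
    """Retrieve through a Hebbian matrix M = sum_tau outer(p_tau, p_tau)."""
    M = sum(np.outer(p, p) for p in P)  # assumed Hebbian storage rule
    return q @ M

def unnormalised_attention(q, P):
    """Dot-product attention without softmax: sum_tau (q . p_tau) p_tau."""
    scores = P @ q          # one score per stored memory
    return scores @ P       # score-weighted sum of the stored memories

rng = np.random.default_rng(0)
P = rng.standard_normal((5, 8))   # 5 stored memories p_tau of dimension 8 (hypothetical sizes)
q = rng.standard_normal(8)        # query vector q_t

same = np.allclose(hebbian_retrieval(q, P), unnormalised_attention(q, P))
```

Both routes compute the same vector because multiplying the query into a sum of outer products is the same as weighting each stored memory by its dot product with the query.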

What to take away

1. A transformer with three specific modifications—keys/queries restricted to recurrent position encodings, values restricted to sensory stimuli, and a learnable action-dependent recurrent update e_{t+1} = σ(e_t W_a)—learns grid cells, band cells, and place-like representations matching empirically recorded hippocampal formation neurons.
2. TEM-t reaches convergent zero-shot spatial prediction performance in fewer than 20,000 gradient steps, whereas the original TEM model requires up to 50,000 gradient steps, representing a greater-than-2.5× improvement in sample efficiency.
3. The Tolman-Eichenbaum Machine's Hopfield-network memory retrieval step reduces algebraically to transformer self-attention without softmax scaling: q_t M_t = Σ_τ [q_t · p_τ] p_τ, establishing a formal mathematical—not merely representational—equivalence between TEM and transformers.
4. TEM's path-integration recurrence g_{t+1} = σ(g_t W_a) is mathematically identical in form to TEM-t's recurrent position encoding update, meaning entorhinal grid-cell representations play the functional role of positional encodings in the transformer framework.
5. Memory neurons in TEM-t, which compute the softmax over dot-products between the current query and stored key vectors, display spatially tuned, place-cell-like firing that remaps randomly between environments, consistent with established hippocampal place cell phenomenology (O'Keefe & Dostrovsky, 1971).
6. The paper replicates grid cells with both linear and ReLU post-transition activation functions (grid score threshold 0.3–0.5 used for classification), and also reproduces band cells (Krupic et al., 2012) as a distinct learned representation class.
7. TEM-t architecturally instantiates hippocampal indexing theory (Teyler & Rudy, 2007): hippocampal memory neurons bind together factorised cortical representations from medial entorhinal cortex (g̃) and lateral entorhinal cortex (x̃), and any subset of those representations can reinstate the others via pattern completion.
8. Extending TEM-t to triple conjunctions requires only n_c additional feature neurons per new brain region while the number of hippocampal memory neurons remains constant, in contrast to naive TEM where the hippocampal neuron count scales multiplicatively with each additional cortical region.
9. An open hypothesis raised is that positional encodings for language transformers should reflect learned grammatical structure inferred on-the-fly rather than fixed sinusoidal encodings, by analogy with how spatial structure is encoded via path integration in TEM-t.
10. As a replicable methodology, the authors train on sequences drawn from multiple 4-connected 2D graph environments sharing identical Euclidean structure but with randomly reassigned (non-unique, one-hot) sensory observations at each node, isolating transition structure as the sole driver of learned representations and enabling zero-shot transfer to novel environments.
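
The modifications in item 1 and the recurrence in item 4 can be sketched as a single prediction loop. This is an illustrative sketch, not the authors' implementation: it assumes identity key/query projections, a ReLU for σ, and hand-set shift matrices standing in for learned W_a; `tem_t_predict`, `beta`, and the toy ring world are hypothetical names and choices.

```python
import numpy as np

def softmax(z):
    z = z - z.max()              # subtract the max for numerical stability
    w = np.exp(z)
    return w / w.sum()

def tem_t_predict(actions, observations, W, e0, beta=1.0):
    """Predict each observation from earlier (position key, stimulus value) pairs."""
    e = np.asarray(e0, dtype=float)
    keys, values, preds = [], [], []
    for a, x in zip(actions, observations):
        # Recurrent position encoding: e_{t+1} = sigma(e_t W_a), with ReLU assumed.
        e = np.maximum(e @ W[a], 0.0)
        if keys:
            # Queries and keys use position only; softmax weights act as "memory neurons".
            attn = softmax(beta * (np.array(keys) @ e))
            # Values use stimuli only.
            preds.append(attn @ np.array(values))
        else:
            preds.append(np.zeros_like(x, dtype=float))
        keys.append(e)
        values.append(np.asarray(x, dtype=float))
    return np.array(preds)

# Toy usage: a ring of 4 locations with one action ("step forward").
n_loc, n_obs = 4, 3
W = {0: np.roll(np.eye(n_loc), 1, axis=1)}   # hand-set shift, a stand-in for a learned W_a
e0 = np.eye(n_loc)[0]
obs_at = [2, 0, 1, 2]                        # non-unique observation id per location
actions = [0] * 8                            # two laps around the ring
positions = [(t + 1) % n_loc for t in range(8)]
observations = np.eye(n_obs)[[obs_at[p] for p in positions]]
preds = tem_t_predict(actions, observations, W, e0, beta=50.0)
```

On the second lap the position encoding matches a stored key, so attention retrieves the stimulus seen at that location on the first lap. In the paper the transition weights are learned from data; fixing them by hand here only keeps the example short.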

Peer brief — for seminar discussion

Whittington et al. (ICLR 2022) ask whether the transformer architecture—developed with no neuroscientific motivation—is formally related to bespoke neuroscience models of the hippocampal formation, and whether this relationship explains why transformers with a small modification learn biological spatial representations. To answer this, they introduce the TEM-Transformer (TEM-t), built by proving a step-by-step algebraic equivalence between the Tolman-Eichenbaum Machine (TEM; Whittington et al., 2020, Cell) and standard self-attention: TEM's Hopfield attractor memory retrieval reduces to dot-product attention, and TEM's path-integration recurrence g_{t+1} = σ(g_t W_a) is structurally identical to a learned recurrent position encoding. Three architectural modifications to a standard causal transformer—restricting keys and queries to position encodings, restricting values to sensory stimuli, and making position encodings recurrently learnable via an action-dependent weight matrix—are sufficient for TEM-t to reproduce grid cells, band cells (Krupic et al., 2012), and place-cell-like representations on a sequential spatial prediction task across multiple 4-connected 2D graph environments with randomly assigned one-hot sensory observations. The load-bearing finding is twofold. First, TEM-t is not merely behaviourally similar to TEM but is a formal mathematical reparameterisation of it, with path-integrated representations g playing the role of positional encodings and Hebbian conjunctive memories p playing the role of key-value pairs. Second, TEM-t achieves the same spatial generalisation as TEM in under 20,000 gradient steps versus up to 50,000 for TEM, a greater-than-2.5× gain in sample efficiency, while also scaling to larger memory stores. 
The memory neurons of TEM-t—computing the softmax over key-query dot products—exhibit spatially localised, randomly remapping activity consistent with hippocampal place cells (O'Keefe & Dostrovsky, 1971), and the model instantiates hippocampal indexing theory (Teyler & Rudy, 2007) by having memory neurons bind together factorised MEC (g̃) and LEC (x̃) representations. An alternative approach would have been to use standard sinusoidal positional encodings and test whether grid-like representations still emerge; the paper's recurrent encoding is the key manipulated variable, and the contrast against fixed encodings is implicit rather than experimentally ablated. The implications extend beyond spatial cognition: because transformer representations predict language-area BOLD responses (Schrimpf et al., 2020) and patients with major hippocampal damage retain language comprehension (Elward & Vargha-Khadem, 2018), the paper proposes that neocortical circuits may implement TEM-t-like computations with cortical memory neurons replacing hippocampus, and that grammatical structure should function as the positional encoding analogue for language—a hypothesis left unverified. A critical reader would push back on the scope of the performance comparison: the authors explicitly acknowledge they used the original TEM codebase without optimisation for speed or sample efficiency, making the 2.5×+ efficiency gain difficult to interpret as an intrinsic architectural advantage rather than an implementation artefact. The paper flags this limitation but still characterises the difference as 'stark.' Additionally, the spatial environments used are highly idealised—one-hot sensory observations with no inter-location correlations, 4-connected graphs—and it is unclear whether TEM-t's representations remain as interpretable or biologically faithful in richer, higher-dimensional sensory settings more representative of actual hippocampal inputs.
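
The training setup described in the brief (shared 4-connected structure, randomly reassigned one-hot observations) can be sketched as a small data generator. This is a sketch under stated assumptions: a bounded rectangular grid without wraparound, a uniform random walk over valid moves, and hypothetical names (`make_environment`, `random_walk`, `ACTIONS`).

```python
import numpy as np

# Four moves on a 2D grid: up, down, left, right (hypothetical action ids).
ACTIONS = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}

def make_environment(width, height, n_obs, rng):
    """Assign a random observation id to every node; repeats are allowed."""
    return rng.integers(0, n_obs, size=(height, width))

def random_walk(env, n_steps, n_obs, rng):
    """Return the actions taken and the one-hot observation seen after each action."""
    height, width = env.shape
    r, c = int(rng.integers(height)), int(rng.integers(width))
    actions, observations = [], []
    for _ in range(n_steps):
        # Only consider moves that stay inside the grid.
        valid = [a for a, (dr, dc) in ACTIONS.items()
                 if 0 <= r + dr < height and 0 <= c + dc < width]
        a = int(rng.choice(valid))
        dr, dc = ACTIONS[a]
        r, c = r + dr, c + dc
        actions.append(a)
        observations.append(np.eye(n_obs)[env[r, c]])
    return np.array(actions), np.array(observations)

rng = np.random.default_rng(0)
env_a = make_environment(4, 4, n_obs=5, rng=rng)   # same structure,
env_b = make_environment(4, 4, n_obs=5, rng=rng)   # different sensory layout
acts, obs = random_walk(env_a, n_steps=20, n_obs=5, rng=rng)
```

Because `env_a` and `env_b` share the same grid and action set and differ only in which observation sits at each node, anything that transfers between them must come from the transition structure.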

Methods (3)

Novel Place Cell Metric (Connected Component Firing Mass Ratio)
Novel evaluation metric introduced in this paper to quantify how place-like a neuron's firing rate map is, based on the largest connected component of the map.
Position-Only Keys/Queries, Stimulus-Only Values Factorization
Key architectural modification restricting queries and keys to position encodings while values depend only on stimuli; extreme version of best-practice insight.
Recurrent Position Encodings
Key modification to transformers proposed in this paper: position encodings generated by a recurrent network trained on action sequences.
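
One way to read the place cell metric listed above is as the share of a neuron's total firing that falls inside its largest connected firing field. The sketch below is an assumption-laden illustration, not the paper's definition: the thresholding rule (`threshold_frac`), the 4-neighbour connectivity, and the function name are all hypothetical.

```python
import numpy as np
from collections import deque

def largest_component_mass_ratio(rate_map, threshold_frac=0.2):
    """Fraction of total firing contained in the largest connected active region."""
    rate_map = np.asarray(rate_map, dtype=float)
    total = rate_map.sum()
    if total <= 0:
        return 0.0
    # Assumed rule: a bin is "active" if it exceeds a fraction of the peak rate.
    active = rate_map > threshold_frac * rate_map.max()
    seen = np.zeros(rate_map.shape, dtype=bool)
    n_rows, n_cols = rate_map.shape
    best = 0.0
    for r0 in range(n_rows):
        for c0 in range(n_cols):
            if not active[r0, c0] or seen[r0, c0]:
                continue
            # Breadth-first search over one 4-connected component.
            mass, queue = 0.0, deque([(r0, c0)])
            seen[r0, c0] = True
            while queue:
                r, c = queue.popleft()
                mass += rate_map[r, c]
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < n_rows and 0 <= cc < n_cols
                            and active[rr, cc] and not seen[rr, cc]):
                        seen[rr, cc] = True
                        queue.append((rr, cc))
            best = max(best, mass)
    return best / total

place_like = np.zeros((5, 5))
place_like[1:3, 1:3] = 1.0                    # one compact firing field
multi_field = np.zeros((5, 5))
multi_field[[0, 0, 4, 4], [0, 4, 0, 4]] = 1.0  # four separated fields
```

Under these assumptions a single compact field scores near 1, while a map with several separated fields, as in a grid cell, scores much lower.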

Frameworks (2)

Multiple Cortical Inputs to Hippocampus Extension
Extension of TEM-t to handle conjunctions of more than two brain regions with linear (not exponential) scaling in hippocampal neuron count.
TEM-Transformer (TEM-t)
The transformer version directly analogous to TEM, introduced in this paper; it reaches full training performance in under 20,000 gradient steps, where TEM requires up to 50,000.
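
The scaling contrast behind the multiple-cortical-inputs extension comes down to a product versus a sum. The arithmetic below is a hedged illustration with hypothetical region sizes and function names; it assumes the naive conjunctive code needs one neuron per combination of cortical features, while the factorised scheme only concatenates feature dimensions.

```python
from math import prod

def conjunctive_neuron_count(region_sizes):
    """Naive conjunctive coding: one neuron per combination of cortical features."""
    return prod(region_sizes)

def concatenated_feature_count(region_sizes):
    """Factorised coding: each new region adds only its own feature dimension."""
    return sum(region_sizes)

two_regions = [30, 20]          # hypothetical sizes for two cortical inputs
three_regions = [30, 20, 10]    # adding a third region with n_c = 10

naive_growth = conjunctive_neuron_count(three_regions) - conjunctive_neuron_count(two_regions)
factorised_growth = concatenated_feature_count(three_regions) - concatenated_feature_count(two_regions)
```

With these made-up sizes the conjunctive count grows by thousands of neurons when the third region is added, while the factorised feature count grows by exactly n_c.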

Findings (7)

Novel place cell metric (largest connected component firing mass ratio) successfully distinguishes TEM-t memory neurons (place cells) from RNN neurons (grid cells)
Methodological validation result confirming the place-cell metric separates cell types in TEM-t.
TEM-t with linear activations learns grid-cell-like position encoding representations in 2D spatial environments
Empirical result showing TEM-t recapitulates entorhinal grid cell representations with linear post-transition activation.
TEM-t requires many fewer data samples than TEM to reach equivalent performance (sample efficiency improvement)
Empirical performance comparison showing TEM-t is a more efficient learner than the original TEM.
TEM-t memory neurons show spatially-tuned firing resembling hippocampal place cells in each environment
Empirical result demonstrating that the sparse softmax activation of memory neurons produces place-cell-like spatial tuning.
TEM-t learns band-cell-like position encoding representations resembling Krupic et al. band cells
Empirical result showing TEM-t position encodings also recapitulate band cells, not just grid cells.
TEM-t learns grid cells in hexagonal 6-connected worlds
Empirical extension showing grid cell learning generalises to non-4-connected spatial environments.
TEM-t requires less time per gradient step than TEM
Empirical computational efficiency result comparing TEM-t to the original TEM implementation.

Claims (8)

The relationship between the brain and transformers is close because of a mathematical relationship between models, not merely because of shared neural representations
Methodological clarification distinguishing this paper's contribution from looser representational similarity claims.
It is necessary to have an understanding of position and the ability to make and retrieve memories to successfully make sensory predictions as fast as possible
Design principle justifying the two-component (RNN + memory network) architecture shared by TEM and TEM-t.
Positional encodings inferred on the fly from previously learned structures would offer a fruitful research direction for language, maths, and logic
Forward-looking interpretive claim about the implications of recurrent position encodings for NLP research.
TEM-t instantiates hippocampal indexing theory by using memory neurons to bind cortical representations across brain regions
Theoretical claim linking the TEM-t architecture to the Teyler-Rudy hippocampal indexing theory.
Position encodings should represent location in a learned structure inferred on the fly rather than fixed sines and cosines
Novel interpretive claim about position encodings inspired by the TEM-transformer correspondence.
Transformer memory neurons resemble hippocampal place cells due to sparse softmax activation producing spatial tuning
Interpretation of why memory neurons in the biologically-instantiated transformer architecture acquire place-cell-like properties.
TEM memory retrieval is mathematically equivalent to transformer self-attention without softmax
Central theoretical claim: a single step of TEM attractor dynamics equals a dot-product attention operation, making TEM a special case of a transformer.
TEM's path-integration representation g plays the role of position encodings in transformers
Key structural correspondence claim linking the neuroscience model's spatial representation to the ML concept of position encoding.

Hypotheses (3)

Short-term cortical memory neurons may suffice for transformer-like computation in cortex without hippocampal involvement for non-long-term-memory tasks
Speculative hypothesis about how cortical transformer instantiation avoids requiring hippocampus.
Grammar as Positional Encoding for Language
Hypothesis that in language tasks, the abstract structure encoded in positional encodings corresponds to grammatical structure.
Cortex as a Transformer
Hypothesis that neocortical circuits beyond hippocampus may implement transformer-like computations for language and other domains.

Questions (4)

Are hippocampal architecture and bespoke neuroscience models capable of the general purpose computations studied in machine learning?
Motivating question from introduction that the TEM-transformer equivalence helps answer affirmatively.
Would place-like representations emerge in memory neurons for activation functions other than softmax?
Open empirical question left for future work about robustness of place cell emergence.
What takes the role of memory neurons if not hippocampus in cortical transformer implementations?
Open question about cortical instantiation of transformer-like memory when hippocampus is not involved.
What is the analogue of spatial positional encodings for higher order tasks such as language?
Open question raised in Discussion about extending TEM-t principles beyond spatial navigation.

Original abstract

Many deep neural network architectures loosely based on brain networks have recently been shown to replicate neural firing patterns observed in the brain. One of the most exciting and promising novel architectures, the Transformer neural network, was developed without the brain in mind. In this work, we show that transformers, when equipped with recurrent position encodings, replicate the precisely tuned spatial representations of the hippocampal formation; most notably place and grid cells. Furthermore, we show that this result is no surprise since it is closely related to current hippocampal models from neuroscience. We additionally show the transformer version offers dramatic performance gains over the neuroscience version. This work continues to bind computations of artificial and brain networks, offers a novel understanding of the hippocampal-cortical interaction, and suggests how wider cortical areas may perform complex tasks beyond current neuroscience models such as language comprehension.

Related work: refs + corpus + external arXiv

Cited / in-corpus / arXiv badges show which signals surfaced each row. Multi-source rows weighted higher.

On the Geometry of Positional Encodings in Transformers
Giansalvo Cirrincione
2026
≈ 85%
From Transformer to Biology: A Hierarchical Model for Attention in Complex Problem-Solving
Zhongqiao Lin, Yunwei Li, Tianming Yang
2025
≈ 85%
Recurrent Action Transformer with Memory
Egor Cherepanov, Alexey Staroverov, Alexey K. Kovalev and Aleksandr I. Panov
2026
≈ 84%
Is Random Attention Sufficient for Sequence Modeling? Disentangling Trainable Components in the Transformer
Yihe Dong, Lorenzo Noci, Mikhail Khodak, Mufan Li
2025
≈ 84%
Transformer Dynamics: A neuroscientific approach to interpretability of large language models
Jesseba Fernando and Grigori Guitchounts
2025
≈ 83%
Birth of a Transformer: A Memory Viewpoint
Alberto Bietti, Vivien Cabannes, Diane Bouchacourt, Herve Jegou, Leon Bottou
2023
≈ 83%
Disentangling Recall and Reasoning in Transformer Models through Layer-wise Attention and Activation Analysis
Harshwardhan Fartale, Ashish Kattamuri, Rahul Raja, Arpita Vats, Ishita Prasad, Akshata Kishore Moharir
2026
≈ 83%
Attention via Synaptic Plasticity is All You Need: A Biologically Inspired Spiking Neuromorphic Transformer
Kallol Mondal (1 and 2) and Ankush Kumar (2) ((1) Department of Electronics and Communication Engineering, National Institute of Technology Allahabad, Prayagraj, (2) Centre for Nanotechnology, Indian Institute of Technology Roorkee)
2025
≈ 83%
Higher Embedding Dimension Creates a Stronger World Model for a Simple Sorting Task
Brady Bhalla, Honglu Fan, Nancy Chen, Tony Yue YU
2025
≈ 83%
Self-Attention Limits Working Memory Capacity of Transformer-Based Models
Dongyu Gong and Hantao Zhang
2024
≈ 83%
Can Transformers Learn to Solve Problems Recursively?
Shizhuo Dylan Zhang, Curt Tigges, Stella Biderman, Maxim Raginsky, Talia Ringer
2023
≈ 83%
Learning Transformer-based World Models with Contrastive Predictive Coding
Maxime Burchi and Radu Timofte
2025
≈ 83%
Learning Linear Attention in Polynomial Time
Morris Yau, Ekin Akyürek, Jiayuan Mao, Joshua B. Tenenbaum, Stefanie Jegelka, Jacob Andreas
2025
≈ 83%
Transformer-based World Models Are Happy With 100k Interactions
Jan Robine, Marc Höftmann, Tobias Uelwer, Stefan Harmeling
2023
≈ 83%
Constructing Interpretable Features from Compositional Neuron Groups
Or Shafran, Atticus Geiger, Mor Geva
2026
≈ 82%
A Mathematical Framework for Transformer Circuits
in corpus
2021
≈ 81%
Model Alignment Search
in corpus
2025
≈ 81%
Janus Information Flow Transformers 2025
in corpus
≈ 80%
Developmental Bioelectricity: the cognitive glue enabling evolutionary scaling from physiology to mind
in corpus
2023
≈ 80%
The Platonic Representation Hypothesis
in corpus
2024
≈ 80%
Self-Improvising Memory: A Perspective on Memories as Agential, Dynamically Reinterpreting Cognitive Glue
in corpus
2024
≈ 80%
The computational boundary of a 'self': developmental bioelectricity drives multicellularity and scale-free cognition
in corpus
2019
≈ 80%
Mechanistic Interpretability of EEG Foundation Models via Sparse Autoencoders
in corpus
2026
≈ 79%
Addressing divergent representations from causal interventions on neural networks
in corpus
2025
≈ 79%
Anima Labs Phenomenology Pt1
in corpus
≈ 79%
The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets
in corpus
2023
≈ 79%
Technological Approach to Mind Everywhere: An Experimentally-Grounded Framework for Understanding Diverse Bodies and Minds
in corpus
2022
≈ 79%
Simulators — LessWrong
in corpus
≈ 79%
The hippocampus as a spatial map. Preliminary evidence from unit activity in the freely-moving rat
cited
1971
≈ 69%
Dense associative memory for pattern recognition
cited
2016
≈ 67%
