Recurrent neural networks learn to store and generate sequences using non-linear representations

By Róbert Csordás · Christopher Potts · Christopher D Manning · Atticus Geiger

DOI: 10.48550/arxiv.2408.10920 · arXiv: 2408.10920

Related work: refs + corpus + external arXiv

Cited / in-corpus / arXiv badges show which signals surfaced each row. Multi-source rows are weighted higher.

Bayesian Neural Networks: An Introduction and Survey
Ethan Goan and Clinton Fookes
2026
≈ 79%
Recurrent World Models Facilitate Policy Evolution
David Ha and Jürgen Schmidhuber
2018
≈ 78%
Neural networks leverage nominally quantum and post-quantum representations
Paul M. Riechers and Thomas J. Elliott and Adam S. Shai
2025
≈ 78%
Discrete Latent Structure in Neural Networks
Caio F. Corro, Nikita Nangia, Tsvetomila Mihaylova, André F. T. Martins, Vlad Niculae
2026
≈ 78%
Representation Learning on a Random Lattice
Aryeh Brill
2025
≈ 78%
A unified theory of learning
Taisuke Katayose
2022
≈ 78%
Mechanistic Interpretability of RNNs emulating Hidden Markov Models
Michele Viscione, Lucas Pompe, Benjamin F Grewe, Valerio Mante, Elia Torre
2025
≈ 77%
Dynamical similarity analysis can identify compositional dynamics developing in RNNs
Michał Wójcik, Jascha Achterberg, Rui Ponte Costa, Quentin Guilhot
2024
≈ 76%
Lifelong Neural Predictive Coding: Learning Cumulatively Online without Forgetting
Ankur Mali, Daniel Kifer, C. Lee Giles, Alexander Ororbia
2022
≈ 76%
A Cognitive Architecture for Machine Consciousness and Artificial Superintelligence: Thought Is Structured by the Iterative Updating of Working Memory
Jared Edward Reser
2024
≈ 76%
Beyond Geometry: Comparing the Temporal Structure of Computation in Neural Circuits with Dynamical Similarity Analysis
Adam Eisen, Leo Kozachkov, Ila Fiete, Mitchell Ostrow
2023
≈ 76%
Identifying Connectivity Distributions from Neural Dynamics Using Flows
Ulises Pereira-Obilinovic, Yiliu Wang, Eric Shea-Brown, Uygar Sümbül, Timothy Doyeon Kim
2026
≈ 76%
Artificial Intelligence Software Structured to Simulate Human Working Memory, Mental Imagery, and Mental Continuity
Jared Edward Reser
2026
≈ 75%
On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models
Juergen Schmidhuber
2015
≈ 75%
Meta Neural Coordination
Yuwei Sun
2023
≈ 75%
Learning without neurons in physical systems
in corpus
2022
≈ 74%
The Non-Linear Representation Dilemma: Is Causal Abstraction Enough for Mechanistic Interpretability?
in corpus
2025
≈ 74%
Relating transformers to models and neural representations of the hippocampal formation
in corpus
2021
≈ 73%
The Platonic Representation Hypothesis
in corpus
2024
≈ 71%
Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations
in corpus
2023
≈ 71%
Arithmetic in the Wild: Llama uses Base-10 Addition to Reason About Cyclic Concepts
in corpus
2026
≈ 71%
Self-Improvising Memory: A Perspective on Memories as Agential, Dynamically Reinterpreting Cognitive Glue
in corpus
2024
≈ 71%
The Causally Emergent Alignment Hypothesis: Causal Emergence Aligns with and Predicts Final Reward in Reinforcement Learning Agents
in corpus
2026
≈ 70%
Natural Language Autoencoders Produce Unsupervised Explanations of LLM Activations
in corpus
≈ 70%
The World Inside Neural Networks
in corpus
2026
≈ 70%
Differentiable Logic Cellular Automata: From Game of Life to pattern generation with learned recurrent circuits
in corpus
≈ 70%
Model Alignment Search
in corpus
2025
≈ 69%

Cited by (4)

Addressing divergent representations from causal interventions on neural networks
Causal intervention methods central to mechanistic interpretability—including activation patching, mean-difference vector patching, Sparse Autoencoders, and Distributed Alignment Search (DAS)—systemat
The Non-Linear Representation Dilemma: Is Causal Abstraction Enough for Mechanistic Interpretability?
Under arbitrarily powerful alignment maps, causal abstraction becomes vacuous: any neural network can be perfectly mapped to any algorithm, a result proven formally in Theorem 1 under five mild assump
Manifold Steering Reveals the Shared Geometry of Neural Network Representation and Behavior
Manifold steering — intervening on model activations along paths constrained to lie on a learned activation manifold M_h rather than along Euclidean linear directions — produces behavioral trajectorie
Arithmetic in the Wild: Llama uses Base-10 Addition to Reason About Cyclic Concepts
Llama-3.1-8B solves cyclic arithmetic (e.g., "what month is six months after August?") not by performing modular addition in the period of the cyclic concept (12 for months, 7 for days of the week) as