paper:doi-10-3389-fncom-2015-00136
Dopamine, reward learning, and active inference
Original abstract
Temporal difference learning models propose phasic dopamine signalling encodes reward prediction errors that drive learning. This is supported by studies where optogenetic stimulation of dopamine neurons can stand in lieu of actual reward. Nevertheless, a large body of data also shows that dopamine is not necessary for learning, and that dopamine depletion primarily affects task performance. We offer a resolution to this paradox based on an hypothesis that dopamine encodes the precision of beliefs about alternative actions, and thus controls the outcome-sensitivity of behaviour. We extend an active inference scheme for solving Markov decision processes to include learning, and show that simulated dopamine dynamics strongly resemble those actually observed during instrumental conditioning. Furthermore, simulated dopamine depletion impairs performance but spares learning, while simulated excitation of dopamine neurons drives reward learning, through aberrant inference about outcome states. Our formal approach provides a novel and parsimonious reconciliation of apparently divergent experimental findings.
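The abstract's central idea, that precision controls how sensitive behaviour is to expected outcomes, can be illustrated with a minimal sketch. It assumes that beliefs about actions are a softmax over action values scaled by a precision parameter (an inverse temperature). The function name `policy_probabilities`, the two-action values, and the precision settings are hypothetical choices for illustration; this is not the paper's full active inference scheme.

```python
import math


def policy_probabilities(values, precision):
    """Softmax over action values, scaled by precision (inverse temperature).

    Assumption for illustration: higher precision means more confident,
    more outcome-sensitive action selection.
    """
    scaled = [precision * v for v in values]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical learned values for two actions; the first is better.
values = [1.0, 0.0]

# High precision: choices track the learned values closely.
high = policy_probabilities(values, precision=4.0)

# Low precision (a stand-in for simulated depletion): the same learned
# values produce near-random choices.
low = policy_probabilities(values, precision=0.25)

print(high, low)
```

In this sketch the learned values are identical in both cases and only the precision differs, which is one way to picture how performance could degrade while learning is spared.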
Related work: refs, corpus, and external arXiv
Cited / in-corpus / arXiv badges show which signals surfaced each row. Multi-source rows are weighted higher.
- Learning Perception and Planning with Deep Active Inference. Ozan Çatal, Tim Verbelen, Johannes Nauta, Cedric De Boom, Bart Dhoedt. 2020. ≈ 80%
- Reinforcement Learning through Active Inference. Alexander Tschantz, Beren Millidge, Anil K. Seth, Christopher L. Buckley. 2020. ≈ 78%
- Prior Preference Learning from Experts: Designing a Reward with Active Inference. Jin Young Shin, Cheolhyeong Kim, Hyung Ju Hwang. 2021. ≈ 77%
- Expanding the Active Inference Landscape: More Intrinsic Motivations in the Perception-Action Loop. Martin Biehl (1), Christian Guckelsberger (2), Christoph Salge (3 and 4), Simón C. Smith (4 and 5), Daniel Polani (4). Affiliations: (1) Araya Inc., Tokyo, Japan; (2) Computational Creativity Group, Department of Computing, Goldsmiths, University of London, London, UK; (3) Game Innovation Lab, Department of Computer Science and Engineering, New York University, New York City, NY, USA; (4) Sepia Lab, Adaptive Systems Research Group, Department of Computer Science, University of Hertfordshire, Hatfield, UK; (5) Institute of Perception, Action and Behaviour, School of Informatics, The University of Edinburgh, UK. 2018. ≈ 77%
- Active inference and artificial reasoning. Karl Friston, Lancelot Da Costa, Alexander Tschantz, Conor Heins, Christopher Buckley, Tim Verbelen, Thomas Parr. 2025. ≈ 77%
- Active inference: demystified and compared. Noor Sajid, Philip J. Ball, Thomas Parr, Karl J. Friston. 2021. ≈ 77%
- Active inference: demystified and compared. In corpus. 2021. ≈ 76%
- Active Inference on the Edge: A Design Study. Boris Sedlak, Victor Casamayor Pujol, Praveen Kumar Donta, Schahram Dustdar. 2023. ≈ 76%
- Active Inference or Control as Inference? A Unifying View. Joe Watson, Abraham Imohiosen, Jan Peters. 2020. ≈ 76%
- Active inference for action-unaware agents. Filippo Torresan, Keisuke Suzuki, Ryota Kanai, Manuel Baltieri. 2025. ≈ 76%
- Distributional Active Inference. Abdullah Akgül, Gulcin Baykal, Manuel Haußmann, Mustafa Mert Çelikok, Melih Kandemir. 2026. ≈ 76%
- Deconstructing deep active inference. Théophile Champion, Marek Grześ, Lisa Bonheme, Howard Bowman. 2023. ≈ 76%
- Active Inference: A Process Theory. In corpus. 2017. ≈ 76%
- Reward Maximisation through Discrete Active Inference. Lancelot Da Costa, Noor Sajid, Thomas Parr, Karl Friston, Ryan Smith. 2022. ≈ 76%
- Why Learning Requires Feeling. In corpus. 2026. ≈ 71%
- Active Inference, Curiosity and Insight. In corpus. 2017. ≈ 71%
- Learning without neurons in physical systems. In corpus. 2022. ≈ 69%
- Mechanistic Knobs in LLMs: Retrieving and Steering High-Order Semantic Features via Sparse Autoencoders. In corpus. 2026. ≈ 67%
Cited by (2)
- Active Inference: A Process Theory
A single variational principle—minimizing variational free energy via gradient descent on a Markov decision process (MDP) generative model—is sufficient to derive neuronal dynamics that reproduce, wit
- Active inference on discrete state-spaces: a synthesis
Active inference on discrete state-spaces, formalized as partially observable Markov decision processes (POMDPs) with likelihood matrix A, transition matrix B, and prior D, unifies perception, plannin