Reinforcement Learning or Active Inference?
DOI: 10.1371/journal.pone.0006421
Original abstract
This paper questions the need for reinforcement learning or control theory when optimising behaviour. We show that it is fairly simple to teach an agent complicated and adaptive behaviours using a free-energy formulation of perception. In this formulation, agents adjust their internal states and sampling of the environment to minimize their free-energy. Such agents learn causal structure in the environment and sample it in an adaptive and self-supervised fashion. This results in behavioural policies that reproduce those optimised by reinforcement learning and dynamic programming. Critically, we do not need to invoke the notion of reward, value or utility. We illustrate these points by solving a benchmark problem in dynamic programming; namely the mountain-car problem, using active perception or inference under the free-energy principle. The ensuing proof-of-concept may be important because the free-energy formulation furnishes a unified account of both action and perception and may speak to a reappraisal of the role of dopamine in the brain.
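To make the benchmark named in the abstract concrete, here is a minimal sketch of the mountain-car problem in its common textbook discrete-time form. The constants, the goal position `GOAL_X`, and the helper names `step`, `run`, `always_right` and `pump` are assumptions made for illustration; the paper uses its own formulation, and nothing below implements free-energy minimisation. The sketch only shows why the task is non-trivial: the motor is too weak to climb the hill directly, so the car must first move away from the goal to build momentum.

```python
import math

GOAL_X = 0.5  # assumed goal position, taken from the textbook variant


def step(x, v, a):
    """Advance the textbook mountain-car dynamics by one step (a in [-1, 1])."""
    v += 0.001 * a - 0.0025 * math.cos(3 * x)  # motor force minus gravity along the slope
    v = max(-0.07, min(0.07, v))               # clip velocity to its limits
    x += v
    if x < -1.2:                               # inelastic wall on the left edge
        x, v = -1.2, 0.0
    return x, v


def run(policy, x=-0.5, v=0.0, max_steps=500):
    """Return the number of steps taken to reach the goal, or None if it is never reached."""
    for t in range(1, max_steps + 1):
        x, v = step(x, v, policy(x, v))
        if x >= GOAL_X:
            return t
    return None


def always_right(x, v):
    """Naive policy: push towards the goal at full throttle."""
    return 1.0


def pump(x, v):
    """Hand-written heuristic: push in the direction of travel to build up energy."""
    return 1.0 if v >= 0 else -1.0


if __name__ == "__main__":
    print("always_right:", run(always_right))
    print("pump:", run(pump))
```

In this sketch the naive policy stays trapped in the valley, while the hand-written momentum heuristic reaches the goal. The abstract's claim is that an agent can arrive at comparable behaviour without any explicit notion of reward or value.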
Related work: refs + corpus + external arXiv
Cited / in-corpus / arXiv badges show which signals surfaced each row. Rows surfaced by more than one source are weighted higher.
- Reinforcement Learning through Active Inference. Alexander Tschantz, Beren Millidge, Anil K. Seth, Christopher L. Buckley. 2020. ≈ 83%
- Active inference: demystified and compared. Noor Sajid, Philip J. Ball, Thomas Parr, Karl J. Friston. 2021. ≈ 82%
- Distributional Active Inference. Abdullah Akgül, Gulcin Baykal, Manuel Haußmann, Mustafa Mert Çelikok, Melih Kandemir. 2026. ≈ 81%
- Active Inference and Reinforcement Learning: A unified inference on continuous state and action spaces under partial observability. Parvin Malekzadeh, Konstantinos N. Plataniotis. 2024. ≈ 80%
- Active inference: demystified and compared. In corpus. 2021. ≈ 80%
- Learning Perception and Planning with Deep Active Inference. Ozan Çatal, Tim Verbelen, Johannes Nauta, Cedric De Boom, Bart Dhoedt. 2020. ≈ 80%
- Prior Preference Learning from Experts: Designing a Reward with Active Inference. Jin young Shin, Cheolhyeong Kim, Hyung Ju Hwang. 2021. ≈ 79%
- Active Inference or Control as Inference? A Unifying View. Joe Watson, Abraham Imohiosen, Jan Peters. 2020. ≈ 78%
- Bayesian policy selection using active inference. Ozan Çatal, Johannes Nauta, Tim Verbelen, Pieter Simoens, Bart Dhoedt. 2019. ≈ 78%
- Expanding the Active Inference Landscape: More Intrinsic Motivations in the Perception-Action Loop. Martin Biehl (1), Christian Guckelsberger (2), Christoph Salge (3 and 4), Simón C. Smith (4 and 5), Daniel Polani (4). Affiliations: (1) Araya Inc., Tokyo, Japan; (2) Computational Creativity Group, Department of Computing, Goldsmiths, University of London, London, UK; (3) Game Innovation Lab, Department of Computer Science and Engineering, New York University, New York City, NY, USA; (4) Sepia Lab, Adaptive Systems Research Group, Department of Computer Science, University of Hertfordshire, Hatfield, UK; (5) Institute of Perception, Action and Behaviour, School of Informatics, The University of Edinburgh, UK. 2018. ≈ 77%
- Online reinforcement learning with sparse rewards through an active inference capsule. Alejandro Daniel Noel (1), Charel van Hoof (1), Beren Millidge (2). Affiliations: (1) Delft University of Technology; (2) University of Oxford. 2021. ≈ 77%
- Deconstructing deep active inference. Théophile Champion, Marek Grześ, Lisa Bonheme, Howard Bowman. 2023. ≈ 77%
- Active inference for action-unaware agents. Filippo Torresan, Keisuke Suzuki, Ryota Kanai, Manuel Baltieri. 2025. ≈ 77%
- Reward Maximisation through Discrete Active Inference. Lancelot Da Costa, Noor Sajid, Thomas Parr, Karl Friston, Ryan Smith. 2022. ≈ 77%
- Active Inference: A Process Theory. In corpus. 2017. ≈ 71%
- Active Inference, Curiosity and Insight. In corpus. 2017. ≈ 68%
- SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents. In corpus. 2025. ≈ 67%
- Simulators (LessWrong). In corpus. ≈ 65%
- Why Learning Requires Feeling. In corpus. 2026. ≈ 65%
- Learning without neurons in physical systems. In corpus. 2022. ≈ 64%
Cited by (1)
- Active inference on discrete state-spaces: a synthesis
Active inference on discrete state-spaces, formalized as partially observable Markov decision processes (POMDPs) with likelihood matrix A, transition matrix B, and prior D, unifies perception, plannin