2020
212
paper:dacosta-2020-active-inference-discrete

Active inference on discrete state-spaces: a synthesis

TL;DR

Active inference on discrete state-spaces, formalized as partially observable Markov decision processes (POMDPs) with likelihood matrix A, transition matrix B, and initial-state prior D, unifies perception, planning, decision-making, learning, and structure learning under two objective functions: variational free energy F (an upper bound on surprise, minimized during state estimation) and expected free energy G(π) (minimized during policy selection). The synthesis derives neuronal dynamics from first principles via gradient descent on free energy. State estimation then corresponds to a softmax function of accumulated prediction errors, with equations interpretable as membrane potentials mapping to firing rates; these dynamics coincide with variational message passing, while the Bethe approximation yields belief propagation. Policy selection follows Q(π) = σ(−G(π)), where G decomposes into risk (KL divergence between predicted and preferred states) and ambiguity (expected entropy of outcomes given states), formally subsuming KL control, expected utility theory, and optimal Bayesian design as special cases. Learning of A follows Dirichlet parameter accumulation **a** = a + Σ(oτ ⊗ **s**τ), which is formally equivalent to Hebbian plasticity. Structure learning proceeds via Bayesian model reduction (BMR) for simplification and Bayesian model expansion for concept acquisition. The marginal approximation implemented in `spm_MDP_VB_X.m` is identified as the most biologically plausible free energy approximation. The paper argues that biological cognition, from saccadic sampling at ~4 Hz to dopaminergic encoding of the precision γ, can be accounted for as free energy minimization, and that the outstanding challenge is identifying the evidence-maximizing generative model an agent actually employs, which a complete structure learning roadmap would address.
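The policy-selection rule summarized above can be made concrete with a minimal sketch. It assumes a single future time step, strictly positive probabilities (so every logarithm is finite), and preferences expressed over hidden states; the function names `expected_free_energy` and `policy_posterior` are hypothetical and are not taken from the paper or from SPM.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax (the sigma in Q(pi) = sigma(-G(pi)))."""
    x = np.asarray(x, dtype=float)
    e = np.exp(x - x.max())
    return e / e.sum()

def expected_free_energy(A, predicted_states, preferred_states):
    """G = risk + ambiguity for one policy at one future time step.

    A[o, s]          : likelihood P(o | s), columns sum to 1 (assumed > 0)
    predicted_states : Q(s | pi), states expected under the policy
    preferred_states : P(s), the agent's preferred state distribution
    """
    A = np.asarray(A, dtype=float)
    q = np.asarray(predicted_states, dtype=float)
    c = np.asarray(preferred_states, dtype=float)
    # Risk: KL divergence between predicted and preferred states.
    risk = np.sum(q * (np.log(q) - np.log(c)))
    # Ambiguity: expected entropy of outcomes given hidden states.
    entropy_per_state = -np.sum(A * np.log(A), axis=0)
    ambiguity = q @ entropy_per_state
    return risk + ambiguity

def policy_posterior(G, gamma=1.0):
    """Q(pi) = softmax(-gamma * G); gamma = 1 recovers sigma(-G(pi))."""
    return softmax(-gamma * np.asarray(G, dtype=float))

# Illustrative numbers only: two policies, two states, two outcomes.
A = np.array([[0.9, 0.1],
              [0.1, 0.9]])
preferred = np.array([0.8, 0.2])
G = [expected_free_energy(A, [0.8, 0.2], preferred),   # lands on preferred states
     expected_free_energy(A, [0.2, 0.8], preferred)]   # lands away from them
print(policy_posterior(G))
```

Because ambiguity is identical for both policies in this toy setup, the difference in G comes entirely from the risk term, so the first policy receives the higher posterior probability.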

What to take away

  1. Active inference on discrete state-spaces formalizes agents as POMDPs with three core matrices (likelihood A, transition B, and initial-state prior D), whose inversion via variational free energy minimization constitutes perception.
  2. Policy selection is governed by Q(π) = σ(−G(π)), where expected free energy G decomposes into a risk term (KL divergence between predicted and preferred states) and an ambiguity term (expected entropy of outcomes given hidden states), formally unifying goal-directed and exploratory behavior.
  3. The neuronal dynamics for state estimation (a gradient descent producing **s**πτ = σ(v), v̇ = −∇F) are mathematically equivalent to variational message passing, and switching to the Bethe approximation recovers belief propagation, establishing a formal bridge between active inference and message-passing accounts of neural computation.
  4. Learning the likelihood matrix A follows the Dirichlet update **a** = a + Σ(oτ ⊗ **s**τ) accumulated over T timesteps per trial, which is formally identical to Hebbian/associative plasticity and increases agent confidence monotonically with experience.
  5. The expected free energy G subsumes at least five existing theoretical frameworks depending on which uncertainty terms are removed: information gain (no preferences), KL control (no ambiguity), risk-sensitive control (β = 0 Gibbs energy), expected utility theory (no ambiguity or intrinsic value), and the maximum entropy principle (unambiguous world with uninformative priors).
  6. The marginal approximation to free energy, implemented in `spm_MDP_VB_X.m` rather than the mean-field factorization detailed in the paper's main derivations, currently stands as the most biologically plausible approximation because it retains neuronal interpretability while approaching the accuracy of the Bethe approximation.
  7. Bayesian model reduction (BMR) enables post-hoc structure learning by analytically comparing reduced versus full model evidence via log P̃(o) − log P(o) = log E_{P(ν|o)}[P̃(ν)/P(ν)], providing a biologically interpretable mechanism for synaptic pruning and emulating sleep-like consolidation.
  8. The precision parameter γ multiplying G(π) inside the softmax encodes dopaminergic confidence in policy selection, a correspondence supported by fMRI validation of dopaminergic midbrain encoding of expected certainty (Schwartenbeck et al., Cerebral Cortex 25(10):3434–3445, 2015).
  9. An open question the paper raises is how biological agents tractably search deep policy trees: while Occam-window pruning reduces evaluation cost, it cannot scale to long temporal horizons, and hierarchical (semi-Markovian) generative models with nested timescales are proposed but not fully characterized as a solution.
  10. To replicate the perception update, a researcher can implement equations (8)–(9): compute free energy gradient ∇F per policy using matrices A, B, D and observed outcomes, accumulate in a leaky integrator v, and pass through softmax to obtain posterior state beliefs **s**πτ, iterating within each observation epoch at a timescale faster than the ~4 Hz saccadic sampling rate.
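The perception update described in the last item can be sketched in a heavily reduced form. The sketch below assumes a single time step, a single hidden-state factor, and no policy dependence, so the transition matrix B and the messages from neighbouring time steps are omitted; it is therefore not a transcription of the paper's equations (8)–(9). The function name `infer_state`, the step size, and the iteration count are hypothetical choices, and A and D are assumed strictly positive.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax mapping 'membrane potential' v to beliefs s."""
    e = np.exp(x - x.max())
    return e / e.sum()

def infer_state(A, D, obs_index, n_iter=64, step=0.25):
    """Gradient-descent state estimation for one observation.

    A[o, s]   : likelihood P(o | s) (assumed strictly positive)
    D[s]      : prior over the initial hidden state (assumed strictly positive)
    obs_index : index of the observed outcome
    """
    log_lik = np.log(A[obs_index, :])
    log_prior = np.log(D)
    v = log_prior.copy()                      # start the integrator at the prior
    for _ in range(n_iter):
        s = softmax(v)
        # Prediction error: negative free-energy gradient (up to a constant
        # that the softmax ignores).
        err = log_lik + log_prior - np.log(s)
        v = v + step * err                    # accumulate the error in v
    return softmax(v)

# Illustrative numbers only.
A = np.array([[0.8, 0.3],
              [0.2, 0.7]])
D = np.array([0.5, 0.5])
print(infer_state(A, D, obs_index=0))
```

In this reduced setting the fixed point of the iteration is the exact Bayesian posterior, proportional to A[o, :] * D, which gives a simple way to check the dynamics before adding transition messages and policy dependence.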

Peer brief — for seminar discussion

Da Costa and colleagues (2020, Journal of Mathematical Psychology) provide a complete mathematical synthesis of active inference on discrete state-space generative models, specifically POMDPs parameterized by likelihood matrix A, transition matrix B, and initial-state prior D. They derive the full process theory from first principles rather than presenting it as an informal collection of update rules. The paper introduces no single novel algorithm; its contribution is the consolidated derivation itself, which it calls the discrete-state active inference process theory and which traces every update equation from the variational free energy functional through to biologically interpretable neuronal dynamics. An alternative synthesis strategy would have been to use the Bethe free energy approximation throughout, rather than the structured mean-field factorization in Equation 4. The authors acknowledge that this would yield belief propagation dynamics and is arguably more accurate, but they use mean-field for didactic clarity while noting that `spm_MDP_VB_X.m` already uses the marginal approximation as a compromise.
The load-bearing finding is a chain of equivalences: (1) state estimation via gradient descent on variational free energy F produces dynamics mathematically identical to variational message passing; (2) policy selection via Q(π) = σ(−G(π)) produces a quantity G that decomposes into risk and ambiguity, subsuming KL control, expected utility, intrinsic motivation, and optimal Bayesian design as special cases; and (3) learning the A matrix via the Dirichlet accumulation rule **a** = a + Σ(oτ ⊗ **s**τ) over T timesteps is formally identical to Hebbian plasticity. The precision parameter γ scaling G(π) maps to dopaminergic firing, a correspondence supported empirically by Schwartenbeck et al. (Cerebral Cortex 25(10):3434–3445, 2015).
Visual saccadic sampling occurs at approximately 4 Hz, and the paper argues that faster within-timestep neuronal dynamics (consistent with gamma/beta bursts observed in working memory studies) implement the gradient descent in peristimulus time. The implied prediction is that a complete structure learning roadmap, combining Bayesian model reduction for pruning and Bayesian model expansion for concept acquisition, would identify the evidence-maximizing generative model entailed by any biological agent purely from behavioral data, thereby enabling accurate in-silico replication of that agent's electrophysiology.
A critical reader would push back most directly on the biological plausibility claim. The paper asserts that the marginal approximation in `spm_MDP_VB_X.m` is the most biologically plausible free energy scheme, but this claim rests on face validity (synthesized ERP responses resembling mismatch negativity, theta-gamma coupling, etc.) rather than rigorous quantitative comparison against empirical neural data with competing models held constant. The paper acknowledges that Bayesian model comparison across the alternative free energy approximations (mean-field, Bethe, and marginal) using actual electrophysiological recordings has not been performed. This makes the plausibility argument largely circular: the framework generates signals that look like the data it was designed to explain, without a pre-registered or out-of-sample test against, say, dynamic causal modeling fits or reinforcement learning baselines on the same datasets.
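The Bayesian model reduction step mentioned in the brief can be illustrated for the simplest case. The sketch below specializes the general identity log P̃(o) − log P(o) = log E_{P(ν|o)}[P̃(ν)/P(ν)] to a single Dirichlet-distributed parameter vector, where the expectation has a closed form in terms of multivariate beta functions. That specialization, the function names, and the example counts are assumptions made for illustration; they are not copied from the paper or from SPM.

```python
from math import lgamma, log

def log_beta(alpha):
    """Log of the multivariate beta function B(alpha)."""
    return sum(lgamma(x) for x in alpha) - lgamma(sum(alpha))

def bmr_log_evidence_change(a_prior, a_post, a_reduced_prior):
    """log P~(o) - log P(o) for Dirichlet priors and posterior.

    a_prior         : concentration parameters of the full prior
    a_post          : concentration parameters of the full posterior
    a_reduced_prior : concentration parameters of the reduced prior
    A positive value means the reduced model has higher evidence.
    """
    # Posterior that the reduced model would have reached on the same data.
    a_reduced_post = [p + r - f
                      for p, r, f in zip(a_post, a_reduced_prior, a_prior)]
    return (log_beta(a_reduced_post) + log_beta(a_prior)
            - log_beta(a_post) - log_beta(a_reduced_prior))

# Illustrative counts only: a flat full prior, two observations of outcome 0,
# and a reduced prior that already favours outcome 0.
delta = bmr_log_evidence_change([1.0, 1.0], [3.0, 1.0], [2.0, 1.0])
print(delta, log(1.5))
```

The appeal of this form is that the reduced model is scored from the full model's prior and posterior alone, without refitting to the data, which is what makes post-hoc pruning cheap.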

Methods (6)

  • Belief Propagation
    Message-passing scheme recovered when the Bethe approximation replaces the mean-field factorization; the mean-field gradient descent on free energy yields variational message passing instead.
  • Dirichlet Parameter Accumulation
    Learning rule for updating Dirichlet beliefs about likelihood matrix A by adding outer products of observations and state estimates.
  • Hebbian Plasticity Update
    Synaptic update rule that is formally identical to associative learning; used for learning A.
  • Mean-Field Approximation
    Variational technique used in active inference to tractably compute posterior beliefs.
  • Occam Window Pruning
    Pruning policy trees by discarding policies whose expected free energy exceeds that of the best by a threshold.
  • Variational Bayes
    Mathematical framework for approximating posterior beliefs; converts exact Bayesian inference into optimization.
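Two of the methods listed above, Dirichlet parameter accumulation and Occam window pruning, are simple enough to sketch directly from their descriptions. The function names and the threshold value of 3.0 are hypothetical choices for illustration, and the code is a sketch under those assumptions rather than the implementation in `spm_MDP_VB_X.m`.

```python
import numpy as np

def update_likelihood_counts(a_prior, observations, state_beliefs):
    """Dirichlet accumulation: a_post = a_prior + sum_t outer(o_t, s_t).

    observations  : sequence of one-hot outcome vectors o_t
    state_beliefs : sequence of posterior state beliefs s_t
    """
    a = np.array(a_prior, dtype=float)        # copy so the prior is untouched
    for o, s in zip(observations, state_beliefs):
        a += np.outer(o, s)                   # Hebbian-like co-occurrence count
    return a

def expected_likelihood(a):
    """Expected A matrix: normalize the counts over outcomes for each state."""
    return a / a.sum(axis=0, keepdims=True)

def occam_window(G, threshold=3.0):
    """Keep policies whose expected free energy is within `threshold` of the best.

    Returns a boolean mask; False marks policies that would be pruned.
    """
    G = np.asarray(G, dtype=float)
    return (G - G.min()) <= threshold

# Illustrative numbers only: outcome 0 is seen twice while state 0 is believed.
a_prior = np.ones((2, 2))
obs = [np.array([1.0, 0.0]), np.array([1.0, 0.0])]
beliefs = [np.array([1.0, 0.0]), np.array([1.0, 0.0])]
a_post = update_likelihood_counts(a_prior, obs, beliefs)
print(expected_likelihood(a_post))
print(occam_window([1.0, 2.0, 6.0]))
```

Counts only ever increase under this rule, so the column for a frequently visited state becomes more sharply peaked with experience, which is the sense in which the agent's confidence in A grows.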

Frameworks (14)

  • Active Inference
    Foundational framework by Karl Friston; the paper provides a complete mathematical synthesis of it for discrete state-space generative models.
  • Bayesian Brain Hypothesis
    Normative theory proposing biological systems perform approximate Bayesian inference through free energy minimization.
  • Bayesian Decision Theory
    Framework for maximizing expected utility under uncertainty.
  • Bethe Approximation
    Free energy approximation using two-node marginals.
  • Expected Utility Theory
    Economic framework for decision-making under risk.
  • Free Energy Principle
    A foundational variational principle from statistical physics that formalizes how self-organizing systems maintain structural integrity and adapt to their environment by minimizing free energy—a mathematical bound on surprise or prediction error. Originally developed by Karl Friston, the framework unifies action, perception, and learning as processes of active inference, where systems both update internal models of the world and act upon it to reduce the divergence between predictions and observations.
  • KL Control / Risk-Sensitive Control
    Control approach that minimizes KL divergence to a target distribution; underlies risk term in expected free energy.
  • Marginal Free Energy Approximation
    Biologically plausible approximation lying between mean-field and Bethe approximations.
  • Optimal Bayesian Design
    Selecting actions to maximize expected information gain.
  • Optimal Control Theory
    Design of controllers to minimize a cost function.
  • Partially Observable Markov Decision Process (POMDP)
    Modeling framework for discrete state-space decision-making under uncertainty, used as generative model in active inference.
  • Predictive Processing
  • Reinforcement Learning
    Alternative framework for agent behavior; based on reward maximization rather than free energy minimization.
  • Variational Message Passing
    Algorithm for approximate Bayesian inference based on mean-field approximation.

Original abstract

Active inference is a normative principle underwriting perception, action, planning, decision-making and learning in biological or artificial agents. From its inception, its associated process theory has grown to incorporate complex generative models, enabling simulation of a wide range of complex behaviours. Due to successive developments in active inference, it is often difficult to see how its underlying principle relates to process theories and practical implementation. In this paper, we try to bridge this gap by providing a complete mathematical synthesis of active inference on discrete state-space models. This technical summary provides an overview of the theory, derives neuronal dynamics from first principles and relates this dynamics to biological processes. Furthermore, this paper provides a fundamental building block needed to understand active inference for mixed generative models; allowing continuous sensations to inform discrete representations. This paper may be used as follows: to guide research towards outstanding challenges, a practical guide on how to implement active inference to simulate experimental behaviour, or a pointer towards various in-silico neurophysiological responses that may be used to make empirical predictions.

Cross-corpus bridges (10)

same_concept_as · Nomic cosine

External markdown files that talk about the same concept as this entity.

  • aboutblank_kb
    Active Inference · frameworks/active-inference.md · 0.865
  • aboutblank_kb
    Free Energy Principle And Active Inference · frameworks/free-energy-principle-and-active-inference.md · 0.853
  • aboutblank_kb
    Does the Free Energy Principle adequately explain morphogenesis and pattern formation in biological systems? · questions/does-the-free-energy-principle-adequately-explain-morphogenesis.md · 0.807
  • aboutblank_kb
    Bayesian Inference Model Of Morphogenesis · frameworks/bayesian-inference-model-of-morphogenesis.md · 0.797
  • aboutblank_kb
    Surprise Minimization Framework · frameworks/surprise-minimization-framework.md · 0.794
  • aboutblank_kb
    Free Energy Principle · frameworks/free-energy-principle.md · 0.791
  • aboutblank_kb
    Multi-Level Bayesian Inference · frameworks/multi-level-bayesian-inference.md · 0.789
  • aboutblank_kb
    Bayesian Mechanics · frameworks/bayesian-mechanics.md · 0.788
  • aboutblank_kb
    Steve Frank · thinkers/steve-frank.md · 0.787
  • aboutblank_kb
    Susan Lindquist · thinkers/susan-lindquist.md · 0.781