Variational Algorithms for Approximate Bayesian Inference

By Matthew J. Beal

Original abstract

The Bayesian framework for machine learning allows for the incorporation of prior knowledge in a coherent way, avoids overfitting problems, and provides a principled basis for selecting between alternative models. Unfortunately the computations required are usually intractable. This thesis presents a unified variational Bayesian (VB) framework which approximates these computations in models with latent variables using a lower bound on the marginal likelihood.

Chapter 1 presents background material on Bayesian inference, graphical models, and propagation algorithms. Chapter 2 forms the theoretical core of the thesis, generalising the expectation-maximisation (EM) algorithm for learning maximum likelihood parameters to the VB EM algorithm which integrates over model parameters. The algorithm is then specialised to the large family of conjugate-exponential (CE) graphical models, and several theorems are presented to pave the road for automated VB derivation procedures in both directed and undirected graphs (Bayesian and Markov networks, respectively).

Chapters 3–5 derive and apply the VB EM algorithm to three commonly-used and important models: mixtures of factor analysers, linear dynamical systems, and hidden Markov models. It is shown how model selection tasks such as determining the dimensionality, cardinality, or number of variables are possible using VB approximations. Also explored are methods for combining sampling procedures with variational approximations, to estimate the tightness of VB bounds and to obtain more effective sampling algorithms. Chapter 6 applies VB learning to a long-standing problem of scoring discrete-variable directed acyclic graphs, and compares the performance to annealed importance sampling amongst other methods. Throughout, the VB approximation is compared to other methods including sampling, Cheeseman-Stutz, and asymptotic approximations such as BIC. The thesis concludes with a discussion of evolving directions for model selection including infinite models and alternative approximations to the marginal likelihood.
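
The lower bound and the VB EM iteration mentioned in the abstract can be sketched in their standard textbook form. The notation is an assumption introduced here for illustration and does not appear in the abstract: y denotes observed data, x latent variables, \theta parameters, m the model, and q_x, q_\theta the factors of a factorised approximate posterior.

```latex
% Assumed notation (not from the abstract): y = data, x = latent variables,
% \theta = parameters, m = model, q_x and q_\theta = variational factors.
\begin{align}
% Jensen's inequality gives a lower bound on the log marginal likelihood
\ln p(y \mid m)
  &= \ln \int p(y, x, \theta \mid m)\, dx\, d\theta \notag \\
  &\ge \int q_x(x)\, q_\theta(\theta)\,
       \ln \frac{p(y, x, \theta \mid m)}{q_x(x)\, q_\theta(\theta)}\, dx\, d\theta
   \;\equiv\; \mathcal{F}_m(q_x, q_\theta, y), \\
% VB E-step: update the latent-variable factor with q_\theta held fixed
q_x^{(t+1)}(x)
  &\propto \exp\!\left[ \int q_\theta^{(t)}(\theta)\,
       \ln p(x, y \mid \theta, m)\, d\theta \right], \\
% VB M-step: update the parameter factor with q_x held fixed
q_\theta^{(t+1)}(\theta)
  &\propto p(\theta \mid m)\,
       \exp\!\left[ \int q_x^{(t+1)}(x)\,
       \ln p(x, y \mid \theta, m)\, dx \right], \\
% The slack in the bound is a KL divergence to the true posterior
\ln p(y \mid m) - \mathcal{F}_m(q_x, q_\theta, y)
  &= \mathrm{KL}\!\left[ q_x(x)\, q_\theta(\theta)
       \,\middle\|\, p(x, \theta \mid y, m) \right] \;\ge\; 0 .
\end{align}
```

Under these assumptions each update can only increase \mathcal{F}_m or leave it unchanged, and the bound is tight exactly when the factorised approximation equals the true joint posterior. Replacing q_\theta with a point mass recovers the ordinary EM algorithm, which is the sense in which VB EM generalises it.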

Related work— refs + corpus + external arXiv


Bayesian score calibration for approximate models
David J Warne, David J Nott, Christopher Drovandi, Joshua J Bon
2026
≈ 75%
Bayesian Matrix Decomposition and Applications
Jun Lu
2026
≈ 74%
Generative Bayesian Inference with GANs
Veronika Ročková, Yuexi Wang
2026
≈ 74%
Derivation of the Variational Bayes Equations
Alianna J. Maren
2024
≈ 73%
Permutation-based Inference for Variational Learning of Directed Acyclic Graphs
Pantelis Elinas, He Zhao, Maurizio Filippone, Vassili Kitsios, Terry O'Kane, Edwin V. Bonilla
2026
≈ 72%
Designing Perceptual Puzzles by Differentiating Probabilistic Programs
Tzu-Mao Li, Joshua Tenenbaum, Jonathan Ragan-Kelley, Kartik Chandra
2022
≈ 72%
A Framework for Improving the Reliability of Black-box Variational Inference
Michael Riis Andersen, Aki Vehtari, Jonathan H. Huggins, Manushi Welandawe
2025
≈ 72%
Active Inference for Binary Symmetric Hidden Markov Models
Armen E. Allahverdyan and Aram Galstyan
2015
≈ 71%
A Framework for Variational Inference of Lightweight Bayesian Neural Networks with Heteroscedastic Uncertainties
Ryan Brown, Michael Merritt, Samuel Park, Delsin Menolascino, Mark A. Peot, David J. Schodt
2026
≈ 71%
Approximate Bayesian inference as a gauge theory
Biswa Sengupta and Karl Friston
2017
≈ 71%
Riemannian Laplace Approximation with the Fisher Metric
Marcelo Hartmann, Bernardo Williams, Mark Girolami, Arto Klami, Hanlin Yu
2026
≈ 71%
The Bounded Bayesian
Kathryn Blackmond Laskey
2013
≈ 71%
Compositional Active Inference II: Polynomial Dynamics. Approximate Inference Doctrines
Toby St. Clere Smithe
2022
≈ 70%
Uncertainty Quantification and Propagation in Surrogate-based Bayesian Inference
Javier Enrique Aguilar, Anneli Guthke, Paul-Christian Bürkner, Philipp Reiser
2026
≈ 70%
A Review of Bayesian Uncertainty Quantification in Deep Probabilistic Image Segmentation
R.J.G. van Sloun, C.G.A. Viviers, P.H.N. de With, F. van der Sommen, M.M.A. Valiuddin
2026
≈ 70%
Active Inference, Curiosity and Insight
in corpus
2017
≈ 68%
Active inference on discrete state-spaces: a synthesis
in corpus
2020
≈ 68%
Active Inference: A Process Theory
in corpus
2017
≈ 67%
A Free energy principle for the brain (lecture summary)
in corpus
2008
≈ 63%
Active inference: demystified and compared
in corpus
2021
≈ 63%
Differentiable Logic Cellular Automata: From Game of Life to pattern generation with learned recurrent circuits
in corpus
≈ 62%
A tale of two densities: active inference is enactive inference
in corpus
2020
≈ 61%
Interpreting Language Model Parameters
in corpus
2026
≈ 59%
The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets
in corpus
2023
≈ 59%
Learning without neurons in physical systems
in corpus
2022
≈ 59%
Addressing divergent representations from causal interventions on neural networks
in corpus
2025
≈ 59%
The biogenic approach to cognition
in corpus
2005
≈ 59%
Mechanistic Knobs in LLMs: Retrieving and Steering High-Order Semantic Features via Sparse Autoencoders
in corpus
2026
≈ 59%
Natural Language Autoencoders Produce Unsupervised Explanations of LLM Activations
in corpus
≈ 59%
Explaining 4.2 million genetic variants with state-of-the-art, interpretable predictions
in corpus
2026
≈ 58%

Cited by (3)

Active Inference, Curiosity and Insight
Minimizing expected variational free energy under a discrete-state Markov decision process generative model is sufficient to produce curiosity, epistemic learning, and insight without any additional m
Active Inference: A Process Theory
A single variational principle—minimizing variational free energy via gradient descent on a Markov decision process (MDP) generative model—is sufficient to derive neuronal dynamics that reproduce, wit
Life as we know it
Any ergodic random dynamical system possessing a Markov blanket will, almost surely, appear to engage in active inference and maintain autopoietic integrity—making biological self-organization not a r