Bayesian Inference
Remember that using Bayes’ Theorem doesn’t make you a Bayesian. Quantifying uncertainty with probability makes you a Bayesian. (Michael Betancourt)
Overview
- Books
- Bayesian Workflow (2020) Andrew Gelman, Aki Vehtari, Daniel Simpson, Charles C. Margossian, Bob Carpenter, Yuling Yao, Lauren Kennedy, Jonah Gabry, Paul-Christian Bürkner, Martin Modrák
Markov chain Monte Carlo (MCMC)
Stochastic Gradient MCMC (SG-MCMC)
- SGLD Stochastic Gradient Langevin Dynamics
- Bayesian Learning via Stochastic Gradient Langevin Dynamics (2011) Max Welling, Yee Whye Teh — Shows that adding calibrated noise to stochastic gradient descent produces asymptotically exact posterior samples, enabling Bayesian inference to scale to large datasets for the first time without full-batch MCMC.
- SGHMC Stochastic Gradient Hamiltonian Monte Carlo
- Stochastic Gradient Hamiltonian Monte Carlo (2014) Tianqi Chen, Emily B. Fox, Carlos Guestrin — Extends SGLD by incorporating momentum (as in HMC above), adding a friction term to correct for gradient noise and improving mixing over the random-walk behavior of SGLD.
- A Complete Recipe for Stochastic Gradient MCMC (2015) Yi-An Ma, Tianqi Chen, Emily B. Fox — Provides a unifying framework showing that SGLD, SGHMC, and other SG-MCMC variants are all special cases of continuous Markov processes parameterized by two matrices, and introduces new samplers like SGRHMC within this framework.
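The SGLD update is simple enough to sketch in a few lines. Below is a minimal, illustrative implementation on a toy Gaussian-mean model (the model, constants, and function names are our own, not from the papers): a minibatch gradient of the log-posterior, a step of size ε/2, plus injected N(0, ε) noise.

```python
import math
import random

random.seed(0)

# toy model: y_i ~ N(theta, 1) with prior theta ~ N(0, 10)
N = 1000
data = [random.gauss(2.0, 1.0) for _ in range(N)]

def grad_log_prior(theta, var0=10.0):
    return -theta / var0                 # d/dtheta log N(theta | 0, var0)

def grad_log_lik(theta, y):
    return y - theta                     # d/dtheta log N(y | theta, 1)

def sgld(n_iters=20000, batch=32, eps=1e-4):
    """SGLD: Langevin steps with minibatch gradients.  The injected
    N(0, eps) noise is calibrated to the eps/2 gradient step, so the
    chain targets the posterior as eps -> 0 (Welling & Teh, 2011)."""
    theta, samples = 0.0, []
    for t in range(n_iters):
        mb = random.sample(data, batch)
        grad = grad_log_prior(theta) + (N / batch) * sum(
            grad_log_lik(theta, y) for y in mb)
        theta += 0.5 * eps * grad + random.gauss(0.0, math.sqrt(eps))
        if t >= n_iters // 2:            # discard burn-in
            samples.append(theta)
    return samples

samples = sgld()
post_mean = sum(samples) / len(samples)  # should sit near the data mean (~2)
```

Note the N/batch rescaling: the minibatch gradient is an unbiased estimate of the full-data gradient, which is what makes the small-step limit exact.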
Sequential Monte Carlo (SMC)
- Sequential Monte Carlo Methods in Practice (2001) Arnaud Doucet, Nando de Freitas, Neil Gordon (Editors) — The foundational reference on particle filters: propagates weighted samples through a sequence of distributions, enabling online inference in state-space models where MCMC would require re-running from scratch.
- Sequential Monte Carlo Samplers (2006) Pierre Del Moral, Arnaud Doucet, Ajay Jasra — Generalizes SMC beyond filtering to sample from arbitrary sequences of static distributions, making it applicable to Bayesian model comparison and tempered posteriors — the key theoretical bridge between particle filters and general Bayesian computation.
- An Introduction to Sequential Monte Carlo (2020) Nicolas Chopin, Omiros Papaspiliopoulos — Modern textbook treatment covering both the theory (Feynman-Kac formalism) and practice of SMC, including waste-free SMC and connections to tempering strategies used in modern samplers.
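The propagate/weight/resample loop of the bootstrap particle filter can be sketched directly. The linear-Gaussian state-space model and all constants below are illustrative:

```python
import math
import random

random.seed(1)
T, n_part = 50, 500
a, q, r = 0.9, 0.5, 0.5          # transition coef, process / obs noise std

# simulate a latent trajectory and noisy observations
x, true_xs, ys = 0.0, [], []
for _ in range(T):
    x = a * x + random.gauss(0, q)
    true_xs.append(x)
    ys.append(x + random.gauss(0, r))

def bootstrap_filter(ys):
    particles = [random.gauss(0, 1) for _ in range(n_part)]
    means = []
    for y in ys:
        # 1. propagate particles through the transition kernel
        particles = [a * p + random.gauss(0, q) for p in particles]
        # 2. weight by the observation likelihood N(y | x, r^2)
        w = [math.exp(-0.5 * ((y - p) / r) ** 2) for p in particles]
        tot = sum(w)
        w = [wi / tot for wi in w]
        means.append(sum(wi * p for wi, p in zip(w, particles)))
        # 3. multinomial resampling to fight weight degeneracy
        particles = random.choices(particles, weights=w, k=n_part)
    return means

filt_means = bootstrap_filter(ys)
rmse = math.sqrt(sum((m - x) ** 2 for m, x in zip(filt_means, true_xs)) / T)
```

Each observation is absorbed online, in contrast to MCMC, which would have to be re-run from scratch as data arrive.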
Approximate Bayesian Computation (ABC)
- Approximate Bayesian Computation in Population Genetics (2002) Mark A. Beaumont, Wenyang Zhang, David J. Balding
- Markov chain Monte Carlo without likelihoods (2003) Paul Marjoram, John Molitor, Vincent Plagnol, Simon Tavaré
- Sequential Monte Carlo without likelihoods (2007) S. A. Sisson, Y. Fan, Mark M. Tanaka
- Non-linear regression models for Approximate Bayesian Computation (2009) Michael G.B. Blum, Olivier François
- Likelihood-free Markov chain Monte Carlo (2010) Scott A. Sisson, Yanan Fan
- Approximate Bayesian Computation (ABC) in practice (2010) Katalin Csilléry, Michael G.B. Blum, Oscar E. Gaggiotti, Olivier François
- Hamiltonian ABC (2015) Edward Meeds, Robbert Leenders, Max Welling
- Reliable ABC model choice via random forests (2016) Pierre Pudlo, Jean-Michel Marin, Arnaud Estoup, Jean-Marie Cornuet, Mathieu Gautier, Christian P. Robert
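The simplest member of this family, rejection ABC, fits in a dozen lines. A toy sketch with an illustrative Gaussian-mean "simulator" and the sample mean as summary statistic:

```python
import random
import statistics

random.seed(2)

# observed data from an unknown Gaussian mean; the "simulator" is the
# same sampling process, and the sample mean is the summary statistic
n_obs = 50
obs = [random.gauss(1.5, 1.0) for _ in range(n_obs)]
s_obs = statistics.mean(obs)

def rejection_abc(n_prop=10000, tol=0.1):
    """Keep a prior draw whenever simulated and observed summaries
    agree to within the tolerance; no likelihood is ever evaluated."""
    accepted = []
    for _ in range(n_prop):
        theta = random.uniform(-5, 5)            # draw from the prior
        sim = [random.gauss(theta, 1.0) for _ in range(n_obs)]
        if abs(statistics.mean(sim) - s_obs) < tol:
            accepted.append(theta)
    return accepted

post = rejection_abc()
abc_mean = statistics.mean(post)   # close to the sample mean of obs
```

The MCMC and SMC variants above replace the blind prior draws with smarter proposals, and the regression-adjustment papers correct for the nonzero tolerance.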
Variational Inference (VI)
- Bayesian parameter estimation via variational methods (1999) T.S. Jaakkola, M.I. Jordan
- The Variational Gaussian Approximation Revisited (2009) Manfred Opper, Cedric Archambeau
- Doubly Stochastic Variational Bayes for non-Conjugate Inference (2014) Michalis K. Titsias, Miguel Lázaro-Gredilla
- Variational Inference: A Review for Statisticians (2017) David M. Blei, Alp Kucukelbir, Jon D. McAuliffe
- Advances in Variational Inference (2018) Cheng Zhang, Judith Bütepage, Hedvig Kjellström, Stephan Mandt — Comprehensive review organizing the VI landscape into four threads: scalable VI (stochastic optimization), generic VI (non-conjugate models), accurate VI (beyond mean-field, including normalizing flows), and amortized VI (inference networks).
- SVGD Stein Variational Gradient Descent
- Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm (2016) Qiang Liu, Dilin Wang — Deterministic particle-based alternative to both MCMC and parametric VI: iteratively transports a set of particles along a kernelized functional gradient of the KL divergence, so the particle set collectively approximates the posterior.
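A minimal sketch of doubly stochastic VI with reparameterization gradients, on a conjugate Gaussian-mean model where the exact posterior is known (model, constants, and variable names are illustrative): the variational family is q(θ) = N(m, e^{2ρ}), and each step uses a single reparameterized sample θ = m + e^ρ ε.

```python
import math
import random

random.seed(3)
n = 50
data = [random.gauss(1.0, 1.0) for _ in range(n)]

def grad_log_joint(theta):
    # d/dtheta [ log N(theta | 0, 1) + sum_i log N(y_i | theta, 1) ]
    return -theta + sum(y - theta for y in data)

# single-sample reparameterization gradients of the ELBO
m, rho, lr = 0.0, 0.0, 0.002
for step in range(5000):
    eps = random.gauss(0, 1)
    s = math.exp(rho)
    g = grad_log_joint(m + s * eps)
    m += lr * g                      # dELBO/dm
    rho += lr * (s * eps * g + 1.0)  # dELBO/drho (entropy term gives +1)

# exact conjugate posterior, for comparison
post_mean = (n / (n + 1)) * (sum(data) / n)
post_var = 1.0 / (n + 1)
```

Because this model is conjugate, the Gaussian family contains the true posterior and (m, e^{2ρ}) should converge to (post_mean, post_var); in non-conjugate models the same code runs unchanged, which is the point of the "doubly stochastic" construction.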
Normalizing Flows for Inference
- Variational Inference with Normalizing Flows (2015) Danilo Rezende, Shakir Mohamed — Introduces the idea of transforming a simple variational posterior through a chain of invertible mappings, breaking free of the mean-field assumption that limits standard VI and enabling arbitrarily complex approximate posteriors.
- Normalizing Flows for Probabilistic Modeling and Inference (2021) George Papamakarios, Eric Nalisnick, Danilo Jimenez Rezende, Shakir Mohamed, Balaji Lakshminarayanan — Definitive review of flow architectures (coupling, autoregressive, residual), their expressive power, and applications spanning density estimation, variational inference, and simulation-based inference.
- Model-Informed Flows for Bayesian Inference (2025) Joohwan Ko, Justin Domke — Proves that Variationally Inferred Parameters (VIP) can be represented exactly as autoregressive flows augmented with the model’s prior, then exploits this connection to design Model-Informed Flows that deliver tighter posteriors for hierarchical Bayesian models.
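The mechanism shared by all of these papers is the change-of-variables formula: push a base sample through an invertible map and correct the density by the log-Jacobian. A one-layer, one-dimensional sketch (the specific map x = z + a·tanh(z) is our own illustrative choice, invertible for a > -1):

```python
import math
import random

random.seed(4)

def log_normal(z):
    # log density of the standard normal base distribution
    return -0.5 * z * z - 0.5 * math.log(2 * math.pi)

def flow_forward(z, a=0.5):
    """One invertible layer x = z + a*tanh(z) (monotone for a > -1),
    returning x and log|dx/dz| for the change of variables."""
    x = z + a * math.tanh(z)
    logdet = math.log(1 + a * (1 - math.tanh(z) ** 2))
    return x, logdet

# sample from the flow and evaluate the density of each sample:
# log p_x(x) = log p_z(z) - log|dx/dz|
samples = []
for _ in range(10000):
    z = random.gauss(0, 1)
    x, logdet = flow_forward(z)
    samples.append((x, log_normal(z) - logdet))
```

A trained flow stacks many such layers with learned parameters; for variational inference, the tracked log-density is exactly what the ELBO's entropy term needs.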
Expectation Propagation (EP)
- Expectation Propagation for Approximate Bayesian Inference (2001) Thomas P. Minka — Proposes a deterministic alternative to MCMC and VI that iteratively refines local likelihood approximations by moment matching, unifying assumed-density filtering and loopy belief propagation; often more accurate than the Laplace approximation (below) and variational Bayes at comparable cost.
- Expectation Propagation as a Way of Life (2020) Aki Vehtari, Andrew Gelman, Tuomas Sivula, Pasi Jylänki, Dustin Tran, Swupnil Sahai, Paul Blomstedt, John P. Cunningham, David Schiminovich, Christian Robert — Reframes EP as a framework for distributed Bayesian inference: data partitions communicate through iteratively refined approximate likelihoods, enabling parallelism while preserving information sharing — addressing scalability limits of both standard EP and MCMC.
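The operation EP iterates is moment-matching projection: multiply the current Gaussian by one site (likelihood factor) and project the result back onto a Gaussian. A single such step with an indicator site 1[θ > 0] (a toy choice; the formulas are the standard truncated-normal moments), checked against Monte Carlo:

```python
import math
import random

def phi(x):   # standard normal pdf
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def Phi(x):   # standard normal cdf via erf
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def match_moments(m, v):
    """Moment-match N(m, v) * 1[theta > 0] with a Gaussian, the
    projection step EP applies to one site at a time."""
    alpha = -m / math.sqrt(v)
    lam = phi(alpha) / (1 - Phi(alpha))
    new_m = m + math.sqrt(v) * lam
    new_v = v * (1 - lam * (lam - alpha))
    return new_m, new_v

m_ep, v_ep = match_moments(0.5, 2.0)

# Monte Carlo moments of the same truncated distribution, as a check
random.seed(5)
draws = [z for z in (random.gauss(0.5, math.sqrt(2.0)) for _ in range(200000))
         if z > 0]
mc_m = sum(draws) / len(draws)
```

Full EP cycles this projection over all sites, each time dividing out the old site approximation first; VI instead minimizes the reverse KL divergence, which is why EP tends to cover the posterior mass rather than lock onto one mode.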
Laplace Approximation
- A Practical Bayesian Framework for Backpropagation Networks (1992) David J.C. MacKay — Pioneering work applying a second-order Taylor expansion (Laplace approximation) around the MAP estimate to approximate the posterior over neural network weights, enabling model comparison via the Bayesian evidence — the simplest deterministic approach to Bayesian neural networks.
- Laplace Redux — Effortless Bayesian Deep Learning (2021) Erik Daxberger, Agustinus Kristiadi, Alexander Immer, Runa Eschenhagen, Matthias Bauer, Philipp Hennig — Revives MacKay’s Laplace approach for modern deep networks with scalable Kronecker-factored and last-layer approximations; shows it is competitive with MC Dropout and ensembles (see Bayesian Deep Learning below) at a fraction of the cost, and provides the laplace-torch library.
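MacKay's recipe in one dimension is just: find the MAP with Newton's method, then use the inverse negative Hessian of the log posterior as the variance. A sketch on a toy one-parameter logistic model (the data and model here are illustrative):

```python
import math

# toy model: y_i ~ Bernoulli(sigmoid(theta * x_i)), prior theta ~ N(0, 1)
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]

def sigmoid(t):
    return 1 / (1 + math.exp(-t))

def grad_hess(theta):
    g, h = -theta, -1.0           # contributions of the N(0, 1) prior
    for x, y in zip(xs, ys):
        p = sigmoid(theta * x)
        g += (y - p) * x          # d log-lik / d theta
        h -= p * (1 - p) * x * x  # d^2 log-lik / d theta^2
    return g, h

# Newton iterations to the MAP; the log posterior is concave here
theta = 0.0
for _ in range(50):
    g, h = grad_hess(theta)
    theta -= g / h

map_est = theta
_, h = grad_hess(map_est)
lap_var = -1.0 / h                # Laplace approximation: N(map_est, lap_var)
```

For a network with millions of weights the Hessian is intractable, which is exactly what the Kronecker-factored and last-layer approximations in Laplace Redux address.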
Simulation-Based Inference (SBI)
- The Frontier of Simulation-Based Inference (2020) Kyle Cranmer, Johann Brehmer, Gilles Louppe — Landmark review of the shift from classical ABC methods (above) to neural network-based likelihood-free inference; surveys how neural density estimators, classifiers, and ratio estimators replace the rejection/tolerance mechanisms of ABC with learned surrogates.
- NPE Neural Posterior Estimation
- Fast ε-free Inference of Simulation Models with Bayesian Conditional Density Estimation (2016) George Papamakarios, Iain Murray — Trains a conditional neural density estimator (a mixture density network) to map simulated data directly to posterior parameters, removing the tolerance and rejection steps of ABC; the basis of later sequential NPE variants.
- NLE Neural Likelihood Estimation
- Sequential Neural Likelihood (2019) George Papamakarios, David Sterratt, Iain Murray — Instead of learning the posterior directly (as NPE does), learns a neural surrogate of the likelihood using autoregressive flows, then plugs it into standard MCMC — more robust to model misspecification and composable with different priors without retraining.
- NRE Neural Ratio Estimation
- Approximating Likelihood Ratios with Calibrated Discriminative Classifiers (2015) Kyle Cranmer, Juan Pavez, Gilles Louppe — Trains a classifier to distinguish parameter-data pairs, whose output directly estimates the likelihood ratio — avoids density estimation entirely, requiring only a binary classification objective, and is well-suited to hypothesis testing.
- Benchmarking Simulation-Based Inference (2021) Jan-Matthis Lueckmann, Jan Boelts, David Greenberg, Pedro Goncalves, Jakob Macke — Systematic comparison of NPE, NLE, NRE and classical ABC on standardized tasks; finds that neural methods consistently outperform ABC but no single algorithm dominates, and that sequential variants improve sample efficiency.
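The NRE idea reduces to binary classification: train a classifier to distinguish joint pairs (θ, x) from pairs with x shuffled, and its logit estimates log p(x|θ) − log p(x). A pure-Python sketch with an illustrative Gaussian simulator and a hand-rolled logistic regression on quadratic features:

```python
import math
import random

random.seed(6)

def simulate(n):
    thetas = [random.gauss(0, 1) for _ in range(n)]
    xs = [random.gauss(t, 1) for t in thetas]      # simulator: x ~ N(theta, 1)
    return thetas, xs

def features(t, x):
    return [1.0, t, x, t * x, t * t, x * x]

def sigmoid(z):
    return 1 / (1 + math.exp(-max(-30.0, min(30.0, z))))

# training set: joint pairs (label 1) vs pairs with x shuffled (label 0)
n = 2000
thetas, xs = simulate(n)
shuffled = xs[:]
random.shuffle(shuffled)
rows = [(features(t, x), 1) for t, x in zip(thetas, xs)] + \
       [(features(t, x), 0) for t, x in zip(thetas, shuffled)]

w = [0.0] * 6
lr = 0.01
for epoch in range(150):                           # plain SGD on log loss
    random.shuffle(rows)
    for f, y in rows:
        p = sigmoid(sum(wi * fi for wi, fi in zip(w, f)))
        for i in range(6):
            w[i] += lr * (y - p) * f[i]

def log_ratio(t, x):
    # classifier logit approximates log p(x | theta) - log p(x)
    return sum(wi * fi for wi, fi in zip(w, features(t, x)))
```

The learned `log_ratio` can be dropped into MCMC in place of the intractable likelihood, which is what makes NRE attractive when only a simulator is available.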
Diffusion Models for Posterior Sampling
- Score-Based Generative Modeling through Stochastic Differential Equations (2021) Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, Ben Poole — Unifies score matching and diffusion models as continuous-time SDEs that gradually corrupt data into noise and reverse the process via learned score functions; provides the theoretical foundation for using diffusion models as priors in Bayesian inverse problems.
- Diffusion Posterior Sampling for General Noisy Inverse Problems (2023) Hyungjin Chung, Jeongsol Kim, Michael T. McCann, Marc L. Klasky, Jong Chul Ye — Combines a pretrained diffusion prior (from above) with a measurement likelihood to sample from the Bayesian posterior for inverse problems, using manifold-constrained gradients to handle both linear and nonlinear forward models with noise.
- Score-based diffusion models for diffuse optical tomography with uncertainty quantification (2026) Fabian Schneider, Meghdoot Mozumder, Konstantin Tamarov, Leila Taghizadeh, Tanja Tarvainen, Tapio Helin, Duc-Lam Duong — Applies the diffusion posterior sampling framework to medical imaging, introducing a regularization strategy that blends learned and model-based scores to prevent overfitting; demonstrates calibrated uncertainty estimates with lower variance than classical Bayesian methods.
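Stripped to its simplest limit, posterior sampling with scores is Langevin dynamics driven by the sum of a prior score and a likelihood score. In the sketch below both scores are analytic for a toy conjugate problem (in the papers above, a trained score network stands in for `prior_score` and the dynamics are annealed across noise levels); everything here is illustrative:

```python
import math
import random

random.seed(7)
y_obs, sigma_y = 1.0, 0.5        # observation y = theta + N(0, sigma_y^2)

def prior_score(theta):
    # score of the N(0, 1) prior; a learned score model in practice
    return -theta

def lik_score(theta):
    # d/dtheta log N(y_obs | theta, sigma_y^2): the DPS-style guidance term
    return (y_obs - theta) / sigma_y ** 2

def langevin(n_steps=20000, eps=1e-2):
    theta, out = 0.0, []
    for t in range(n_steps):
        score = prior_score(theta) + lik_score(theta)
        theta += 0.5 * eps * score + math.sqrt(eps) * random.gauss(0, 1)
        if t >= n_steps // 2:    # discard burn-in
            out.append(theta)
    return out

samples = langevin()
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
# exact posterior for this conjugate toy: N(0.8, 0.2)
```

The hard part in real inverse problems is that the likelihood score must be evaluated on noisy intermediate iterates, which is what the manifold-constrained gradients of Chung et al. approximate.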
Bayesian Deep Learning
- Weight Uncertainty in Neural Networks (2015) Charles Blundell, Julien Cornebise, Koray Kavukcuoglu, Daan Wierstra — Introduces “Bayes by Backprop”: maintains a variational distribution over each weight (rather than a point estimate), optimizing the variational free energy with reparameterized gradients — the first practical VI method (see VI above) for modern deep networks.
- Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning (2016) Yarin Gal, Zoubin Ghahramani — Reinterprets standard dropout training as approximate inference in a deep Gaussian process, enabling uncertainty estimates from any existing dropout network at test time with zero additional cost — far cheaper than Bayes by Backprop but less flexible.
- Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (2017) Balaji Lakshminarayanan, Alexander Pritzel, Charles Blundell — Proposes training multiple networks with random initialization as a non-Bayesian alternative for uncertainty; despite its simplicity, deep ensembles empirically match or outperform both MC Dropout and Bayes by Backprop on calibration and out-of-distribution detection.
- Bayesian Deep Learning and a Probabilistic Perspective of Generalization (2020) Andrew Gordon Wilson, Pavel Izmailov — Argues that deep ensembles succeed precisely because they approximate Bayesian model averaging, proposes MultiSWAG for cheaper within-basin marginalization, and shows that Bayesian averaging resolves pathologies like double descent.
- Bayesian Computation in Deep Learning (2025) Wenlong Chen, Bolian Li, Ruqi Zhang, Yingzhen Li — Recent review organizing the Bayesian deep learning toolbox around two computational pillars: SG-MCMC (see above) and VI, covering their challenges (multimodality, cold posteriors) and solutions specific to deep neural networks and deep generative models.
- See also: Bayesian Neural Networks in Neural Networks
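The deep-ensembles recipe is short enough to sketch end to end: train several networks that differ only in their random initialization, then read predictive uncertainty off their disagreement. The tiny tanh networks and all constants below are illustrative, not from the paper:

```python
import math
import random

random.seed(8)
xs = [random.uniform(-3, 3) for _ in range(40)]
ys = [math.sin(x) + random.gauss(0, 0.1) for x in xs]

H = 5  # hidden units

class TinyNet:
    """A 1-H-1 tanh network; random initialization is the only
    source of diversity across ensemble members."""
    def __init__(self):
        self.w1 = [random.gauss(0, 1) for _ in range(H)]
        self.b1 = [random.gauss(0, 1) for _ in range(H)]
        self.w2 = [random.gauss(0, 1) for _ in range(H)]
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(w * x + b) for w, b in zip(self.w1, self.b1)]
        return sum(v * hi for v, hi in zip(self.w2, h)) + self.b2, h

    def train(self, epochs=1500, lr=0.05):
        for _ in range(epochs):
            for x, y in zip(xs, ys):      # plain SGD on squared error
                pred, h = self.forward(x)
                err = pred - y
                for i in range(H):
                    dh = err * self.w2[i] * (1 - h[i] ** 2)
                    self.w2[i] -= lr * err * h[i]
                    self.w1[i] -= lr * dh * x
                    self.b1[i] -= lr * dh
                self.b2 -= lr * err

ensemble = [TinyNet() for _ in range(5)]
for net in ensemble:
    net.train()

def predict(x):
    preds = [net.forward(x)[0] for net in ensemble]
    mu = sum(preds) / len(preds)
    std = math.sqrt(sum((p - mu) ** 2 for p in preds) / len(preds))
    return mu, std

mu_in, std_in = predict(0.0)    # inside the training range
mu_out, std_out = predict(8.0)  # far outside it: members disagree more
```

Members agree where data pin them down and diverge where they extrapolate, which is the behavior Wilson & Izmailov interpret as approximate Bayesian model averaging over weight-space basins.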
Gaussian processes
Uncertainty calibration
- The Well-Calibrated Bayesian (1982) A.P. Dawid
- Transforming Classifier Scores into Accurate Multiclass Probability Estimates (2002) Bianca Zadrozny, Charles Elkan
- Predicting Good Probabilities With Supervised Learning (2005) Alexandru Niculescu-Mizil, Rich Caruana
- Nearly-Isotonic Regression (2011) Ryan J. Tibshirani, Holger Hoefling, Robert Tibshirani
- Binary classifier calibration using an ensemble of piecewise linear regression models (2012) Mahdi Pakdaman Naeini, Gregory F. Cooper
- Beta calibration: a well-founded and easily implemented improvement on logistic calibration for binary classifiers (2017) Meelis Kull, Telmo de Menezes e Silva Filho, Peter Flach
- Verified Uncertainty Calibration (2019) Ananya Kumar, Percy Liang, Tengyu Ma
- Improving Regression Uncertainty Estimates with an Empirical Prior (2020) Eric Zelikman, Christopher Healy
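The workhorse diagnostic behind most of these papers is the binned calibration error: group predictions by score and compare each bin's average score with its empirical frequency of positives. A sketch on synthetic data where the true probabilities are known and the "classifier" is deliberately overconfident (all constants illustrative):

```python
import random

random.seed(9)

def simulate(n=5000):
    """Binary task with known true P(y=1) and an overconfident score
    that pushes probabilities away from 0.5."""
    rows = []
    for _ in range(n):
        p_true = random.random()
        y = 1 if random.random() < p_true else 0
        conf = min(1.0, max(0.0, 0.5 + 1.4 * (p_true - 0.5)))
        rows.append((conf, p_true, y))
    return rows

def ece(scores_labels, n_bins=10):
    """Expected calibration error: |mean score - empirical frequency of
    y=1| per equal-width bin, weighted by bin size."""
    bins = [[] for _ in range(n_bins)]
    for s, y in scores_labels:
        bins[min(n_bins - 1, int(s * n_bins))].append((s, y))
    total, err = len(scores_labels), 0.0
    for b in bins:
        if b:
            avg_s = sum(s for s, _ in b) / len(b)
            freq = sum(y for _, y in b) / len(b)
            err += len(b) / total * abs(avg_s - freq)
    return err

rows = simulate()
ece_overconf = ece([(conf, y) for conf, _, y in rows])
ece_oracle = ece([(p, y) for _, p, y in rows])   # perfectly calibrated scores
```

The calibration methods above (isotonic, Platt/beta scaling, binning ensembles) are different ways of learning a monotone map from scores back toward the oracle column; Kumar et al. (2019) warn that this binned estimator itself can understate the true calibration error.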
Software
- Stan
- BUGS
- JAGS
- R
- Python
- Julia
- Javascript
- Web
- Installable