1
Markov chain Monte Carlo (MCMC) simulations
Christopher B. Germann (Ph.D., M.Sc., B.Sc. / Marie Curie Alumnus)
2018
URL: https://christopher-germann.de
Markov Chain Monte Carlo methods have transformed science, particularly due
to their relevance for probabilistic logical inferences. Monte Carlo methods
(sometimes termed Monte Carlo experiments) are a class of computational algo-
rithms that utilise repeated random sampling to calculate their statistical results.
The theoretical considerations behind Monte Carlo methods are old, for an ex-
ample see the Buffon-Laplace need problem (Buffon, 1777). The appellation “Monte
Carlo” is frequently attributed to the quantum physicist John von Neumann (inter
alia) and Monte Carlo methods have been discussed in many diverse scientific
contexts (Behrends, 2014; Dell & Franklin, 2009; Diaconis, 1976). However, they are
generally primarily applied to three broad classes of problems: mathematical op-
timization, numerical integration, and the generation of samples from probability
distributions (Kroese, Brereton, Taimre, & Botev, 2014).
Markov chains are based on the stochastic foundations developed by the Russian
mathematician Andreyevich Markov who extended the central limit theorem to
sequences of random variables (hence the eponymous name “Markov chains”). In
1912, his work was translated into German under the title Wahrscheinlich-
keitsrechnung, that is “probability calculus” (Markov, 1912). For a historical over-
view see Basharin, Langville, & Naumov (2004).
Markov chain Monte Carlo methods are based on the seminal work of the physi-
cist Nicholas Metropolis et alia (Metropolis, Rosenbluth, Rosenbluth, & Teller,
1953) who initially utilised the approach to simulated energy levels of atoms in
crystalline structures
1
. The concept was later generalised by the statistician W. K.
1
Currently, cutting-edge Monte Carlo-based methods are applied to simulate complex quantum systems. The
application of Monte Carlo methods to quantum physics is termed “Quantum Monte Carlo” (QMC). QCM has
been used to approximate the statistical properties of bosons and fermions and it comes in various flavours.
However, its foundation is based on the original seminal publication by Metropolis et al. (1953) titled “Equation
of State Calculations by Fast Computing Machines” in which the authors suggested that Monte Carlo methods
2
Hastings (1970). MCMC methods became more popular in the context of Bayesian
inference due to the work of Gelfand and Smith (1990) who emphasised the rele-
vance of MCMC methods to calculating Bayesian posterior densities. Recent com-
scientific disciplines (Diaconis, 2008; Dunn & Shultis, 2012). Today, the Metropo-
lisHastings algorithm is one of the most widely used MCMC algorithm in statis-
tics and in statistical physics. The MetropolisHastings algorithm comprises the
construction of a Markov process which, in theory, results in a unique and sta-
tionary distribution. For a detailed mathematical derivation of the algorithm see
Robert and Casella (2004). Gibbs sampling can be regarded as a special case of the
MetropolisHastings algorithm. Hence, MCMC methods based on the Gibbs sam-
pling enables the analyst to draw inferences from iterative simulations (Gelman &
Rubin, 1992). The quality of MCMC sample improves as a function of the number
of steps. MCMC methods thus provide sampling probability density functions
which can be utilised to obtain parameter estimates with their probabilistic asso-
ciated uncertainties. To be specific, the goal of the MCMC method is to obtain a
large set of samples (θ
l
, l = 1, …, L) from the posterior distribution in order to be in
a position to estimate the parameters in question with reasonable accuracy. As a
general heuristic rule, it has been stated that L =100 is sufficient for reasonable
posterior estimates. (Gelman, Carlin, Stern, & Rubin, 2004). A sample that is too
small can lead to inaccurate statistical inference. Stopping rules have been devel-
oped (Gong & Flegal, 2016) and the size of the Monte Carlo sample effects the
standard error. Several convergence diagnostics have been developed. The Monte
Carlo Error (MCE) is the uncertainty which can be attributed to the fact that the
number of simulation draws is always finite. In other words, it provides a quanti-
tative index that represents the quality of parameter estimates. For more infor-
mation on the Markov chain central limit theorem see Jones (2004). The MCSE
software package in R (which is freely available online) provides
convenient tools for computing Monte Carlo standard errors and the effective
sample size (James Flegal, Hughes, Vats, Dai, & Dootika Vats, 2017). Notice that
relatively small MCSEs indicate high estimation precision level. The main idea is
to terminate the simulation when an estimate is sufficiently accurate for the sci-
entific purpose of the analysis. Many practitioners utilize convergence diagnos-
tics and visual inspections to evaluate if the chain has been run long enough.
could in principle be utilised for “calculating the properties of any substance” (p.1087). Simulated Quantum An-
nealing (SQA) is a Markov Chain Monte-Carlo algorithm that has been utilised to simulating quantum anneal-
ing. In addition, SQA can be applied to multidimensional combinatorial optimization problems and it has been
argued that it inherits some of the advantages of quantum tunnelling (but see Crosson & Harrow, 2016).
3
In sum, MCMC are utilised to estimate characteristics of a posterior distribution
by constructing a Markov chain with the target as its equilibrium distribution.
References
Behrends, E. (2014). Buffon: Hat er Stöckchen geworfen oder hat er nicht?
RETROSPEKTIVE, 22, 5052. https://doi.org/10.1515/dmvm-2014-0022
Buffon, G. (1777). Essai darithmétique morale. Histoire Naturelle, Générale Er
Particulière, (Supplément 4), 46123.
Crosson, E., & Harrow, A. W. (2016). Simulated Quantum Annealing Can Be
Exponentially Faster Than Classical Simulated Annealing. In Proceedings -
Annual IEEE Symposium on Foundations of Computer Science, FOCS (Vol.
2016-Decem, pp. 714723). https://doi.org/10.1109/FOCS.2016.81
Dell, Z. E., & Franklin, S. V. (2009). The Buffon-Laplace needle problem in three
dimensions. Journal of Statistical Mechanics: Theory and Experiment,
2009(9). https://doi.org/10.1088/1742-5468/2009/09/P09010
Diaconis, P. (1976). Buffons problem with a long needle. Journal of Applied
Probability, 13(3), 614618.
Diaconis, P. (2008). The Markov chain Monte Carlo revolution. Bulletin of the
American Mathematical Society, 46(2), 179205.
https://doi.org/10.1090/S0273-0979-08-01238-X
Dunn, W. L., & Shultis, J. K. (2012). Exploring Monte Carlo Methods. Exploring
Monte Carlo Methods. https://doi.org/10.1016/C2009-0-16850-2
Gelfand, A. E., & Smith, A. F. M. (1990). Sampling-based approaches to calculating
marginal densities. Journal of the American Statistical Association, 85(410),
398409. https://doi.org/10.1080/01621459.1990.10476213
Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (2004). Bayesian Data Analysis.
Chapman Texts in Statistical Science Series.
https://doi.org/10.1007/s13398-014-0173-7.2
Gelman, A., & Rubin, D. B. (1992). Inference from Iterative Simulation Using
Multiple Sequences. Statistical Science, 7(4), 457472.
https://doi.org/10.1214/ss/1177011136
Gong, L., & Flegal, J. M. (2016). A Practical Sequential Stopping Rule for High-
Dimensional Markov Chain Monte Carlo. Journal of Computational and
Graphical Statistics, 25(3), 684700.
https://doi.org/10.1080/10618600.2015.1044092
4
Hastings, W. K. (1970). Monte carlo sampling methods using Markov chains and
their applications. Biometrika, 57(1), 97109.
https://doi.org/10.1093/biomet/57.1.97
James Flegal, A. M., Hughes, J., Vats, D., Dai, N., & Dootika Vats, M. (2017). Title