Volume 16, Issue 2
On Stochastic Error and Computational Efficiency of the Markov Chain Monte Carlo Method

Jun Li, Philippe Vignal, Shuyu Sun & Victor M. Calo

Commun. Comput. Phys., 16 (2014), pp. 467-490.

Published online: 2014-08

  • Abstract

In Markov Chain Monte Carlo (MCMC) simulations, thermal equilibrium quantities are estimated by ensemble averaging over a sample set containing a large number of correlated samples. These samples are selected in accordance with the probability distribution function, known from the partition function of the equilibrium state. As the stochastic error of the simulation results is significant, it is desirable to understand the variance of the ensemble-average estimate, which depends on the sample size (i.e., the total number of samples in the set) and the sampling interval (i.e., the number of cycles between two consecutive samples). Although large sample sizes reduce the variance, they increase the computational cost of the simulation. For a given CPU time, the sample size can be reduced greatly by increasing the sampling interval, while the corresponding increase in variance remains negligible if the original sampling interval is very small. In this work, we report a few general rules that relate the variance to the sample size and the sampling interval. These results are observed and confirmed numerically. The variance rules are derived for the MCMC method but are also valid for correlated samples obtained using other Monte Carlo methods. The main contributions of this work are the theoretical proof of these numerical observations and the set of assumptions that lead to them.
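The trade-off between sample size and sampling interval described in the abstract can be illustrated with a small calculation. The sketch below is not the paper's derivation: it assumes, purely for illustration, that the correlated samples behave like a stationary AR(1) sequence whose autocorrelation at lag k is `rho**k`, and it evaluates the standard expression for the variance of the mean of a stationary sequence. The function name `variance_of_mean` and the values of `rho` and `total_cycles` are hypothetical choices.

```python
def variance_of_mean(n_samples, interval, rho, sigma2=1.0):
    """Variance of the sample mean of n_samples values taken every
    `interval` cycles from a stationary chain.

    Assumption (not from the paper): the chain is AR(1)-like, so the
    autocorrelation between states k cycles apart is rho**k.
    """
    r = rho ** interval  # correlation between consecutive kept samples
    acc = 0.0
    term = 1.0
    for k in range(1, n_samples):
        term *= r  # r**k, built up as a running product
        acc += (1.0 - k / n_samples) * term
    # Var(mean) = sigma2 / N * (1 + 2 * sum_{k=1}^{N-1} (1 - k/N) * r**k)
    return sigma2 / n_samples * (1.0 + 2.0 * acc)


# Fixed simulation length (a proxy for fixed CPU time), varying interval.
total_cycles = 100_000
rho = 0.99
baseline = variance_of_mean(total_cycles, 1, rho)
for interval in (1, 10, 100, 1000):
    n = total_cycles // interval
    v = variance_of_mean(n, interval, rho)
    print(f"interval={interval:5d}  samples={n:6d}  "
          f"variance={v:.6e}  ratio={v / baseline:.3f}")
```

Under these assumed values, keeping one sample in ten cuts the sample size tenfold while raising the variance by well under one percent, whereas a much longer interval (1000 cycles here) discards useful information and increases the variance several times over.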

  • Copyright

COPYRIGHT: © Global Science Press

  • BibTex
  • RIS
  • TXT
@Article{CiCP-16-467,
  author  = {Li, Jun and Vignal, Philippe and Sun, Shuyu and Calo, Victor M.},
  title   = {On Stochastic Error and Computational Efficiency of the Markov Chain Monte Carlo Method},
  journal = {Communications in Computational Physics},
  year    = {2014},
  volume  = {16},
  number  = {2},
  pages   = {467--490},
  issn    = {1991-7120},
  doi     = {10.4208/cicp.110613.280214a},
  url     = {http://global-sci.org/intro/article_detail/cicp/7050.html}
}
TY  - JOUR
AU  - Li, Jun
AU  - Vignal, Philippe
AU  - Sun, Shuyu
AU  - Calo, Victor M.
T1  - On Stochastic Error and Computational Efficiency of the Markov Chain Monte Carlo Method
JO  - Communications in Computational Physics
VL  - 16
IS  - 2
SP  - 467
EP  - 490
PY  - 2014
DA  - 2014/08
SN  - 1991-7120
DO  - 10.4208/cicp.110613.280214a
UR  - https://global-sci.org/intro/article_detail/cicp/7050.html
ER  -

Jun Li, Philippe Vignal, Shuyu Sun & Victor M. Calo. (2014). On Stochastic Error and Computational Efficiency of the Markov Chain Monte Carlo Method. Communications in Computational Physics. 16 (2). 467-490. doi:10.4208/cicp.110613.280214a