Handbook of Markov Chain Monte Carlo (Chap. 1&5) Chang Liu 2014-11-17
Outline
• Chapter 1: Introduction to MCMC
• Chapter 5: MCMC using Hamiltonian dynamics
Introduction to Markov Chain Monte Carlo (Charles J. Geyer)
• History
• Markov Chains
• Intuitions of MCMC
• Elementary Theory of MCMC
• The Metropolis-Hastings-Green Algorithm
Introduction to MCMC
• Brief history
– The invention of the computer stimulated the development of simulation methods.
– Metropolis et al. (1953) simulated a liquid in equilibrium with its gas phase by a Markov chain.
– Hastings (1970) generalized the Metropolis algorithm; simulations following his scheme are said to use the Metropolis-Hastings algorithm.
– A special case of the Metropolis-Hastings algorithm was introduced by Geman and Geman (1984). Simulations following their scheme are said to use the Gibbs sampler.
– Green (1995) generalized the Metropolis-Hastings algorithm: the Metropolis-Hastings-Green algorithm.
Markov Chains
• Definition
– A sequence X_1, X_2, ... of random elements of some set is a Markov chain if the conditional distribution of X_{n+1} given X_1, ..., X_n depends on X_n only:
  P(X_{n+1} | X_1, ..., X_n) = P(X_{n+1} | X_n)
– State space S: the set in which the X_i take values.
– Transition probabilities: the conditional distribution of X_{n+1} given X_n. For finite S,
  p_ij = P(X_{n+1} = x_j | X_n = x_i), x_i, x_j ∈ S
  Stationary transition probabilities: the transition probabilities do not depend on n.
– Initial distribution: the marginal distribution of X_1.
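As a sketch of the definition, the chain can be simulated from a small hypothetical 3-state transition matrix (the matrix P below is invented for illustration; its rows index the current state and sum to 1, and not depending on the step n makes the transition probabilities stationary):

```python
import numpy as np

# Hypothetical 3-state transition matrix: row i gives the conditional
# distribution of X_{n+1} given X_n = i (the same at every step n).
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])

rng = np.random.default_rng(0)

def simulate_chain(P, x0, n_steps, rng):
    """Draw X_1, ..., X_n; X_{k+1} depends only on X_k (Markov property)."""
    states = [x0]
    for _ in range(n_steps - 1):
        # The next state is drawn from the row of P for the current state.
        states.append(rng.choice(len(P), p=P[states[-1]]))
    return states

chain = simulate_chain(P, x0=0, n_steps=10, rng=rng)
```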
Markov Chains
• Stationarity
– A stochastic process is stationary if for every positive integer k the distribution of the k-tuple (X_{n+1}, ..., X_{n+k}) does not depend on n. A Markov chain is stationary if it is a stationary stochastic process.
– An initial distribution is said to be stationary or invariant or equilibrium for some transition probability distribution if the Markov chain specified by this initial distribution and transition probability distribution is stationary.
– Stationarity implies stationary transition probabilities, but not vice versa.
– MCMC samples the equilibrium distribution.
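An invariant initial distribution can be found and checked numerically. A minimal sketch, assuming the small hypothetical transition matrix below: the invariant distribution π satisfies πP = π, i.e. π is a left eigenvector of P with eigenvalue 1, so if X_1 ~ π then every X_n ~ π and the chain is stationary.

```python
import numpy as np

# Hypothetical finite-state transition matrix (rows sum to 1).
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])

# Invariant distribution: left eigenvector of P for eigenvalue 1,
# i.e. eigenvector of P.T; normalize it to sum to 1.
evals, evecs = np.linalg.eig(P.T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi = pi / pi.sum()

# If X_1 ~ pi then X_2 ~ pi P = pi: the chain started at pi is stationary.
assert np.allclose(pi @ P, pi)
```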
Markov Chains
• Reversibility (detailed balance)
– A transition probability distribution is reversible with respect to an initial distribution if, for the Markov chain X_1, X_2, ... they specify, the distribution of the pairs (X_i, X_{i+1}) is exchangeable:
  P(X_i = x, X_{i+1} = y) = P(X_i = y, X_{i+1} = x), ∀x, y ∈ S
– Reversibility implies stationarity, but not vice versa (marginalize w.r.t. y: P(X_i = x) = P(X_{i+1} = x), ∀x ∈ S).
– Reversibility plays two roles in Markov chain theory: the elementary transition probabilities constructed by all known methods that preserve a specified equilibrium distribution are reversible, and reversibility makes the Markov chain CLT much sharper and its conditions much simpler.
• Functionals
– Functional g: S → R. The sequence g(X_1), g(X_2), ... is usually not a Markov chain.
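Detailed balance can be verified numerically: in the finite case it reads π_x p(x, y) = π_y p(y, x) for all x, y, i.e. the matrix with entries π_x p(x, y) is symmetric. A minimal sketch, using an invented birth-death chain (such chains are always reversible with respect to their stationary distribution):

```python
import numpy as np

# Hypothetical birth-death chain on {0, 1, 2}: only neighbor moves allowed.
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])
pi = np.array([0.25, 0.50, 0.25])   # its stationary distribution

# Detailed balance: pi_x * P(x, y) == pi_y * P(y, x) for all x, y,
# i.e. the "flow" matrix below is symmetric.
flow = pi[:, None] * P
assert np.allclose(flow, flow.T)    # reversible

# Reversibility implies stationarity: summing over x gives pi P = pi.
assert np.allclose(pi @ P, pi)
```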
Intuitions of MCMC
• Ordinary Monte Carlo
– i.i.d. sequence X_1, X_2, ...: a special case of MCMC, stationary and reversible.
– Estimator for μ = E[g(X)]:
  μ̂_n = (1/n) Σ_{i=1}^n g(X_i)
– Variance of the estimator: σ²/n with σ² = Var(g(X)) (from the CLT), estimated by
  σ̂² = (1/n) Σ_{i=1}^n (g(X_i) − μ̂_n)²
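The ordinary Monte Carlo estimator and its CLT standard error can be sketched as follows; the choice g(x) = x² with X ~ N(0, 1) is an invented example, for which the true value is E[g(X)] = E[X²] = 1:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# i.i.d. draws X_1, ..., X_n and the functional g(x) = x**2.
X = rng.standard_normal(n)
g = X**2

mu_hat = g.mean()                        # mu_hat_n = (1/n) sum g(X_i)
sigma2_hat = ((g - mu_hat) ** 2).mean()  # estimate of sigma^2 = Var(g(X))
std_err = np.sqrt(sigma2_hat / n)        # CLT: Var(mu_hat_n) = sigma^2 / n
```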
Intuitions of MCMC
• MCMC
– Construct a stationary Markov chain X_1, X_2, ... with the desired equilibrium distribution.
– Estimator for μ = E[g(X)]: the same, μ̂_n = (1/n) Σ_{i=1}^n g(X_i).
– Variance of the estimator: σ²/n, now with
  σ² = Var(g(X_i)) + 2 Σ_{k=1}^∞ Cov(g(X_i), g(X_{i+k}))
  (does not depend on i due to stationarity).
– Autocovariance function and autocorrelation function: k ↦ γ_k = Cov(g(X_i), g(X_{i+k})) and k ↦ ρ_k = γ_k/γ_0, estimated by
  γ̂_k = (1/n) Σ_{i=1}^{n−k} (g(X_i) − μ̂_n)(g(X_{i+k}) − μ̂_n)
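The autocovariance-based variance above can be sketched in code. This is a toy implementation that simply truncates the sum at K lags; real MCMC codes choose the truncation point adaptively (e.g. by initial-sequence methods), and for an i.i.d. sequence all γ_k with k ≥ 1 are near zero, so σ² reduces to γ_0:

```python
import numpy as np

def autocovariance(g_vals, k):
    """Empirical gamma_k = (1/n) sum_{i=1}^{n-k} (g_i - mu)(g_{i+k} - mu)."""
    n = len(g_vals)
    mu = g_vals.mean()
    return np.sum((g_vals[: n - k] - mu) * (g_vals[k:] - mu)) / n

def mcmc_sigma2(g_vals, K):
    """sigma^2 = gamma_0 + 2 * sum_{k=1}^K gamma_k (truncated at K lags)."""
    return autocovariance(g_vals, 0) + 2 * sum(
        autocovariance(g_vals, k) for k in range(1, K + 1))

# Sanity check on an i.i.d. N(0, 1) sequence: sigma^2 should be ~gamma_0 ~ 1.
g_vals = np.random.default_rng(1).standard_normal(50_000)
```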
Intuitions of MCMC
• AR(1) Example
– Autoregressive: X_{n+1} = ρ X_n + Y_n, where the Y_n are i.i.d. N(0, τ²).
– Stationarity requires Var(X_{n+1}) = Var(X_n) = ρ² Var(X_n) + Var(Y_n), i.e. Var(X_n) = τ²/(1 − ρ²), so ρ² < 1.
– γ_k = Cov(X_i, X_{i+k}) = ρ^k τ²/(1 − ρ²), hence σ² = γ_0 (1 + 2 Σ_{k=1}^∞ ρ^k) = γ_0 (1 + ρ)/(1 − ρ).
– Estimate EX: μ̂_n = (1/n) Σ_{i=1}^n X_i, with μ̂_n ≈ N(μ, σ²/n).
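The AR(1) formulas can be checked by simulation. A sketch with the invented choices ρ = 0.5, τ = 1, starting the chain at stationarity so that every X_i has the same marginal N(0, γ_0):

```python
import numpy as np

rng = np.random.default_rng(0)
rho, tau = 0.5, 1.0
n = 200_000

# Stationary variance gamma_0 = tau^2 / (1 - rho^2) and asymptotic
# variance sigma^2 = gamma_0 * (1 + rho) / (1 - rho) for mu_hat_n.
gamma0 = tau**2 / (1 - rho**2)
sigma2 = gamma0 * (1 + rho) / (1 - rho)

# Simulate X_{n+1} = rho * X_n + Y_n with X_1 ~ N(0, gamma0).
X = np.empty(n)
X[0] = rng.normal(0.0, np.sqrt(gamma0))
for i in range(1, n):
    X[i] = rho * X[i - 1] + rng.normal(0.0, tau)
```

The empirical variance of the X_i should match γ_0, the empirical lag-1 autocovariance should match ργ_0, and the mean of the chain should be close to 0 with standard error √(σ²/n).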