196 T. J. DICICCIO AND B. EFRON

Course reading material for "Practical Statistical Software": T. DiCiccio and B. Efron (1996), Bootstrap Confidence Intervals, Statistical Science, 11(3), 189-228


shows how to extend the skewness definition of $\hat{a}$ to multiparameter situations. This gives an estimate that is easy to evaluate, especially in exponential families, and that behaves well in practice. In fact $a$ is usually easier to estimate than $z_0$, despite the latter's simpler definition.

4. THE ABC METHOD

We now leave one-parameter families and return to the more complicated situations that bootstrap methods are intended to deal with. In many such situations it is possible to approximate the $BC_a$ interval endpoints analytically, entirely dispensing with Monte Carlo simulations. This reduces the computational burden by an enormous factor, and also makes it easier to understand how $BC_a$ improves upon the standard intervals. The ABC method ("ABC" standing for approximate bootstrap confidence intervals) is an analytic version of $BC_a$ applying to smoothly defined parameters in exponential families. It also applies to smoothly defined nonparametric problems, as shown in Section 6. DiCiccio and Efron (1992) introduced the ABC method, which is also discussed in Efron and Tibshirani (1993).

The $BC_a$ endpoints (2.3) depend on the bootstrap c.d.f. $\hat{G}$ and estimates of the two parameters $a$ and $z_0$. The ABC method requires one further estimate, of the nonlinearity parameter $c_q$, but it does not involve $\hat{G}$.

The standard interval (1.1) depends only on the two quantities $(\hat{\theta}, \hat{\sigma})$.
The ABC intervals depend on the five quantities $(\hat{\theta}, \hat{\sigma}, \hat{a}, \hat{z}_0, \hat{c}_q)$. Each of the three extra numbers $(\hat{a}, \hat{z}_0, \hat{c}_q)$ corrects a deficiency of the standard method, making the ABC intervals second-order accurate as well as second-order correct.

The ABC system applies within multiparameter exponential families, which are briefly reviewed below. This framework includes most familiar parametric situations: normal, binomial, Poisson, gamma, multinomial, ANOVA, logistic regression, contingency tables, log-linear models, multivariate normal problems, Markov chains and also nonparametric situations as discussed in Section 6.

The density function for a $p$-parameter exponential family can be written as

$$g_\mu(x) = \exp[\eta' y - \psi(\eta)] \tag{4.1}$$

where $x$ is the observed data and $y = Y(x)$ is a $p$-dimensional vector of sufficient statistics; $\eta$ is the $p$-dimensional natural parameter vector; $\mu$ is the expectation parameter $\mu = E_\mu\{y\}$; and $\psi(\eta)$, the cumulant generating function, is a normalizing factor that makes $g_\mu(x)$ integrate to 1.

The vectors $\mu$ and $\eta$ are in one-to-one correspondence so that either can be used to index functions of interest. In (4.1), for example, we used $\mu$ to index the densities $g$, but $\eta$ to index $\psi$. The ABC algorithm involves the mapping from $\eta$ to $\mu$, say

$$\mu = \mathrm{mu}(\eta), \tag{4.2}$$

which, fortunately, has a simple form in all of the common exponential families. Section 3 of DiCiccio and Efron (1992) gives function (4.2) for several families, as well as specifying the other inputs necessary for using the ABC algorithm.

The MLE of $\mu$ in (3.1) is $\hat{\mu} = y$, so that the MLE of a real-valued parameter of interest $\theta = t(\mu)$ is

$$\hat{\theta} = t(\hat{\mu}) = t(y). \tag{4.3}$$

As an example consider the bivariate normal model (1.2). Here $x = ((B_1, A_1), (B_2, A_2), \ldots, (B_{20}, A_{20}))$ and $y = \sum_{i=1}^{20} (B_i, A_i, B_i^2, B_i A_i, A_i^2)'/20$.
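The vector $y$ just defined is simply an average of five-component terms, one per observed pair. A minimal Python sketch of that calculation follows; the function name `sufficient_statistics` and the four toy pairs are hypothetical and are not the data of model (1.2).

```python
import numpy as np


def sufficient_statistics(b, a):
    """Average of (B_i, A_i, B_i^2, B_i*A_i, A_i^2) over the n observed pairs."""
    b = np.asarray(b, dtype=float)
    a = np.asarray(a, dtype=float)
    return np.array([b.mean(), a.mean(), (b**2).mean(), (b * a).mean(), (a**2).mean()])


# Hypothetical toy data: four pairs, not the 20 pairs of the paper's example.
b = [1.0, 2.0, 3.0, 4.0]
a = [2.0, 1.0, 4.0, 3.0]

# By (4.3), y is also the MLE of the expectation parameter mu.
y = sufficient_statistics(b, a)
print(y)
```

Because the function divides by the number of pairs supplied, the same code would cover the paper's case of 20 pairs.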
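The bivariate normal is a five-parameter exponential family with

$$\mu = (\lambda_1, \lambda_2, \lambda_1^2 + \Gamma_{11}, \lambda_1\lambda_2 + \Gamma_{12}, \lambda_2^2 + \Gamma_{22})'. \tag{4.4}$$

Thus the correlation coefficient is the function $t(\mu)$ given by

$$\theta = \frac{\mu_4 - \mu_1\mu_2}{[(\mu_3 - \mu_1^2)(\mu_5 - \mu_2^2)]^{1/2}}; \tag{4.5}$$

$\hat{\theta} = t(\hat{\mu})$ is seen to be the usual sample correlation coefficient.

We denote the $p \times p$ covariance matrix of $y$ by $\Sigma(\mu) = \mathrm{cov}_\mu\{y\}$, and let $\hat{\Sigma} = \Sigma(\hat{\mu})$, the MLE of $\Sigma$. The delta-method estimate of standard error for $\hat{\theta} = t(\hat{\mu})$ depends on $\hat{\Sigma}$. Let $\dot{t}$ denote the gradient vector of $\theta = t(\mu)$ at $\mu = \hat{\mu}$,

$$\dot{t} = \left(\ldots, \frac{\partial t}{\partial \mu_i}, \ldots\right)'_{\mu = \hat{\mu}}. \tag{4.6}$$

Then

$$\hat{\sigma} = (\dot{t}'\,\hat{\Sigma}\,\dot{t})^{1/2} \tag{4.7}$$

is the parametric delta-method estimate of standard error, and it is also the usual Fisher information standard error estimate.

The $\hat{\sigma}$ values for the standard intervals in Tables 2 and 3 were found by numerical differentiation, using

$$\left.\frac{\partial t}{\partial \mu_i}\right|_{\hat{\mu}} \doteq \frac{t(\hat{\mu} + \varepsilon e_i) - t(\hat{\mu} - \varepsilon e_i)}{2\varepsilon} \tag{4.8}$$

for a small value of $\varepsilon$, with $e_i$ the $i$th coordinate vector. The covariance matrix $\hat{\Sigma}$ is simple to calculate in most of the familiar examples, as shown in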
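Steps (4.5) through (4.8) can be sketched in Python for the correlation coefficient. This is an illustration under stated assumptions, not the authors' code: the data are simulated, all function names are hypothetical, and $\hat{\Sigma}$ is built from the standard second- and fourth-moment identities for a normal vector, a formula that the text above does not spell out.

```python
import numpy as np

# Monomials making up y: B, A, B^2, B*A, A^2 (index 0 = B, index 1 = A).
MONOMIALS = [(0,), (1,), (0, 0), (0, 1), (1, 1)]


def t_corr(mu):
    """Correlation coefficient as a function of the expectation vector, eq. (4.5)."""
    m1, m2, m3, m4, m5 = mu
    return (m4 - m1 * m2) / np.sqrt((m3 - m1**2) * (m5 - m2**2))


def numerical_gradient(t, mu_hat, eps=1e-5):
    """Central-difference approximation to the gradient of t at mu_hat, eq. (4.8)."""
    p = len(mu_hat)
    grad = np.empty(p)
    for i in range(p):
        e_i = np.zeros(p)
        e_i[i] = 1.0
        grad[i] = (t(mu_hat + eps * e_i) - t(mu_hat - eps * e_i)) / (2 * eps)
    return grad


def cov_monomials(a, b, lam, gam):
    """Covariance of two monomials (degree 1 or 2) of a normal vector.

    Assumption of this sketch: standard normal-moment identities,
    with mean vector lam and covariance matrix gam.
    """
    if len(a) == 1 and len(b) == 1:
        return gam[a[0], b[0]]
    if len(a) == 2 and len(b) == 1:
        a, b = b, a  # covariance is symmetric
    if len(a) == 1:
        i, (j, k) = a[0], b
        return lam[j] * gam[i, k] + lam[k] * gam[i, j]
    (i, j), (k, m) = a, b
    return (lam[i] * lam[k] * gam[j, m] + lam[i] * lam[m] * gam[j, k]
            + lam[j] * lam[k] * gam[i, m] + lam[j] * lam[m] * gam[i, k]
            + gam[i, k] * gam[j, m] + gam[i, m] * gam[j, k])


def sigma_matrix(mu, n):
    """Covariance matrix of y under the bivariate normal indexed by mu, via (4.4)."""
    lam = np.array([mu[0], mu[1]])
    g12 = mu[3] - mu[0] * mu[1]
    gam = np.array([[mu[2] - mu[0]**2, g12], [g12, mu[4] - mu[1]**2]])
    cov = np.array([[cov_monomials(a, b, lam, gam) for b in MONOMIALS]
                    for a in MONOMIALS])
    return cov / n  # y is an average of n independent terms


# Simulated, hypothetical data: 20 pairs (B_i, A_i).
rng = np.random.default_rng(0)
data = rng.multivariate_normal([1.0, 2.0], [[1.0, 0.6], [0.6, 2.0]], size=20)
b, a = data[:, 0], data[:, 1]
n = len(b)

mu_hat = np.array([b.mean(), a.mean(), (b**2).mean(), (b * a).mean(), (a**2).mean()])
theta_hat = t_corr(mu_hat)                                    # eq. (4.3)
t_dot = numerical_gradient(t_corr, mu_hat)                    # eqs. (4.6), (4.8)
sigma_hat = np.sqrt(t_dot @ sigma_matrix(mu_hat, n) @ t_dot)  # eq. (4.7)
print(theta_hat, sigma_hat)
```

For bivariate normal data the result can be checked against the closed form $(1 - \hat{\theta}^2)/\sqrt{n}$, which the delta method should reproduce up to the error of the finite-difference gradient.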
