194 T. J. DICICCIO AND B. EFRON

formula (4.9) of Section 4, is moderately large. Suppose we think we have moved 1.645 standard errors to the right of $\hat\phi$, to

$$\tilde\phi = \hat\phi + 1.645\,\sigma_{\hat\phi}.$$

Actually though, with $a = 0.105$,

$$\sigma_{\tilde\phi} = (1 + 1.645a)\,\sigma_{\hat\phi} = 1.173\,\sigma_{\hat\phi},$$

according to (2.5). For calculating a confidence level, $\tilde\phi$ is really only $1.645/1.173 = 1.40$ standard errors to the right of $\hat\phi$, considerably less than 1.645. Formula (2.3) automatically corrects for an accelerating standard error. The next section gives a geometrical interpretation of $a$, and also of the $BC_a$ formula (2.3).

The peculiar-looking formula (2.3) for the $BC_a$ endpoints is designed to give exactly the right answer in situation (2.5), and to give it automatically in terms of the bootstrap distribution of $\hat\theta^*$. Notice, for instance, that the normalizing transformation $\hat\phi = m(\hat\theta)$ is not required in (2.3).
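The arithmetic for $a = 0.105$ earlier on this page can be rechecked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the conversion of the two z-values to one-sided normal confidence levels is an added illustration rather than a figure quoted in the text.

```python
from statistics import NormalDist

a = 0.105   # acceleration value quoted in the text
z = 1.645   # nominal move, in standard errors of phi-hat

# Standard error at the new point relative to the standard error at phi-hat
inflation = 1 + z * a
# Number of standard errors actually moved, measured at the new point
effective_z = z / inflation

# Added illustration (assumption): one-sided normal levels for both z-values
nominal_level = NormalDist().cdf(z)
effective_level = NormalDist().cdf(effective_z)

print(f"inflation factor        = {inflation:.3f}")
print(f"effective standard errs = {effective_z:.2f}")
print(f"nominal level           = {nominal_level:.3f}")
print(f"effective level         = {effective_level:.3f}")
```

The gap between the nominal and effective levels is the kind of discrepancy that the acceleration correction in (2.3) is meant to remove.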
By comparison, the standard interval works perfectly only under the more restrictive assumption that

$$\hat\theta \sim N(\theta, \sigma^2), \tag{2.9}$$

with $\sigma^2$ constant. In practice we do not expect either (2.9) or (2.5) to hold exactly, but the broader assumptions (2.5) are likely to be a better approximation to the truth. They produce intervals that are an order of magnitude more accurate, as shown in Section 8.

Formula (2.5) generalizes (2.9) in three ways, by allowing bias, nonconstant standard error and a normalizing transformation. These three extensions are necessary and sufficient to give second-order accuracy,

$$\text{Prob}\{\theta < \hat\theta_{BC_a}[\alpha]\} = \alpha + O(1/n), \tag{2.10}$$

compared with $\text{Prob}\{\theta < \hat\theta_{STAN}[\alpha]\} = \alpha + O(1/\sqrt{n})$, where $n$ is the sample size in an i.i.d. sampling situation. This result is stated more carefully in Section 8, which also shows the second-order correctness of the $BC_a$ intervals. Hall (1988) was the first to establish (2.10).

The $BC_a$ intervals are transformation invariant. If we change the parameter of interest from $\theta$ to some monotone function of $\theta$, $\phi = m(\theta)$, likewise changing $\hat\theta$ to $\hat\phi = m(\hat\theta)$ and $\hat\theta^*$ to $\hat\phi^* = m(\hat\theta^*)$, then the $\alpha$-level $BC_a$ endpoints change in the same way,

$$\hat\phi_{BC_a}[\alpha] = m(\hat\theta_{BC_a}[\alpha]). \tag{2.11}$$

The standard intervals are not transformation invariant, and this accounts for some of their practical difficulties. It is well known, for instance, that normal-theory standard intervals for the correlation coefficient are much more accurate if constructed on the scale $\phi = \tanh^{-1}(\theta)$ and then transformed back to give an interval for $\theta$ itself. Transformation invariance means that the $BC_a$ intervals cannot be fooled by a bad choice of scale. To put it another way, the statistician does not have to search for a transformation like $\tanh^{-1}$ in applying the $BC_a$ method.

In summary, $BC_a$ produces confidence intervals for $\theta$ from the bootstrap distribution of $\hat\theta^*$, requiring on the order of 2,000 bootstrap replications $\hat\theta^*$.
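The transformation invariance in (2.11) can be illustrated numerically. Formula (2.3) is not reproduced on this page, so the sketch below assumes the commonly stated form of the endpoint, $\hat\theta_{BC_a}[\alpha] = \hat G^{-1}\big(\Phi\big(z_0 + (z_0 + z^{(\alpha)})/(1 - a(z_0 + z^{(\alpha)}))\big)\big)$, with $z_0$ taken from the proportion of replications below $\hat\theta$. The function name `bca_endpoint`, the value `a = 0.05`, the simulated replications, and the order-statistic quantile rule are all hypothetical choices made for illustration.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def bca_endpoint(reps, theta_hat, a, alpha):
    """alpha-level BCa endpoint from bootstrap replications (assumed form of (2.3))."""
    B = len(reps)
    ordered = sorted(reps)
    # Bias correction z0: normal quantile of the share of replications below theta_hat
    z0 = STD_NORMAL.inv_cdf(sum(r < theta_hat for r in reps) / B)
    z_alpha = STD_NORMAL.inv_cdf(alpha)
    # Adjusted percentile level, then the matching order statistic
    level = STD_NORMAL.cdf(z0 + (z0 + z_alpha) / (1 - a * (z0 + z_alpha)))
    k = min(B - 1, max(0, math.ceil(B * level) - 1))
    return ordered[k]

random.seed(0)
theta_hat = 0.5   # hypothetical observed correlation
a = 0.05          # hypothetical acceleration, identical on both scales
# Simulated stand-ins for 2,000 bootstrap replications of a correlation
reps = [math.tanh(math.atanh(theta_hat) + random.gauss(0, 0.3)) for _ in range(2000)]

theta_end = bca_endpoint(reps, theta_hat, a, 0.95)
phi_end = bca_endpoint([math.atanh(r) for r in reps], math.atanh(theta_hat), a, 0.95)

print(theta_end, math.tanh(phi_end))
```

Because a monotone map preserves both the ordering of the replications and the count below $\hat\theta$, the same order statistic is selected on either scale, so the endpoint on the $\tanh^{-1}$ scale maps back to the endpoint on the original scale.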
These intervals are transformation invariant and exactly correct under the normal transformation model (2.5); in general they are second-order accurate and correct.

3. THE ACCELERATION a

The acceleration parameter $a$ appearing in the $BC_a$ formula (2.3) looks mysterious. Its definition in (2.5) involves an idealized transformation to normality which will not be known in practice. Fortunately $a$ enjoys a simple relationship with Fisher's score function which makes it easy to estimate. This section describes the relationship in the context of one-parameter families. In doing so it also gives us better motivation for the peculiar-looking $BC_a$ formula (2.3).

Suppose then that we have a one-parameter family of c.d.f.'s $G_\theta(\hat\theta)$ on the real line, with $\hat\theta$ being an estimate of $\theta$. In the relationships below we assume that $\hat\theta$ behaves asymptotically like a maximum likelihood estimator, with respect to a notional sample size $n$, as made explicit in (5.3) of Efron (1987). As a particular example, we will consider the case

$$\hat\theta \sim \theta\,\frac{\text{Gamma}_n}{n}, \qquad n = 10, \tag{3.1}$$

where $\text{Gamma}_n$ indicates a standard gamma variate with density $t^{n-1}\exp\{-t\}/\Gamma(n)$ for $t > 0$.

Having observed $\hat\theta$, we wonder with what confidence we can reject a trial value $\theta_0$ of the parameter $\theta$. In the gamma example (3.1) we might have

$$\hat\theta = 1 \quad\text{and}\quad \theta_0 = 1.5. \tag{3.2}$$

The easy answer from the bootstrap point of view is given in terms of the bootstrap c.d.f. $\hat G(c) = G_{\hat\theta}(c)$. We can define the bootstrap confidence value to be

$$\tilde\alpha = \hat G(\theta_0) = G_{\hat\theta}(\theta_0). \tag{3.3}$$

However, this will usually not agree with the more familiar hypothesis-testing confidence level for a one-parameter problem, say

$$\hat\alpha = 1 - G_{\theta_0}(\hat\theta), \tag{3.4}$$
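The two confidence values (3.3) and (3.4) can be evaluated directly for the gamma example (3.1) and (3.2). The sketch below is an illustration under stated assumptions: it uses the closed-form c.d.f. of a gamma variate with integer shape (the Erlang form), and the helper names `gamma_cdf_int` and `G` are hypothetical.

```python
import math

def gamma_cdf_int(x, n):
    """P(Gamma_n <= x) for integer shape n and unit scale (Erlang form)."""
    if x <= 0:
        return 0.0
    # 1 - sum_{k=0}^{n-1} exp(-x) x^k / k!, with terms built up recursively
    term = math.exp(-x)
    total = term
    for k in range(1, n):
        term *= x / k
        total += term
    return 1.0 - total

def G(c, theta, n):
    """c.d.f. at c of theta_hat ~ theta * Gamma_n / n, as in (3.1)."""
    return gamma_cdf_int(n * c / theta, n)

n = 10
theta_hat, theta_0 = 1.0, 1.5                 # values from (3.2)

alpha_tilde = G(theta_0, theta_hat, n)        # bootstrap confidence value (3.3)
alpha_hat = 1.0 - G(theta_hat, theta_0, n)    # hypothesis-testing level (3.4)

print(f"alpha_tilde = {alpha_tilde:.3f}")
print(f"alpha_hat   = {alpha_hat:.3f}")
```

Under these assumptions the two values come out near 0.93 and 0.86 respectively, a concrete instance of the disagreement between (3.3) and (3.4).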