Therefore, with probability one

$$\liminf_{n\to\infty}\,\max_{0\le\theta\le 1}\frac{1}{n}\,l_n(\theta)\;\ge\;\log\frac{1}{2}\;+\;\liminf_{n\to\infty}\frac{1}{n}\log\frac{1-M_n}{\delta(M_n)}.$$

Whatever be the value of θ, M_n converges to 1 at a certain rate, the slowest rate being for the triangular (θ = 0), since this distribution has smaller mass than any of the others in sufficiently small neighborhoods of 1. Thus we can choose δ(θ) → 0 so fast as θ → 1 that (1/n) log((1 − M_n)/δ(M_n)) → ∞ with probability one for the triangular, and hence for all other possible true values of θ, completing the proof.

How fast is fast enough? Take θ = 0 and note that if 0 < ε < 1,

$$\begin{aligned}\sum_n P_0\bigl(n^{1/4}(1-M_n)>\epsilon\bigr)&=\sum_n P_0\bigl(M_n<1-\epsilon n^{-1/4}\bigr)\\&=\sum_n P_0\bigl(X<1-\epsilon n^{-1/4}\bigr)^n\\&=\sum_n\bigl(1-\tfrac12\epsilon^2 n^{-1/2}\bigr)^n\\&\le\sum_n\exp\bigl(-\tfrac12\epsilon^2\sqrt{n}\bigr)<\infty,\end{aligned}$$

so that by the Borel-Cantelli Lemma, n^{1/4}(1 − M_n) → 0 with probability one. Therefore, the choice

$$\delta(\theta)=(1-\theta)\exp\bigl(-(1-\theta)^{-4}+1\bigr)$$

gives a δ(θ) that is continuous, decreasing, with δ(0) = 1, 0 < δ(θ) < 1 − θ for 0 < θ < 1, and

$$\frac{1}{n}\log\frac{1-M_n}{\delta(M_n)}=\frac{1}{n(1-M_n)^4}-\frac{1}{n}\to\infty$$

with probability one.
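A minimal numerical check of these two rates follows; it is a sketch, not part of the paper. The Section 2 family is not restated on this page, so the sketch assumes only a θ = 0 density whose upper tail matches the one used in the sum above, namely the triangular T(x) = 1 − |x| on [−1, 1], for which P₀(X > 1 − t) = t²/2.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_triangular(n):
    """Inverse-CDF sampling from T(x) = 1 - |x| on [-1, 1]; its upper tail
    satisfies P(X > 1 - t) = t**2 / 2, the bound used in the
    Borel-Cantelli sum above."""
    u = rng.uniform(size=n)
    return np.where(u < 0.5, np.sqrt(2 * u) - 1, 1 - np.sqrt(2 * (1 - u)))

for n in [10**2, 10**3, 10**4, 10**5, 10**6]:
    m_n = sample_triangular(n).max()
    rate = n ** 0.25 * (1 - m_n)               # -> 0 with probability one
    # equals (1/n) log((1 - M_n)/delta(M_n)) for the delta chosen above
    blowup = 1 / (n * (1 - m_n) ** 4) - 1 / n  # -> infinity
    print(f"n = {n:>7}   n^(1/4)(1 - M_n) = {rate:7.4f}   "
          f"(1/n) log((1 - M_n)/delta(M_n)) = {blowup:12.1f}")
```

The first column shrinks with n while the second grows without bound, matching the two almost-sure limits just established.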
Although the maximum likelihood method fails asymptotically in this example, other methods of estimation can yield consistent estimates. Bayes methods, for example, would be strongly consistent for almost all θ with respect to the prior distribution, as implied by a general argument of Doob (1948). Simpler computationally, but not generally as accurate, are the estimates given by the method of moments or minimum χ² based on a finite number of cells, and such methods can be made to yield consistent estimates. Estimates that are consistent may also be constructed by the minimum distance method of Wolfowitz (1957).

If one simple condition were added to conditions 1 through 4 of the introduction, the argument of Wald (1949) would imply the strong consistency of the maximum likelihood estimates. This is a uniform boundedness condition that may be stated as follows: Let θ₀ denote the true value of the parameter. Then the maximum likelihood estimate θ̂_n converges to θ₀ with probability one provided conditions 1 through 4 hold and

5. There is a function K(x) ≥ 0 with finite expectation, E_{θ₀}K(X) = ∫ K(x) f(x|θ₀) dx < ∞, such that log(f(x|θ)/f(x|θ₀)) ≤ K(x) for all x and all θ.

(To get global consistency this assumption must be made for all θ₀ ∈ Θ, but K(x) may depend on θ₀.) This condition is therefore not satisfied in the example. It would be satisfied if the parameter space were limited to, say, [0, 1 − ε], since the density would then be bounded.

3. A DIFFERENTIABLE MODIFICATION

Without much difficulty, this example can be modified so that the densities satisfy Cramér's conditions for the existence of an asymptotically efficient sequence of roots of the likelihood equation. This amounts to modifying the distributions so that the resulting density, f(x|θ), (a) has two continuous derivatives that may be passed beneath the integral sign in ∫ f(x|θ) dx = 1, (b) has finite and positive Fisher information at all points θ interior to Θ, and (c) satisfies |∂²f(x|θ)/∂θ²| ≤ K(x) in some neighborhood of the true θ₀, where K(x) is θ₀-integrable. The simplest modification is to use the family of beta densities on [0, 1] as follows. Let g denote the density of the Be(α, β) distribution,

$$g(x\mid\alpha,\beta)=\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\,x^{\alpha-1}(1-x)^{\beta-1}\,I_{[0,1]}(x),$$

and let f be the density of the mixture of a Be(1, 1) (uniform) and a Be(α, β),

$$f(x\mid\theta)=\theta\,g(x\mid 1,1)+(1-\theta)\,g(x\mid\alpha(\theta),\beta(\theta)),$$

where α(θ) and β(θ) are chosen to be twice continuously differentiable and to give the density a very sharp peak close to θ, say mean θ and variance tending to 0 sufficiently fast as θ → 1. Thus we take

$$\Theta=[\tfrac12,\,1],\qquad\alpha(\theta)=\theta\,\delta(\theta),\qquad\beta(\theta)=(1-\theta)\,\delta(\theta).$$

With this choice the Be(α(θ), β(θ)) component has mean α(θ)/(α(θ)+β(θ)) = θ and variance θ(1−θ)/(δ(θ)+1), so it is condition 4 below that drives the variance to 0. The particular form of δ(θ) is not important. What is important is that

1. δ(θ) is twice continuously differentiable,
2. (1 − θ)δ(θ), and hence δ(θ), is increasing on [½, 1),
3. δ(½) > 2 (to obtain identifiability), and
4. δ(θ) tends to ∞ sufficiently fast as θ → 1.

For θ = 1, f(x|1) is defined to be g(x|1, 1). Then f(x|θ) is continuous in θ ∈ [½, 1] for each x, and for the true θ₀ ∈ (½, 1), Cramér's conditions are satisfied.
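To see the Section 2 mechanism at work in this modified family, the sketch below (our illustration, not the paper's) evaluates the same two-part lower bound on the log likelihood at θ = M_n: the n − 1 smaller observations are charged to the uniform component, contributing at least log M_n each, and the maximum is charged to the Be(α(M_n), β(M_n)) spike. The choice δ(θ) = exp((1 − θ)⁻²) is hypothetical; it satisfies conditions 1 through 3, and condition 4 is simply assumed. The spike height uses the normal approximation g(θ | α(θ), β(θ)) ≈ (2πθ(1 − θ)/δ(θ))^{−1/2}, so that only log δ(θ) enters and θ near 1 causes no overflow.

```python
import numpy as np
from scipy.stats import beta as beta_dist

rng = np.random.default_rng(1)

def log_delta(theta):
    # Hypothetical delta(theta) = exp((1 - theta)**-2): twice continuously
    # differentiable, (1 - theta)*delta(theta) increasing on [1/2, 1),
    # delta(1/2) = e**4 > 2 (conditions 1-3); condition 4 is assumed.
    return (1.0 - theta) ** -2

def spike_log_height(theta):
    # Normal approximation to the Be(alpha(theta), beta(theta)) density at
    # its mean theta; only log(delta) appears, so theta near 1 is safe.
    return 0.5 * (log_delta(theta) - np.log(2 * np.pi * theta * (1 - theta)))

def bound_at_max(x):
    # Lower bound for l_n(M_n): the n-1 smaller points use the uniform
    # component (log M_n each), the maximum uses the beta spike.
    m, n = x.max(), len(x)
    return (n - 1) * np.log(m) + np.log(1 - m) + spike_log_height(m)

def log_lik(theta, x):
    # Exact mixture log likelihood; usable only while delta(theta) is
    # representable in floating point (true for theta0 = 0.75 below).
    d = np.exp(log_delta(theta))
    return np.logaddexp(np.log(theta), np.log1p(-theta)
                        + beta_dist.logpdf(x, theta * d, (1 - theta) * d)).sum()

theta0 = 0.75
d0 = np.exp(log_delta(theta0))  # about 8.9e6, still representable
for n in [10**2, 10**4, 10**6]:
    from_uniform = rng.uniform(size=n) < theta0
    x = np.where(from_uniform, rng.uniform(size=n),
                 rng.beta(theta0 * d0, (1 - theta0) * d0, size=n))
    print(f"n = {n:>7}   l_n(theta0) = {log_lik(theta0, x):12.1f}"
          f"   l_n(M_n) >= {bound_at_max(x):14.1f}")
```

Already at moderate n the bound at θ = M_n dominates the log likelihood at the true θ₀, and the gap widens with n; this is the content of the convergence result stated next.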
The proof that every maximum likelihood sequence converges to 1 with probability one as n → ∞, no matter what the true value of θ ∈ [½, 1], is completely analogous to the corresponding proof in Section 2, except that in