1. ESTIMATES AS FUNCTIONALS OF $F_n$ OR $P_n$

Example 1.13 (Z-functional derived from likelihood). A maximum likelihood estimator: for $P$ on $(\mathcal{X}, \mathcal{A})$, suppose that $\mathcal{P} = \{P_\theta : \theta \in \Theta \subset \mathbb{R}^k\}$ is a regular parametric model with vector score function $\dot{l}_\theta(\cdot\,; \theta)$. Then for general $P$, not necessarily in the model $\mathcal{P}$, consider $T$ defined by
$$
(1) \qquad \int \dot{l}_\theta(x; T(P)) \, dP(x) = 0 .
$$
Then $\int \dot{l}_\theta(x; T(P_n)) \, dP_n(x) = 0$ defines $T(P_n)$. For estimation of location in one dimension with $\dot{l}(x; \theta) = \psi(x - \theta)$ and $\psi \equiv -f'/f$, these become
$$
\int \psi(x - T(F)) \, dF(x) = 0 \qquad \text{and} \qquad \int \psi(x - T(F_n)) \, dF_n(x) = 0 .
$$
We expect that often the value $T(P) \in \Theta$ satisfying (1) also satisfies $T(P) = \operatorname{argmin}_{\theta \in \Theta} K(P, P_\theta)$.
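The empirical equation $\int \psi(x - T(F_n)) \, dF_n(x) = 0$ of Example 1.13 is simply a one-dimensional root-finding problem: average $\psi(x_i - \theta)$ over the sample and solve for $\theta$. A minimal sketch, using a Huber-type $\psi$ (a hypothetical choice for illustration; the text's $\psi = -f'/f$ requires a specific density $f$) and plain bisection, which works because the empirical score is decreasing in $\theta$ when $\psi$ is monotone:

```python
# Sketch: Z-estimator of location solving sum(psi(x_i - theta)) = 0,
# the empirical version of equation (1).  psi is Huber's function here,
# a hypothetical choice for illustration.

def psi_huber(u, k=1.345):
    """Huber's psi: the identity near 0, clipped at +/- k."""
    return max(-k, min(k, u))

def z_estimate(xs, psi, tol=1e-10):
    """Solve sum(psi(x - theta)) = 0 for theta by bisection.
    For monotone psi the sum is decreasing in theta, and a root is
    bracketed by the sample minimum and maximum."""
    lo, hi = min(xs), max(xs)

    def g(theta):
        return sum(psi(x - theta) for x in xs)

    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) > 0:      # g decreasing in theta: root lies to the right
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

xs = [0.1, -0.4, 0.3, 0.2, 12.0, -0.1]   # one gross outlier at 12.0
theta_hat = z_estimate(xs, psi_huber)
print(round(theta_hat, 3))               # -> 0.289
```

The clipped $\psi$ bounds the outlier's contribution at $\pm k$, so the solution stays near the bulk of the data, unlike the sample mean ($\approx 2.02$ here), which is the Z-estimate for the unclipped score $\psi(u) = u$.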
Here is a heuristic argument showing why this should be true. Note that in many cases we have
$$
\hat{\theta}_n = \operatorname{argmax}_\theta \, n^{-1} l_n(\theta) = \operatorname{argmax}_\theta \, P_n(\log p_\theta) \to_p \operatorname{argmax}_\theta \, P(\log p_\theta) = \operatorname{argmax}_\theta \int \log p_\theta(x) \, dP(x) .
$$
Now
$$
P(\log p_\theta) = P(\log p) + P \log\left( \frac{p_\theta}{p} \right) = P(\log p) - P \log\left( \frac{p}{p_\theta} \right) = P(\log p) - K(P, P_\theta) .
$$
Thus
$$
\operatorname{argmax}_\theta \int \log p_\theta(x) \, dP(x) = \operatorname{argmin}_\theta \, K(P, P_\theta) \equiv \theta(P) .
$$
If we can interchange differentiation and integration, it follows that
$$
\nabla_\theta K(P, P_\theta) = - \int \dot{l}_\theta(x; \theta) \, p(x) \, d\mu(x) = - \int \dot{l}_\theta(x; \theta) \, dP(x) ,
$$
so the relation (1) is obtained by setting this gradient vector equal to 0.

Example 1.14 (A bootstrap functional). Let $T(F)$ be a functional with estimator $T(F_n)$, and consider estimating the distribution function of $\sqrt{n}(T(F_n) - T(F))$,
$$
H_n(F; \cdot) = P_F\left( \sqrt{n} \, (T(F_n) - T(F)) \le \cdot \right) .
$$
A natural estimator is $H_n(F_n, \cdot)$.
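The plug-in estimator $H_n(F_n, \cdot)$ of Example 1.14 is usually approximated by Monte Carlo: draw resamples $F_n^*$ of size $n$ with replacement from the data and tabulate $\sqrt{n}\,(T(F_n^*) - T(F_n))$. A small sketch, taking $T$ to be the mean functional and made-up data (both hypothetical choices for illustration):

```python
# Sketch of the plug-in bootstrap estimator H_n(F_n, .): the unknown F in
# H_n(F, .) = P_F( sqrt(n) (T(F_n) - T(F)) <= . ) is replaced by F_n, and
# the resulting distribution is approximated by resampling from F_n.
import random

def T(sample):
    """T(F_n): the plug-in mean functional (hypothetical choice)."""
    return sum(sample) / len(sample)

def bootstrap_H(xs, t, B=2000, seed=0):
    """Estimate H_n(F_n; t) = P*( sqrt(n) (T(F_n*) - T(F_n)) <= t )
    from B bootstrap resamples of size n drawn from xs."""
    rng = random.Random(seed)
    n = len(xs)
    t_hat = T(xs)
    count = 0
    for _ in range(B):
        star = [rng.choice(xs) for _ in range(n)]
        if (n ** 0.5) * (T(star) - t_hat) <= t:
            count += 1
    return count / B

xs = [1.2, 0.7, 2.3, 1.9, 0.4, 1.1, 1.6, 0.9]   # made-up data
print(bootstrap_H(xs, 0.0))
```

Since the bootstrap distribution is centered near 0, the value at $t = 0$ comes out near $1/2$; evaluating `bootstrap_H` on a grid of $t$ values traces out the whole estimated distribution function.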