Ch. 10 Autocorrelated Disturbances

In a time-series setting, a common problem is autocorrelation, or serial correlation of the disturbance across periods. See the plot of the residuals in Figure 12.1 on p. 251.

1 Stochastic Process

A particularly important aspect of real observable phenomena, which the concept of a random variable cannot accommodate, is their time dimension; the concept of a random variable is essentially static. A number of economic phenomena for which we need to formulate probability models come in the form of dynamic processes for which we have a discrete sequence of observations in time. The problem we face is to extend the simple probability model

Φ = {f(x; θ), θ ∈ Θ}

to one which enables us to model dynamic phenomena. We have already moved in this direction by proposing the random vector probability model

Φ = {f(x_1, x_2, ..., x_T; θ), θ ∈ Θ}.

So far we have viewed this model as representing different characteristics of the phenomenon in question in the form of the jointly distributed r.v.'s X_1, X_2, ..., X_T. If we reinterpret it as representing the same characteristic at successive points in time, then it can be viewed as a dynamic probability model. With this as a starting point, let us consider the dynamic probability model in the context of (S, F, P).

1.1 The Concept of a Stochastic Process

The natural way to make the concept of a random variable dynamic is to extend its domain by attaching a date to the elements of the sample space S.

Definition 1: Let (S, F, P) be a probability space and T an index set of real numbers, and define the function X(·, ·) by X(·, ·) : S × T → R. The ordered sequence of random variables {X(·, t), t ∈ T} is called a stochastic process.
This definition suggests that for a stochastic process {X(·, t), t ∈ T}: for each t ∈ T, X(·, t) represents a random variable on S; for each s in S, X(s, ·) represents a function of t, which we call a realization of the process; and X(s, t) for given s and t is just a real number.

The three main elements of a stochastic process {X(·, t), t ∈ T} are:

1. its range space (sometimes called the state space), usually R;
2. its index set T, usually one of R, R_+ = [0, ∞), or a set of integers such as {0, ±1, ±2, ...}; and
3. the dependence structure of the r.v.'s {X(·, t), t ∈ T}.

In what follows a stochastic process will be denoted by {X_t, t ∈ T} (the argument s is dropped; the notation X(t) is customary for continuous-time processes), and we are concerned exclusively with discrete-time stochastic processes.

The dependence structure of {X_t, t ∈ T}, in direct analogy with the case of a random vector, should be determined by the joint distribution of the process. The question arises, however: since T is commonly an infinite set, do we need an infinite-dimensional distribution to define the structure of the process? This question was tackled by Kolmogorov (1933), who showed that when the stochastic process satisfies certain regularity conditions the answer is definitely 'no'. In particular, define the 'tentative' joint distribution of the process for the subset (t_1 < t_2 < ... < t_T) of T by

F(X_{t_1}, X_{t_2}, ..., X_{t_T}) = Pr(X_{t_1} ≤ x_1, X_{t_2} ≤ x_2, ..., X_{t_T} ≤ x_T).

If the stochastic process {X_t, t ∈ T} satisfies the conditions:

1. symmetry: F(X_{t_1}, X_{t_2}, ..., X_{t_T}) = F(X_{t_{j_1}}, X_{t_{j_2}}, ..., X_{t_{j_T}}), where j_1, j_2, ..., j_T is any permutation of the indices 1, 2, ..., T (i.e. reshuffling the ordering of the index does not change the distribution);
2. compatibility: lim_{x_T → ∞} F(X_{t_1}, X_{t_2}, ..., X_{t_T}) = F(X_{t_1}, X_{t_2}, ..., X_{t_{T−1}}) (i.e. the dimensionality of the joint distribution can be reduced by marginalization);

then there exists a probability space (S, F, P) and a stochastic process {X_t, t ∈ T} defined on it whose finite-dimensional distributions are the distributions F(X_{t_1}, X_{t_2}, ..., X_{t_T}) defined above. That is, the probability structure of the stochastic process {X_t, t ∈ T} is completely specified by the joint distribution F(X_{t_1}, X_{t_2}, ..., X_{t_T}) for all values of T (a positive integer) and any subset (t_1, t_2, ..., t_T) of T.
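As a concrete illustration of the dual reading of X(s, t) noted at the start of this subsection, the following minimal Python/NumPy sketch simulates a few realizations of a discrete-time process. The random-walk generating mechanism, the seed and all numerical values are illustrative assumptions, not taken from the text: fixing t and varying s gives draws of a random variable, while fixing s and varying t traces out a realization.

    import numpy as np

    rng = np.random.default_rng(0)

    T = 100      # dates t = 1, ..., T
    n_s = 5      # a few sample-space points s, i.e. a few realizations

    # A purely illustrative process: cumulated Gaussian shocks (a random walk).
    # Row s of X is the realization X(s, .), a function of t for fixed s;
    # column t of X collects draws of the random variable X(., t) across s.
    X = rng.normal(size=(n_s, T)).cumsum(axis=1)

    print("draws of the random variable X(., t=10):", np.round(X[:, 9], 2))
    print("start of the realization X(s_1, .):", np.round(X[0, :5], 2))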
Given that, for a specific t, X_t is a random variable, we can denote its distribution and density functions by F(X_t) and f(X_t) respectively. Moreover, the mean, variance and higher moments of X_t (as a r.v.) can be defined in the standard form:

E(X_t) = ∫ x_t f(x_t) dx_t = µ_t,
E(X_t − µ_t)² = ∫ (x_t − µ_t)² f(x_t) dx_t = v²(t),
E(X_t^r) = µ_r(t),  r ≥ 1, t ∈ T.

The linear dependence measure between X_{t_i} and X_{t_j},

v(t_i, t_j) = E[(X_{t_i} − µ_{t_i})(X_{t_j} − µ_{t_j})],  t_i, t_j ∈ T,

is now called the autocovariance function. Its standardized form,

r(t_i, t_j) = v(t_i, t_j) / [v(t_i) v(t_j)],  t_i, t_j ∈ T,

is called the autocorrelation function. These numerical characteristics of the stochastic process {X_t, t ∈ T} play an important role in the analysis of the process and its application to modeling real observable phenomena. We say that {X_t, t ∈ T} is an uncorrelated process if r(t_i, t_j) = 0 for any t_i, t_j ∈ T, t_i ≠ t_j.

Example: One of the most important examples of a stochastic process is the normal process. The stochastic process {X_t, t ∈ T} is said to be normal (or Gaussian) if, for any finite subset of T, say t_1, t_2, ..., t_T, the vector (X_{t_1}, X_{t_2}, ..., X_{t_T}) ≡ X_T′ has a multivariate normal distribution, i.e.

f(X_{t_1}, X_{t_2}, ..., X_{t_T}) = (2π)^{−T/2} |V_T|^{−1/2} exp[−(1/2)(X_T − µ_T)′ V_T^{−1} (X_T − µ_T)],

where

µ_T = E(X_T) = (µ_1, µ_2, ..., µ_T)′,

V_T = [ v²(t_1)      v(t_1, t_2)  ...  v(t_1, t_T) ]
      [ v(t_2, t_1)  v²(t_2)      ...  v(t_2, t_T) ]
      [ ...          ...          ...  ...         ]
      [ v(t_T, t_1)  v(t_T, t_2)  ...  v²(t_T)     ].
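The Gaussian case lends itself to a direct numerical illustration: once µ_T and V_T are specified, a realization on the dates t_1, ..., t_T is a single draw from a T-dimensional normal distribution. In the Python/NumPy sketch below, the linear mean function and the exponentially decaying autocovariance are assumptions chosen purely for illustration.

    import numpy as np

    rng = np.random.default_rng(1)

    T = 50
    t = np.arange(1, T + 1)

    # Assumed (illustrative) mean and autocovariance functions:
    # mu_t = 0.1 * t  and  v(t_i, t_j) = exp(-|t_i - t_j| / 5).
    mu_T = 0.1 * t
    V_T = np.exp(-np.abs(t[:, None] - t[None, :]) / 5.0)

    # One draw from N(mu_T, V_T) is one realization (X_{t_1}, ..., X_{t_T})
    # of the Gaussian process on these T dates.
    x = rng.multivariate_normal(mean=mu_T, cov=V_T)
    print(np.round(x[:5], 2))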
As in the case of a normal random variable, the distribution of a normal stochastic process is characterized by its first two moments, but now they are functions of t.

One problem so far is that the definition of a stochastic process given above is much too general to enable us to obtain an operational probability model. In the analysis of a stochastic process we only have a single realization of the process, and we would have to deduce the values of µ_t and v(t) for every t from that single realization, which is impossible. The main purpose of the next three sections is to consider various special forms of stochastic processes for which we can construct probability models that are manageable in the context of statistical inference. Such manageability is achieved by imposing certain restrictions which reduce the number of unknown parameters involved, so that their values can be deduced from a single realization. These restrictions come in two forms:

1. restrictions on the time-heterogeneity of the process; and
2. restrictions on the memory of the process.

1.2 Restricting the time-heterogeneity of a stochastic process

For an arbitrary stochastic process {X_t, t ∈ T}, the distribution function F(X_t; θ_t) depends on t, with the parameter θ_t characterizing it being a function of t as well. That is, a stochastic process is time-heterogeneous in general. This, however, raises very difficult issues in modeling real phenomena, because usually we only have one observation for each t. Hence in practice we would have to estimate θ_t on the basis of a single observation, which is impossible. For this reason we are going to consider an important class of processes which exhibit considerable time-homogeneity and can be used to model phenomena that have reached an equilibrium steady-state but are continuously subject to 'random' fluctuations. This is the class of stationary stochastic processes.

Definition: A stochastic process {X_t, t ∈ T} is said to be (strictly) stationary if for any subset (t_1, t_2, ..., t_T) of T and any τ,
F(X_{t_1}, ..., X_{t_T}) = F(X_{t_1 + τ}, ..., X_{t_T + τ}).

That is, the distribution of the process remains unchanged when shifted in time by an arbitrary value τ. In terms of the marginal distributions, (strict) stationarity implies that

F(X_t) = F(X_{t+τ}),  t ∈ T,

and hence F(X_{t_1}) = F(X_{t_2}) = ... = F(X_{t_T}). That is, stationarity implies that X_{t_1}, X_{t_2}, ..., X_{t_T} are (individually) identically distributed.

The concept of strict stationarity, although very useful in the context of probability theory, is very difficult to verify in practice because it is defined in terms of the distribution function. For this reason the concept of second-order stationarity, defined in terms of the first two moments, is commonly preferred.

Definition: A stochastic process {X_t, t ∈ T} is said to be (weakly) stationary if

E(X_t) = µ for all t;
v(t_i, t_j) = E[(X_{t_i} − µ)(X_{t_j} − µ)] = γ_{|t_j − t_i|},  t_i, t_j ∈ T.

These conditions say that weak stationarity of {X_t, t ∈ T} implies that its mean µ and variance v²(t_i) = γ_0 are constant and free of t, and that its autocovariance depends only on the interval |t_j − t_i|, not on t_i and t_j themselves.

Example: Consider the normal stochastic process in the example above. Under the weak stationarity assumption,

µ_T = E(X_T) = (µ, µ, ..., µ)′,

V_T = [ γ_0      γ_1      ...  γ_{T−1} ]
      [ γ_1      γ_0      ...  γ_{T−2} ]
      [ ...      ...      ...  ...     ]
      [ γ_{T−1}  γ_{T−2}  ...  γ_0     ],

a sizeable reduction in the number of unknown parameters, from T + [T(T+1)/2] to (T + 1).
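A short numerical sketch makes the parameter count explicit: under weak stationarity the whole T × T covariance matrix is generated by just the T numbers γ_0, γ_1, ..., γ_{T−1} (plus the common mean µ). The specific autocovariance values used below are an arbitrary illustrative choice.

    import numpy as np

    T = 6
    gamma = 0.9 ** np.arange(T)     # illustrative values for gamma_0, ..., gamma_{T-1}

    # V_T depends only on |t_j - t_i|, so it is a Toeplitz matrix built from gamma.
    i, j = np.indices((T, T))
    V_T = gamma[np.abs(i - j)]
    print(np.round(V_T, 3))

    # Unrestricted: T means and T(T+1)/2 distinct (co)variances; weakly stationary: 1 + T.
    print("unrestricted:", T + T * (T + 1) // 2, "parameters;  stationary:", 1 + T)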
It is important, however, to note that even in the case of stationarity the number of parameters increases with the size of the subset (t_1, ..., t_T), even though the parameters do not depend on t ∈ T. This is because time-homogeneity does not restrict the 'memory' of the process. In the next section we consider 'memory' restrictions in an obvious attempt to 'solve' the problem of the parameters increasing with the size of the subset (t_1, t_2, ..., t_T) of T.

1.3 Restricting the memory of a stochastic process

In the case of a typical economic time series, viewed as a particular realization of a stochastic process {X_t, t ∈ T}, one would expect that the dependence between X_{t_i} and X_{t_j} would tend to weaken as the distance (t_j − t_i) increases. Formally, this dependence can be described in terms of the joint distribution F(X_{t_1}, X_{t_2}, ..., X_{t_T}) as follows:

Definition: asymptotically independent.
Definition: asymptotically uncorrelated.
Definition: strongly mixing.
Definition: uniformly mixing.
Definition: ergodic.

1.4 Some special stochastic processes

We will briefly consider several special stochastic processes which play an important role in econometric modeling. These stochastic processes will be divided into parametric and non-parametric processes. The non-parametric processes are defined in terms of their joint distribution function or the first few joint moments.
On the other hand, parametric processes are defined in terms of a generating mechanism, which is commonly a functional form based on a non-parametric process.

1.4.1 Non-parametric processes

Definition: A stochastic process {X_t, t ∈ T} is said to be a white-noise process if

(i) E(X_t) = 0;
(ii) E(X_t X_τ) = σ² if t = τ, and 0 if t ≠ τ.

Hence, a white-noise process is both time-homogeneous, in view of the fact that it is a weakly stationary process, and has no memory. In the case where {X_t, t ∈ T} is also assumed to be normal, the process is also strictly stationary.

Definition: A stochastic process {X_t, t ∈ T} is said to be a martingale process if...

Definition: A stochastic process {X_t, t ∈ T} is said to be an innovation process if...

Definition: A stochastic process {X_t, t ∈ T} is said to be a Markov process if...

Definition: A stochastic process {X_t, t ∈ T} is said to be a Brownian motion process if...

1.4.2 Parametric stochastic processes

Definition: A stochastic process {X_t, t ∈ T} is said to be autoregressive of order one (AR(1)) if it satisfies the stochastic difference equation

X_t = φX_{t−1} + u_t,

where φ is a constant and u_t is a white-noise process.
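The AR(1) definition translates directly into a simulation. The sketch below (Python/NumPy; φ = 0.7, σ_u = 1 and the starting value X_0 = 0 are illustrative assumptions) generates Gaussian white noise u_t and feeds it through the recursion X_t = φX_{t−1} + u_t; the driving noise shows essentially no serial correlation, while the generated process does.

    import numpy as np

    rng = np.random.default_rng(2)

    T, phi, sigma_u = 500, 0.7, 1.0
    u = rng.normal(scale=sigma_u, size=T)   # white noise: mean 0, variance sigma_u**2

    x = np.zeros(T)                         # start from X_0 = 0 for simplicity
    for t in range(1, T):
        x[t] = phi * x[t - 1] + u[t]

    def lag1_corr(z):
        return np.corrcoef(z[:-1], z[1:])[0, 1]

    print("lag-1 sample autocorrelation of u:", round(lag1_corr(u), 3))   # close to 0
    print("lag-1 sample autocorrelation of X:", round(lag1_corr(x), 3))   # close to phi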
We first consider the index set T* = {0, ±1, ±2, ...} and assume that X_{−T} → 0 as T → ∞. Define the lag operator L by LX_t ≡ X_{t−1}; then the AR(1) process can be rewritten as

(1 − φL)X_t = u_t,

or, when |φ| < 1,

X_t = (1 − φL)^{−1}u_t = (1 + φL + φ²L² + ...)u_t = u_t + φu_{t−1} + φ²u_{t−2} + ... = Σ_{i=0}^{∞} φ^i u_{t−i},

from which we can deduce that

E(X_t) = 0,

E(X_t X_{t+τ}) = E[(Σ_{i=0}^{∞} φ^i u_{t−i})(Σ_{j=0}^{∞} φ^j u_{t+τ−j})] = σ_u² Σ_{i=0}^{∞} φ^i φ^{i+τ} = σ_u² φ^τ Σ_{i=0}^{∞} φ^{2i},  τ ≥ 0.

Hence, for |φ| < 1, the stochastic process {X_t, t ∈ T*} is both weakly stationary and asymptotically uncorrelated, since the autocovariance function

v(τ) = [σ_u² / (1 − φ²)] φ^τ → 0  as τ → ∞.

Therefore, for any finite subset of T*, say t_1, t_2, ..., t_T, of an AR(1) process, (X_{t_1}, X_{t_2}, ..., X_{t_T}) ≡ X_T′ has covariance matrix

E(X_T X_T′) = [σ_u² / (1 − φ²)] [ 1        φ        ...  φ^{T−1} ]
                                [ φ        1        ...  φ^{T−2} ]
                                [ ...      ...      ...  ...     ]
                                [ φ^{T−1}  φ^{T−2}  ...  1       ]
            = σ_u² Ω,

where Ω = (1/(1 − φ²)) [φ^{|i−j|}]_{i,j=1}^{T}, i.e. the (i, j) element of Ω is φ^{|i−j|}/(1 − φ²).
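These moment results can be checked by simulation. The sketch below (illustrative parameter values; a long sample so that the sample moments settle down) compares sample autocovariances of a simulated AR(1) path with the theoretical values σ_u²φ^τ/(1 − φ²), and then builds the implied matrix Ω for a stretch of T consecutive observations.

    import numpy as np

    rng = np.random.default_rng(3)

    phi, sigma_u, n = 0.7, 1.0, 200_000
    u = rng.normal(scale=sigma_u, size=n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + u[t]

    # Sample autocovariance at lag tau versus sigma_u^2 * phi^tau / (1 - phi^2).
    for tau in range(4):
        sample = np.mean(x[: n - tau] * x[tau:])
        theory = sigma_u ** 2 * phi ** tau / (1 - phi ** 2)
        print(f"tau={tau}:  sample {sample:.3f}   theory {theory:.3f}")

    # The implied matrix Omega for T consecutive observations: (i, j) element phi**|i-j| / (1 - phi**2).
    T = 5
    Omega = phi ** np.abs(np.subtract.outer(np.arange(T), np.arange(T))) / (1 - phi ** 2)
    print(np.round(Omega, 3))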
It is straightforward to show by direct multiplication that P′P = Ω^{−1} for

P = [ √(1 − φ²)   0    0    ...   0 ]
    [ −φ          1    0    ...   0 ]
    [ 0           −φ   1    ...   0 ]
    [ ...         ...  ...  ...  ... ]
    [ 0           0    ...  −φ    1 ].

Definition: AR(p) process.

Definition: MA(1) process.

Definition: MA(q) process.

Definition: ARMA(p, q) process.

Definition: ARIMA(p, d, q) process.

Definition: ARFIMA(p, d, q) process.

2 Testing for Autocorrelation

Most of the available tests for autocorrelation are based on the principle that if the true disturbances ε_t are autocorrelated, this fact will be revealed through the autocorrelation of the OLS residuals e_t, where

y_t = x_t′β + ε_t = x_t′β̂ + e_t.
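This principle is easy to see in a small simulated regression. In the sketch below (the design matrix, coefficient vector and φ = 0.8 are all illustrative assumptions), the true disturbances follow an AR(1); the OLS residuals e = y − Xβ̂ then show a lag-1 sample autocorrelation close to φ, which is exactly what the tests of this section exploit.

    import numpy as np

    rng = np.random.default_rng(4)

    T, phi, beta = 200, 0.8, np.array([1.0, 0.5])

    # Illustrative regressors: a constant and a smoothly wandering variable.
    X = np.column_stack([np.ones(T), rng.normal(size=T).cumsum()])

    # AR(1) disturbances: eps_t = phi * eps_{t-1} + u_t with Gaussian white noise u_t.
    eps = np.zeros(T)
    for t in range(1, T):
        eps[t] = phi * eps[t - 1] + rng.normal()

    y = X @ beta + eps

    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ beta_hat                      # OLS residuals

    print("lag-1 autocorrelation of the residuals:",
          round(np.corrcoef(e[:-1], e[1:])[0, 1], 3))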
2.1 The Durbin-Watson Test

The most extensively used test for AR(1) disturbances is the Durbin-Watson test, developed by Durbin and Watson (1950, 1951).

Lemma: Let z and v be T × 1 random vectors such that z = Mv, where M = I − X(X′X)^{−1}X′ and X is a T × k nonstochastic matrix of rank k. Furthermore, let r = z′Az/z′z, where A is a real symmetric matrix. Then:

(1). There exists an orthogonal transformation v = Hδ such that

r = (Σ_{i=1}^{T−k} u_i δ_i²) / (Σ_{i=1}^{T−k} δ_i²),

where u_1, u_2, ..., u_{T−k} are the T − k nonzero (ordered) eigenvalues of MA, the rest being zero, and δ_i ∼ N(0, 1). (Note that the u_i are functions of X, so the distribution of r depends on X and is in general unknown.)

(2). If s of the columns of X are linear combinations of s of the eigenvectors of A, and if the eigenvalues of A associated with the remaining T − s eigenvectors of A are renumbered so that λ_1 ≤ λ_2 ≤ ... ≤ λ_{T−s}, then

λ_i ≤ u_i ≤ λ_{i+k−s}  (i = 1, 2, ..., T − k).

From the above lemma the following corollary can be deduced.

Corollary: r_L ≤ r ≤ r_U, where

r_L = (Σ_{i=1}^{T−k} λ_i δ_i²) / (Σ_{i=1}^{T−k} δ_i²)  and  r_U = (Σ_{i=1}^{T−k} λ_{i+k−s} δ_i²) / (Σ_{i=1}^{T−k} δ_i²).
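To connect the lemma to practice, the sketch below computes the Durbin-Watson statistic in its conventional form d = Σ_{t=2}^{T}(e_t − e_{t−1})² / Σ_{t=1}^{T} e_t², which can be written as e′Ae/e′e for the usual tridiagonal differencing matrix A, so the lemma applies with z = e = Mε; the distribution of d is then governed by the nonzero eigenvalues of MA. The regression itself (design matrix, φ, sample size) is an illustrative assumption.

    import numpy as np

    rng = np.random.default_rng(5)

    # Small illustrative regression with AR(1) disturbances.
    T, k, phi = 60, 2, 0.8
    X = np.column_stack([np.ones(T), rng.normal(size=T)])
    eps = np.zeros(T)
    for t in range(1, T):
        eps[t] = phi * eps[t - 1] + rng.normal()
    y = X @ np.array([1.0, 0.5]) + eps
    e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]     # OLS residuals, e = M eps

    # Durbin-Watson statistic and its quadratic-form representation e'Ae / e'e.
    d = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
    A = 2 * np.eye(T) - np.eye(T, k=1) - np.eye(T, k=-1)
    A[0, 0] = A[-1, -1] = 1
    assert np.isclose(d, (e @ A @ e) / (e @ e))
    print("Durbin-Watson d:", round(d, 3))   # near 2 under no autocorrelation; well below 2 here

    # The nonzero eigenvalues of MA (equal to those of the symmetric matrix MAM)
    # are what determine the distribution of d for this particular X.
    M = np.eye(T) - X @ np.linalg.inv(X.T @ X) @ X.T
    u_vals = np.sort(np.linalg.eigvalsh(M @ A @ M))[k:]
    print("nonzero eigenvalues of MA range from", round(u_vals[0], 3), "to", round(u_vals[-1], 3))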