Ch. 23 Cointegration

1 Introduction

An important property of $I(1)$ variables is that there can be linear combinations of these variables that are $I(0)$. If this is so, then these variables are said to be cointegrated. Suppose that we consider two variables $Y_t$ and $X_t$ that are $I(1)$. (For example, $Y_t = Y_{t-1} + \zeta_t$ and $X_t = X_{t-1} + \eta_t$.) Then $Y_t$ and $X_t$ are said to be cointegrated if there exists a $\beta$ such that $Y_t - \beta X_t$ is $I(0)$. What this means is that the regression equation $Y_t = \beta X_t + u_t$ makes sense, because $Y_t$ and $X_t$ do not drift too far apart from each other over time: there is a long-run equilibrium relationship between them. If $Y_t$ and $X_t$ are not cointegrated, that is, if $Y_t - \beta X_t = u_t$ is also $I(1)$, then $Y_t$ and $X_t$ drift apart from each other over time, and the relationship between them that we obtain by regressing $Y_t$ on $X_t$ is spurious.

To contrast cointegration with the spurious-regression setup, in which $X_t$ and $Y_t$ are independent random walks, consider what happens if we take a nontrivial linear combination of $X_t$ and $Y_t$:
$$a_1 Y_t + a_2 X_t = a_1 Y_{t-1} + a_2 X_{t-1} + a_1 \zeta_t + a_2 \eta_t,$$
where $a_1$ and $a_2$ are not both zero. We can write this as $Z_t = Z_{t-1} + v_t$, where $Z_t = a_1 Y_t + a_2 X_t$ and $v_t = a_1 \zeta_t + a_2 \eta_t$. Thus $Z_t$ is again a random walk process, as $v_t$ is i.i.d. with mean zero and finite variance, given that $\zeta_t$ and $\eta_t$ are each i.i.d. with mean zero and finite variance. No matter what coefficients $a_1$ and $a_2$ we choose, the resulting linear combination is again a random walk, hence an integrated or unit-root process.

Now consider what happens when $X_t$ is a random walk as before, but $Y_t$ is instead generated according to $Y_t = \beta X_t + u_t$, with $u_t$ again i.i.d. By itself, $Y_t$ is an integrated process, because $Y_t - Y_{t-1} = (X_t - X_{t-1})\beta + u_t - u_{t-1}$, so that
$$Y_t = Y_{t-1} + \beta \eta_t + u_t - u_{t-1} = Y_{t-1} + \varepsilon_t,$$
where $\varepsilon_t = \beta \eta_t + u_t - u_{t-1}$ is readily verified to be an $I(0)$ process. Despite the fact that both $X_t$ and $Y_t$ are integrated processes, the situation is very different from that considered in the last chapter. Here there is indeed a linear combination of $X_t$ and $Y_t$ that is not an integrated process: putting $a_1 = 1$ and $a_2 = -\beta$, we have
$$a_1 Y_t + a_2 X_t = Y_t - \beta X_t = u_t,$$
which is i.i.d. This is an example of a pair $\{X_t, Y_t\}$ of cointegrated processes.
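The contrast between the two cases is easy to see in simulation. The following sketch is my own illustration (not part of the original notes); it assumes numpy and the `adfuller` unit-root test from statsmodels are available. For the independent random walks, $Y_t - \beta X_t$ remains $I(1)$, while for the cointegrated pair it is $I(0)$.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller  # ADF unit-root test

rng = np.random.default_rng(0)
T, beta = 5000, 2.0

# Case 1: independent random walks -- every linear combination stays I(1).
X = np.cumsum(rng.standard_normal(T))
Y_indep = np.cumsum(rng.standard_normal(T))

# Case 2: cointegrated pair -- Y_t = beta*X_t + u_t, so Y_t - beta*X_t = u_t.
Y_coint = beta * X + rng.standard_normal(T)

for name, series in [("independent:  Y - beta*X", Y_indep - beta * X),
                     ("cointegrated: Y - beta*X", Y_coint - beta * X)]:
    pval = adfuller(series)[1]  # second element of the result is the p-value
    print(f"{name}: ADF p-value = {pval:.3f}")
# The first p-value is large (unit root not rejected); the second is ~0.
```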
The concept of cointegration was introduced by Granger (1981). This paper and that of Engle and Granger (1987) have had a major impact on modern econometrics. Following Engle and Granger (1987), we state the definition of cointegration formally as follows.

Definition 1:
The components of the vector $x_t$ are said to be co-integrated of order $d, b$, denoted $x_t \sim CI(d, b)$, if
(a). all components of $x_t$ are $I(d)$;
(b). there exists a vector $a\ (\neq 0)$ such that $z_t = a'x_t \sim I(d - b)$, $b > 0$.
The vector $a$ is called the co-integrating vector.

For ease of exposition, only the values $d = 1$ and $b = 1$ will be considered in this chapter. The case in which $d$ and $b$ take fractional values is called fractional cointegration; we will consider it in Chapter 25.
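In applied work, Definition 1 with $d = b = 1$ is typically checked with a residual-based test in the spirit of Engle and Granger (1987). As a hedged illustration (my own sketch, not from the notes), statsmodels provides such a test as `coint`, which regresses one series on the other and tests the residuals for a unit root:

```python
import numpy as np
from statsmodels.tsa.stattools import coint  # Engle-Granger style test

rng = np.random.default_rng(1)
T, beta = 5000, 2.0
X = np.cumsum(rng.standard_normal(T))   # I(1) regressor
Y = beta * X + rng.standard_normal(T)   # cointegrated with X by construction

t_stat, p_value, crit = coint(Y, X)     # H0: no cointegration
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A small p-value rejects "no cointegration", consistent with CI(1, 1).
```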
Clearly, the cointegrating vector $a$ is not unique, for if $a'x_t$ is $I(0)$, then so is $ba'x_t$ for any nonzero scalar $b$; if $a$ is a cointegrating vector, then so is $ba$. If $x_t$ has $k$ components, then there may be more than one cointegrating vector $a$. Indeed, there may be $h < k$ linearly independent $(k \times 1)$ vectors $(a_1, a_2, \ldots, a_h)$ such that $A'x_t$ is an $I(0)$ $(h \times 1)$ vector, where $A'$ is the following $(h \times k)$ matrix:
$$A' = \begin{bmatrix} a_1' \\ a_2' \\ \vdots \\ a_h' \end{bmatrix}.$$
Again, the vectors $(a_1, a_2, \ldots, a_h)$ are not unique; if $A'x_t$ is $I(0)$, then for any nonzero $(1 \times h)$ vector $b'$, the scalar $b'A'x_t$ is also $I(0)$. Then the $(k \times 1)$ vector $\phi$ given by $\phi' = b'A'$ could also be described as a cointegrating vector.

Suppose that there exists an $(h \times k)$ matrix $A'$ whose rows are linearly independent, such that $A'x_t$ is an $(h \times 1)$ $I(0)$ vector. Suppose further that if $c'$ is any $(1 \times k)$ vector that is linearly independent of the rows of $A'$, then $c'x_t$ is an $I(1)$ scalar. Then we say that there are exactly $h$ cointegrating relations among the elements of $x_t$, and that $(a_1, a_2, \ldots, a_h)$ form a basis for the space of the cointegrating vectors.
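A small numerical sketch of this point (again my own, assuming numpy and statsmodels): with $k = 3$ series driven by a single common stochastic trend, there are $h = 2$ linearly independent cointegrating vectors, and any combination $\phi' = b'A'$ is again a cointegrating vector.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(2)
T = 5000
w = np.cumsum(rng.standard_normal(T))               # common I(1) trend
# k = 3 series sharing the trend plus idiosyncratic I(0) noise:
x = np.column_stack([w + rng.standard_normal(T) for _ in range(3)])

A_T = np.array([[1.0, -1.0,  0.0],                  # a_1': annihilates w
                [1.0,  0.0, -1.0]])                 # a_2': annihilates w
b = np.array([0.3, -1.7])                           # arbitrary nonzero (1 x h)
phi = b @ A_T                                       # phi' = b'A'

for name, vec in [("a_1", A_T[0]), ("a_2", A_T[1]), ("b'A'", phi)]:
    print(name, "ADF p-value:", round(adfuller(x @ vec)[1], 4))  # all ~0
```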
Example:
Let $P_t$ denote an index of the price level in the United States, $P_t^*$ a price index for Italy, and $S_t$ the exchange rate between the two currencies. Purchasing power parity holds that $P_t = S_t P_t^*$, or, taking logarithms,
$$p_t = s_t + p_t^*,$$
where $p_t \equiv \log P_t$, $s_t \equiv \log S_t$, and $p_t^* \equiv \log P_t^*$. In equilibrium we need $p_t - s_t - p_t^* = 0$. In practice, however, errors in measuring prices, transportation costs, and differences in quality prevent purchasing power parity from holding exactly at every date $t$. A weaker form of the hypothesis is that the variable $z_t$ defined by $z_t = p_t - s_t - p_t^*$ is $I(0)$, even though the individual elements of $y_t = (p_t\ \ s_t\ \ p_t^*)'$ are all $I(1)$. In this case, we have a single cointegrating vector $a = (1\ \ {-1}\ \ {-1})'$. The term $z_t = a'y_t$ is interpreted as the equilibrium error; although it is not always zero, it cannot stray from zero too often or by too much if the equilibrium concept is to make sense.
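The sketch below (a hypothetical simulation of mine, not data from the notes) builds the three log series so that the weak form of PPP holds, and confirms that each level is $I(1)$ while the equilibrium error $z_t = a'y_t$ is $I(0)$:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(3)
T = 5000
s = np.cumsum(rng.standard_normal(T))        # log exchange rate, I(1)
p_star = np.cumsum(rng.standard_normal(T))   # Italian log price index, I(1)
z = rng.standard_normal(T)                   # stationary PPP deviation
p = s + p_star + z                           # weak-form PPP: p - s - p* = z

y = np.column_stack([p, s, p_star])
a = np.array([1.0, -1.0, -1.0])              # cointegrating vector

print("levels:", [round(adfuller(col)[1], 3) for col in y.T])  # large p-values
print("z_t = a'y_t:", round(adfuller(y @ a)[1], 3))            # ~0: I(0)
```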
2 Granger Representation Theorem

Let each element of the $(k \times 1)$ vector $y_t$ be $I(1)$, with a $(k \times h)$ cointegrating matrix $A$ such that each element of $A'y_t$ is $I(0)$. Granger (1983) established the following fundamental results for cointegrated $y_t$.

2.1 Implications of Cointegration for the VMA Representation

We now discuss the general implications of cointegration for the moving average representation of a vector system. Since $\Delta y_t$ is assumed to be $I(0)$, let $\delta \equiv E(\Delta y_t)$, and define
$$u_t = \Delta y_t - \delta. \quad (1)$$
Suppose that $u_t$ has the Wold representation
$$u_t = \varepsilon_t + \Psi_1 \varepsilon_{t-1} + \Psi_2 \varepsilon_{t-2} + \ldots = \Psi(L)\varepsilon_t,$$
where $E(\varepsilon_t) = 0$ and
$$E(\varepsilon_t \varepsilon_\tau') = \begin{cases} \Omega & \text{for } t = \tau, \\ 0 & \text{otherwise.} \end{cases}$$
Let $\Psi(1)$ denote the $(k \times k)$ matrix polynomial $\Psi(z)$ evaluated at $z = 1$; that is,
$$\Psi(1) \equiv I_k + \Psi_1 + \Psi_2 + \Psi_3 + \ldots$$
Then the following holds:
(a). $A'\Psi(1) = 0$;
(b). $A'\delta = 0$.
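Conditions (a) and (b) can be checked by hand in the bivariate example of Section 1, where $\Delta Y_t = \beta\eta_t + u_t - u_{t-1}$ and $\Delta X_t = \eta_t$. With $\varepsilon_t = (\eta_t\ \ u_t)'$ this has the MA form $\Delta y_t = \Theta_0\varepsilon_t + \Theta_1\varepsilon_{t-1}$; since $\Theta_0$ is invertible, the Wold long-run matrix is $\Psi(1) = \Theta(1)\Theta_0^{-1}$, so $a'\Psi(1) = 0'$ exactly when $a'\Theta(1) = 0'$. A minimal numeric check (my own sketch, assuming numpy):

```python
import numpy as np

beta = 2.0
# MA form of (Delta Y_t, Delta X_t)' in eps_t = (eta_t, u_t)':
#   Delta Y_t = beta*eta_t + u_t - u_{t-1},   Delta X_t = eta_t
Theta0 = np.array([[beta, 1.0],
                   [1.0,  0.0]])
Theta1 = np.array([[0.0, -1.0],
                   [0.0,  0.0]])
Theta_at_1 = Theta0 + Theta1      # long-run matrix Theta(1)

a = np.array([1.0, -beta])        # cointegrating vector for (Y_t, X_t)'
print(a @ Theta_at_1)             # [0. 0.]  -> condition (a)
print(a @ np.zeros(2))            # 0.0      -> condition (b), since delta = 0
```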
To verify this claim, note that as long as $\{s\Psi_s\}_{s=0}^{\infty}$ is absolutely summable, the difference equation (1) implies (from the multivariate Beveridge-Nelson decomposition) that
$$y_t = y_0 + \delta \cdot t + u_1 + u_2 + \ldots + u_t \quad (2)$$
$$= y_0 + \delta \cdot t + \Psi(1) \cdot (\varepsilon_1 + \varepsilon_2 + \ldots + \varepsilon_t) + \eta_t - \eta_0, \quad (3)$$
where $\eta_t$ is a stationary process. Premultiplying (3) by $A'$ results in
$$A'y_t = A'(y_0 - \eta_0) + A'\delta \cdot t + A'\Psi(1) \cdot (\varepsilon_1 + \varepsilon_2 + \ldots + \varepsilon_t) + A'\eta_t \sim I(0). \quad (4)$$
If $E(\varepsilon_t\varepsilon_t')$ is nonsingular, then $c'(\varepsilon_1 + \varepsilon_2 + \ldots + \varepsilon_t)$ is $I(1)$ for every nonzero $(k \times 1)$ vector $c$. Moreover, if some of the series exhibit nonzero drift ($\delta \neq 0$), the linear combination $A'y_t$ will grow deterministically at rate $A'\delta$. Thus, if the underlying hypothesis suggesting the possibility of cointegration is that certain linear combinations of $y_t$ are $I(0)$, this requires that both $A'\Psi(1) = 0$ and $A'\delta = 0$ hold.

The second condition means that despite the presence of a drift term in the process generating $y_t$, there is no linear trend in the cointegrated combination; see Banerjee et al. (1993), p. 151, for details. To see the implication of the first condition, from partitioned matrix multiplication we have
$$A'\Psi(1) = \begin{bmatrix} a_1' \\ a_2' \\ \vdots \\ a_h' \end{bmatrix}_{(h \times k)} \Psi(1)_{(k \times k)} = \begin{bmatrix} a_1'\Psi(1) \\ a_2'\Psi(1) \\ \vdots \\ a_h'\Psi(1) \end{bmatrix} = \begin{bmatrix} 0' \\ 0' \\ \vdots \\ 0' \end{bmatrix},$$
which implies
$$a_i'\Psi(1) = \begin{bmatrix} a_{1i} & a_{2i} & \ldots & a_{ki} \end{bmatrix} \begin{bmatrix} \psi(1)_1' \\ \psi(1)_2' \\ \vdots \\ \psi(1)_k' \end{bmatrix} = \sum_{s=1}^{k} a_{si}\, \psi(1)_s' = 0_{(1 \times k)} \quad \text{for } i = 1, 2, \ldots, h, \quad (5)$$
where $a_{si}$ is the $s$th element of the row vector $a_i'$ and $\psi(1)_s'$ is the $s$th row of the matrix $\Psi(1)$.

Equation (5) implies that certain linear combinations of the rows of $\Psi(1)$ are zero, meaning that the rows of $\Psi(1)$ are linearly dependent. That is, $\Psi(1)$ is a singular matrix, or equivalently, the determinant of $\Psi(1)$ is zero: $|\Psi(1)| = 0$. (Recall from Theorem 4 on page 7 of Chapter 22 that this condition violates an assumption used in the proof of the spurious-regression results there.) This in turn means that the matrix operator $\Psi(L)$ is non-invertible. (If the determinant of an $(n \times n)$ matrix $H$ is nonzero, its inverse is found by dividing the adjoint by the determinant: $H^{-1} = (1/|H|) \cdot [(-1)^{i+j}|H_{ji}|]$.) Thus, from the non-invertibility of $\Psi(L)$ in
$$\Delta y_t - \delta = \Psi(L)\varepsilon_t,$$
a cointegrated system can never be represented by a finite-order vector autoregression in the differenced data $\Delta y_t$.
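Equation (4) and condition (b) are easy to visualize numerically. In the sketch below (my own, assuming numpy), $X_t$ has drift $\mu$, so $\delta = (\beta\mu\ \ \mu)' \neq 0$ and both levels trend; but $a = (1\ \ {-\beta})'$ gives $a'\delta = \beta\mu - \beta\mu = 0$, and $a'y_t$ shows no trend:

```python
import numpy as np

rng = np.random.default_rng(4)
T, beta, mu = 5000, 2.0, 0.05
X = np.cumsum(mu + rng.standard_normal(T))   # random walk with drift mu
Y = beta * X + rng.standard_normal(T)        # cointegrated with X

y = np.column_stack([Y, X])
a = np.array([1.0, -beta])

print("sample delta:", np.diff(y, axis=0).mean(axis=0))   # ~ (beta*mu, mu)
print("a'delta:", a @ np.diff(y, axis=0).mean(axis=0))    # ~ 0

t = np.arange(T)
for name, series in [("Y", Y), ("X", X), ("a'y", y @ a)]:
    slope = np.polyfit(t, series, 1)[0]      # fitted linear-trend slope
    print(f"trend slope of {name}: {slope:.4f}")          # ~0 only for a'y
```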
2.2 Implications of Cointegration for the VAR Representation

Suppose that the level of $y_t$ can be represented as a non-stationary $p$th-order vector autoregression (the VAR is not the only model for $I(1)$ variables; see Saikkonen and Luukkonen (1997) on infinite-order VAR and VARMA models):
$$y_t = c + \Phi_1 y_{t-1} + \Phi_2 y_{t-2} + \ldots + \Phi_p y_{t-p} + \varepsilon_t, \quad (6)$$
or
$$\Phi(L)y_t = c + \varepsilon_t, \quad (7)$$
where
$$\Phi(L) \equiv [I_k - \Phi_1 L - \Phi_2 L^2 - \ldots - \Phi_p L^p].$$
Suppose that $\Delta y_t$ has the Wold representation
$$(1 - L)y_t = \delta + \Psi(L)\varepsilon_t. \quad (8)$$
Premultiplying (8) by $\Phi(L)$ results in
$$(1 - L)\Phi(L)y_t = \Phi(1)\delta + \Phi(L)\Psi(L)\varepsilon_t. \quad (9)$$
Substituting (7) into (9), we have
$$(1 - L)\varepsilon_t = \Phi(1)\delta + \Phi(L)\Psi(L)\varepsilon_t, \quad (10)$$
since $(1 - L)c = 0$. Now, equation (10) has to hold for all realizations of $\varepsilon_t$, which requires that
$$\Phi(1)\delta = 0 \quad \text{(a vector)} \quad (11)$$
and that $(1 - L)I_k$ and $\Phi(L)\Psi(L)$ represent identical polynomials in $L$. In particular, for $L = 1$, equation (10) implies that
$$\Phi(1)\Psi(1) = 0. \quad \text{(a matrix)} \quad (12)$$
Let $\phi_i'$ denote the $i$th row of $\Phi(1)$. Then (11) and (12) state that $\phi_i'\Psi(1) = 0'$ (a row of zeros) and $\phi_i'\delta = 0$ (a zero scalar). Recalling conditions (a) and (b) of Section 2.1, this means that $\phi_i$ is a cointegrating vector. If $a_1, a_2, \ldots, a_h$ form a basis for the space of cointegrating vectors, then it must be possible to express $\phi_i$ as a linear combination of $a_1, a_2, \ldots, a_h$; that is, there exists an $(h \times 1)$ vector $b_i$ such that
$$\phi_i = [a_1\ \ a_2\ \ \ldots\ \ a_h]\, b_i,$$
or $\phi_i' = b_i'A'$, for $A'$ the $(h \times k)$ matrix whose $i$th row is $a_i'$. Applying this reasoning to each of the rows of $\Phi(1)$:
$$\Phi(1) = \begin{bmatrix} \phi_1' \\ \phi_2' \\ \vdots \\ \phi_k' \end{bmatrix} = \begin{bmatrix} b_1'A' \\ b_2'A' \\ \vdots \\ b_k'A' \end{bmatrix} = BA', \quad (13)$$
where $B$ is a $(k \times h)$ matrix. However, the matrices $A$ and $B$ are not separately identified, since for any nonsingular $(h \times h)$ matrix $\Upsilon$ the factorization $\Phi(1) = B\Upsilon^{-1}\Upsilon A' = B^*A^{*\prime}$ implies the same distribution as $\Phi(1) = BA'$. What can be determined is the space spanned by $A$, the cointegrating space, which is why the concept of a basis is needed. Note that (13) implies that the $(k \times k)$ matrix $\Phi(1)$ is singular, because
$$\operatorname{rank}(\Phi(1)) = \operatorname{rank}(BA') \leq \min\{\operatorname{rank}(B), \operatorname{rank}(A')\} = h < k.$$
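The reduced-rank factorization can be written out for the bivariate example of Section 1, which is a VAR(1): $Y_t = \beta X_{t-1} + (\beta\eta_t + u_t)$ and $X_t = X_{t-1} + \eta_t$, so $\Phi(1) = I_2 - \Phi_1$ has rank $h = 1$ and factors as $BA'$ with $A' = (1\ \ {-\beta})$. A quick check (my own sketch, numpy assumed):

```python
import numpy as np

beta = 2.0
Phi1 = np.array([[0.0, beta],    # Y_t = beta*X_{t-1} + eps_{1t}
                 [0.0, 1.0]])    # X_t =      X_{t-1} + eps_{2t}
Phi_at_1 = np.eye(2) - Phi1      # Phi(1) = I - Phi_1

B = np.array([[1.0],
              [0.0]])            # (k x h) matrix of loadings
A_T = np.array([[1.0, -beta]])   # (h x k) cointegrating row A'

print(np.linalg.matrix_rank(Phi_at_1))   # 1 = h < k = 2
print(np.allclose(Phi_at_1, B @ A_T))    # True: Phi(1) = B A'
```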
2.3 Vector Error Correction Representation

A final representation of a cointegrated system is obtained by recalling from equation (1) of Chapter 22 that any VAR (not necessarily cointegrated at this stage) in the form of (6) can equivalently be written as
$$\Delta y_t = \xi_1 \Delta y_{t-1} + \xi_2 \Delta y_{t-2} + \ldots + \xi_{p-1} \Delta y_{t-p+1} + c + \xi_0 y_{t-1} + \varepsilon_t, \quad (14)$$
where
$$\xi_0 \equiv \rho - I = -(I - \Phi_1 - \Phi_2 - \ldots - \Phi_p) = -\Phi(1).$$
Note that if $y_t$ has $h$ cointegrating relations, then substitution of (13) into (14) results in
$$\Delta y_t = \xi_1 \Delta y_{t-1} + \xi_2 \Delta y_{t-2} + \ldots + \xi_{p-1} \Delta y_{t-p+1} + c - BA'y_{t-1} + \varepsilon_t. \quad (15)$$
Denote $z_t \equiv A'y_t$, noting that $z_t$ is a stationary $(h \times 1)$ vector. Then (15) can be written as
$$\Delta y_t = \xi_1 \Delta y_{t-1} + \xi_2 \Delta y_{t-2} + \ldots + \xi_{p-1} \Delta y_{t-p+1} + c - Bz_{t-1} + \varepsilon_t. \quad (16)$$
Expression (16) is known as the vector error-correction representation (VECM) of the cointegrated system. It is interesting that while a cointegrated system can never be represented by a finite-order vector autoregression in the differenced data $\Delta y_t$, it does have a vector error-correction representation; the difference is that the former ignores the error-correction term $-Bz_{t-1}$.
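The algebra behind (14) is mechanical: $\xi_0 = -\Phi(1)$ and $\xi_j = -(\Phi_{j+1} + \ldots + \Phi_p)$ for $j = 1, \ldots, p-1$. A short helper (my own sketch with hypothetical names, numpy assumed) makes the mapping concrete:

```python
import numpy as np

def var_to_vecm(Phis):
    """Map VAR(p) coefficients [Phi_1, ..., Phi_p] to the VECM form (14):
    returns xi0 = -Phi(1) and the list [xi_1, ..., xi_{p-1}]."""
    k = Phis[0].shape[0]
    xi0 = -(np.eye(k) - sum(Phis))                       # -Phi(1)
    xis = [-sum(Phis[j + 1:]) for j in range(len(Phis) - 1)]
    return xi0, xis

# Bivariate VAR(2) with arbitrary coefficients:
Phi1 = np.array([[0.5, 0.2], [0.0, 0.9]])
Phi2 = np.array([[0.1, 0.0], [0.0, 0.1]])
xi0, xis = var_to_vecm([Phi1, Phi2])
print("xi0 = -Phi(1):\n", xi0)   # equals Phi1 + Phi2 - I
print("xi1 = -Phi2:\n", xis[0])
```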
Example:
Let the individual elements of $y_t = (p_t\ \ s_t\ \ p_t^*)'$ all be $I(1)$, with a single cointegrating vector $a = (1\ \ {-1}\ \ {-1})'$ among them. Then these three variables have a VECM representation:
$$\begin{bmatrix} \Delta p_t \\ \Delta s_t \\ \Delta p_t^* \end{bmatrix} = \begin{bmatrix} \xi_{11}^{(1)} & \xi_{12}^{(1)} & \xi_{13}^{(1)} \\ \xi_{21}^{(1)} & \xi_{22}^{(1)} & \xi_{23}^{(1)} \\ \xi_{31}^{(1)} & \xi_{32}^{(1)} & \xi_{33}^{(1)} \end{bmatrix} \begin{bmatrix} \Delta p_{t-1} \\ \Delta s_{t-1} \\ \Delta p_{t-1}^* \end{bmatrix} + \begin{bmatrix} \xi_{11}^{(2)} & \xi_{12}^{(2)} & \xi_{13}^{(2)} \\ \xi_{21}^{(2)} & \xi_{22}^{(2)} & \xi_{23}^{(2)} \\ \xi_{31}^{(2)} & \xi_{32}^{(2)} & \xi_{33}^{(2)} \end{bmatrix} \begin{bmatrix} \Delta p_{t-2} \\ \Delta s_{t-2} \\ \Delta p_{t-2}^* \end{bmatrix} + \ldots + \begin{bmatrix} \xi_{11}^{(p-1)} & \xi_{12}^{(p-1)} & \xi_{13}^{(p-1)} \\ \xi_{21}^{(p-1)} & \xi_{22}^{(p-1)} & \xi_{23}^{(p-1)} \\ \xi_{31}^{(p-1)} & \xi_{32}^{(p-1)} & \xi_{33}^{(p-1)} \end{bmatrix} \begin{bmatrix} \Delta p_{t-p+1} \\ \Delta s_{t-p+1} \\ \Delta p_{t-p+1}^* \end{bmatrix} + \begin{bmatrix} c_p \\ c_s \\ c_{p^*} \end{bmatrix} - \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \begin{bmatrix} 1 & -1 & -1 \end{bmatrix} \begin{bmatrix} p_{t-1} \\ s_{t-1} \\ p_{t-1}^* \end{bmatrix} + \begin{bmatrix} \varepsilon_t^{(p)} \\ \varepsilon_t^{(s)} \\ \varepsilon_t^{(p^*)} \end{bmatrix},$$
from which we see that the change in each variable depends not only on lags of its own and the other variables' changes but also on the level of the equilibrium error $z_{t-1}$, with speeds of adjustment $B$. For the first equation,
$$\Delta p_t = \xi_{11}^{(1)}\Delta p_{t-1} + \xi_{12}^{(1)}\Delta s_{t-1} + \xi_{13}^{(1)}\Delta p_{t-1}^* + \xi_{11}^{(2)}\Delta p_{t-2} + \xi_{12}^{(2)}\Delta s_{t-2} + \xi_{13}^{(2)}\Delta p_{t-2}^* + \ldots + \xi_{11}^{(p-1)}\Delta p_{t-p+1} + \xi_{12}^{(p-1)}\Delta s_{t-p+1} + \xi_{13}^{(p-1)}\Delta p_{t-p+1}^* + c_p - b_1(p_{t-1} - s_{t-1} - p_{t-1}^*) + \varepsilon_t^{(p)},$$
where $p_{t-1} - s_{t-1} - p_{t-1}^* = z_{t-1}$. From the economics of the equilibrium, when a positive equilibrium error occurs in the previous period, i.e., $z_{t-1} = p_{t-1} - s_{t-1} - p_{t-1}^* > 0$, the change in $p_t$ at time $t$, i.e., $\Delta p_t = p_t - p_{t-1}$, should be negatively related to this equilibrium error. Therefore, the equilibrium-error adjustment parameters, such as $b_1$, should be positive in (16).
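To close the example, the VECM can be estimated directly from data. The sketch below is my own (it assumes statsmodels' VECM class; in its output, `beta` is the cointegrating vector and `alpha` is the loading matrix, which corresponds to $-B$ in the notation of (16)):

```python
import numpy as np
from statsmodels.tsa.vector_ar.vecm import VECM

rng = np.random.default_rng(5)
T = 2000
s = np.cumsum(rng.standard_normal(T))        # log exchange rate
p_star = np.cumsum(rng.standard_normal(T))   # foreign log price index
p = s + p_star + rng.standard_normal(T)      # PPP holds up to an I(0) error
y = np.column_stack([p, s, p_star])

# Rank-one VECM with one lagged difference and a constant, as in (16).
res = VECM(y, k_ar_diff=1, coint_rank=1, deterministic="co").fit()
print("beta (cointegrating vector):\n", res.beta)    # ~ (1, -1, -1)'
print("alpha (loadings, = -B here):\n", res.alpha)   # first entry negative,
# so b_1 = -alpha_1 > 0, matching the sign prediction in the text.
```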