Applied Statistics Lecture Notes
Dayu Wu

1 Revision

1. $E(\sum x_i) = \sum E x_i$
2. $\mathrm{Var}(x+y) = \mathrm{Var}(x) + \mathrm{Var}(y) + 2\,\mathrm{Cov}(x,y)$
3. $\mathrm{Cov}(a_1x+b_1y,\; a_2x+b_2y) = a_2\,\mathrm{Cov}(a_1x+b_1y,\,x) + b_2\,\mathrm{Cov}(a_1x+b_1y,\,y)$
4. Hypothesis test: $H_0$, $H_1$, p-value
5. If $x \sim N(\mu,\sigma^2)$, then $\frac{x-\mu}{\sigma} \sim N(0,1)$ and the $(1-\alpha)$ CI is $\left(\mu - \sigma Z_{\alpha/2},\; \mu + \sigma Z_{\alpha/2}\right)$

2 Linear Regression

Basic Concepts

1. $y_i = f(x_i) + \epsilon_i$
2. Gauss-Markov conditions:
   • $E(\epsilon_i) = 0$
   • $\mathrm{Var}(\epsilon_i) = \sigma^2$
   • $\mathrm{Cov}(\epsilon_i, \epsilon_j) = 0$ for $i \neq j$
3. Regression model: $E(y|x) = \beta_0 + \beta_1 x$, with $y = (y_1, \dots, y_n)^T$, $x = (x_1, \dots, x_n)^T$
4. $\epsilon_i = y_i - \beta_0 - \beta_1 x_i$
5. $e_i = y_i - \hat\beta_0 - \hat\beta_1 x_i$
6. $\sum (x_i - \bar x)\,\bar y = 0$, since $\sum (x_i - \bar x) = 0$

Estimation: OLSE

1. Loss function: $Q = \sum e_i^2 = \sum (y_i - \beta_0 - \beta_1 x_i)^2$
2. First-order conditions: $\frac{\partial Q}{\partial \beta_0} = 0$ and $\frac{\partial Q}{\partial \beta_1} = 0$, i.e. $-2\sum (y_i - \hat\beta_0 - \hat\beta_1 x_i) = 0$ and $-2\sum x_i (y_i - \hat\beta_0 - \hat\beta_1 x_i) = 0$; equivalently, $\sum e_i = 0$ and $\sum x_i e_i = 0$
3. The fitted line passes through the center $(\bar x, \bar y)$: $\bar y = \hat\beta_0 + \hat\beta_1 \bar x$
4. $\hat\beta_0 = \bar y - \hat\beta_1 \bar x$
5. $\hat\beta_1 = \frac{L_{xy}}{L_{xx}} = \frac{\sum (x_i-\bar x)(y_i-\bar y)}{\sum (x_i-\bar x)^2} = \frac{\sum (x_i-\bar x)\, y_i}{\sum (x_i-\bar x)^2} = \frac{\sum x_i y_i - n\bar x\bar y}{\sum x_i^2 - n\bar x^2}$
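As a quick check of these closed-form estimators, here is a minimal numerical sketch in Python (numpy is assumed available; the data x, y are simulated for illustration and are not from the notes):

```python
import numpy as np

# Simulated illustrative data: y = 2 + 0.5 x + noise (hypothetical values).
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, size=x.size)

x_bar, y_bar = x.mean(), y.mean()
Lxx = np.sum((x - x_bar) ** 2)            # L_xx = sum (x_i - x_bar)^2
Lxy = np.sum((x - x_bar) * (y - y_bar))   # L_xy = sum (x_i - x_bar)(y_i - y_bar)

beta1_hat = Lxy / Lxx                     # slope estimate: L_xy / L_xx
beta0_hat = y_bar - beta1_hat * x_bar     # intercept estimate: y_bar - beta1_hat * x_bar

e = y - beta0_hat - beta1_hat * x         # residuals e_i
# The first-order conditions should hold up to floating-point error:
print(beta0_hat, beta1_hat, e.sum(), (x * e).sum())
```

The two sums printed at the end should both be numerically zero, matching the first-order conditions $\sum e_i = 0$ and $\sum x_i e_i = 0$.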
Estimation: MLE

1. Assumption: $\epsilon_i \sim N(0, \sigma^2)$, iid
2. Probability density function: $f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2\sigma^2}(x-\mu)^2}$
3. Likelihood function: $L(\beta_0, \beta_1, \sigma^2) = \prod f_{y_i} = (2\pi\sigma^2)^{-\frac n2} \exp\left\{-\frac{1}{2\sigma^2}\sum (y_i - \beta_0 - \beta_1 x_i)^2\right\}$
4. Log-likelihood function: $\log L(\beta_0, \beta_1, \sigma^2) = -\frac n2 \log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum (y_i - \beta_0 - \beta_1 x_i)^2$
5. First-order condition: $\frac{\partial \log L}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum (y_i - \beta_0 - \beta_1 x_i)^2 = 0$
6. $\hat\sigma^2 = \frac1n \sum (y_i - \hat\beta_0 - \hat\beta_1 x_i)^2 = \frac1n \sum e_i^2$

Estimation: $\hat\beta_1$

1. $\hat\beta_1$ is a linear combination of the $y_i$: $\hat\beta_1 = \frac{\sum (x_i-\bar x)\, y_i}{\sum (x_i-\bar x)^2} = \sum_i \frac{x_i-\bar x}{\sum_j (x_j-\bar x)^2}\, y_i$
2. Unbiased: $E\hat\beta_1 = \sum_i \frac{x_i-\bar x}{\sum_j (x_j-\bar x)^2}\, E y_i = \sum_i \frac{x_i-\bar x}{\sum_j (x_j-\bar x)^2}(\beta_0 + \beta_1 x_i) = \beta_1$
3. $\mathrm{Var}(\hat\beta_1) = \sum_i \left(\frac{x_i-\bar x}{\sum_j (x_j-\bar x)^2}\right)^2 \mathrm{Var}(y_i) = \frac{\sigma^2}{\sum (x_i-\bar x)^2} = \frac{\sigma^2}{L_{xx}}$
4. $\hat\beta_1 \sim N\left(\beta_1, \frac{\sigma^2}{L_{xx}}\right)$

Estimation: $\hat\beta_0$

1. $\hat\beta_0$ is a linear combination of the $y_i$: $\hat\beta_0 = \bar y - \hat\beta_1 \bar x = \frac1n \sum y_i - \bar x \sum_i \frac{x_i-\bar x}{\sum_j (x_j-\bar x)^2}\, y_i$
2. Unbiased: $E\hat\beta_0 = E(\bar y - \hat\beta_1 \bar x) = \beta_0$
3. $\mathrm{Cov}(\bar y, \hat\beta_1) = \mathrm{Cov}\left(\frac1n \sum y_i,\; \sum_i \frac{x_i-\bar x}{L_{xx}}\, y_i\right) = \frac1n \sum_i \frac{x_i-\bar x}{L_{xx}}\, \mathrm{Var}(y_i) = 0$
4. using $\mathrm{Cov}(y_i, y_j) = \mathrm{Cov}(\epsilon_i, \epsilon_j) = 0$ for $i \neq j$
5. $\mathrm{Var}(\hat\beta_0) = \mathrm{Var}(\bar y - \hat\beta_1 \bar x) = \mathrm{Var}(\bar y) + \bar x^2\, \mathrm{Var}(\hat\beta_1) - 2\bar x\, \mathrm{Cov}(\bar y, \hat\beta_1) = \left(\frac1n + \frac{\bar x^2}{L_{xx}}\right)\sigma^2$
6. $\hat\beta_0 \sim N\left(\beta_0, \left(\frac1n + \frac{\bar x^2}{L_{xx}}\right)\sigma^2\right)$
7. $\mathrm{Cov}(\hat\beta_0, \hat\beta_1) = \mathrm{Cov}(\bar y - \bar x\hat\beta_1, \hat\beta_1) = \mathrm{Cov}(\bar y, \hat\beta_1) - \bar x\, \mathrm{Var}(\hat\beta_1) = -\frac{\bar x}{L_{xx}}\sigma^2$
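The sampling moments derived above can be verified by Monte Carlo simulation. A sketch, again with numpy and purely hypothetical parameter values:

```python
import numpy as np

# Monte Carlo check of E, Var, and Cov of the OLS estimators.
rng = np.random.default_rng(1)
beta0, beta1, sigma = 2.0, 0.5, 1.0
x = np.linspace(0, 10, 30)
x_bar = x.mean()
Lxx = np.sum((x - x_bar) ** 2)

b0s, b1s = [], []
for _ in range(20000):
    y = beta0 + beta1 * x + rng.normal(0.0, sigma, size=x.size)
    b1 = np.sum((x - x_bar) * y) / Lxx    # beta1_hat as a linear combination of y_i
    b0 = y.mean() - b1 * x_bar
    b1s.append(b1)
    b0s.append(b0)

b0s, b1s = np.array(b0s), np.array(b1s)
n = x.size
print(b1s.mean(), b1s.var(), sigma**2 / Lxx)                    # ~ beta1, sigma^2 / L_xx
print(b0s.mean(), b0s.var(), (1/n + x_bar**2 / Lxx) * sigma**2) # ~ beta0, (1/n + xbar^2/L_xx) sigma^2
print(np.cov(b0s, b1s)[0, 1], -x_bar / Lxx * sigma**2)          # ~ Cov(beta0_hat, beta1_hat)
```

Each printed line pairs a simulated moment with its theoretical value; they should agree to Monte Carlo accuracy.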
Estimation: $\hat y$

1. $\hat y = \hat\beta_0 + \hat\beta_1 x$
2. Unbiased: $E\hat y = \beta_0 + \beta_1 x = Ey$
3. $\mathrm{Var}(\hat y) = \mathrm{Var}(\hat\beta_0) + x^2\,\mathrm{Var}(\hat\beta_1) + 2x\,\mathrm{Cov}(\hat\beta_0, \hat\beta_1) = \left(\frac1n + \frac{(x-\bar x)^2}{L_{xx}}\right)\sigma^2$
4. $\hat y \sim N\left(\beta_0 + \beta_1 x, \left(\frac1n + \frac{(x-\bar x)^2}{L_{xx}}\right)\sigma^2\right)$

t test

1. $H_0: \beta_1 = 0$; under $H_0$, $\hat\beta_1 \sim N\left(0, \frac{\sigma^2}{L_{xx}}\right)$
2. $\hat\sigma^2 = \frac{1}{n-2}\sum e_i^2 = \frac{1}{n-2}\sum (y_i - \hat y_i)^2$
3. $t = \frac{\hat\beta_1}{\sqrt{\hat\sigma^2 / L_{xx}}} = \frac{\hat\beta_1 \sqrt{L_{xx}}}{\hat\sigma} \sim t_{n-2}$ under $H_0$

F test

1. $SST = \sum (y_i - \bar y)^2 = L_{yy}$
2. $SSR = \sum (\hat y_i - \bar y)^2 = \frac{L_{xy}^2}{L_{xx}}$; under $H_0$, $\frac{SSR}{\sigma^2} \sim \chi^2_1$
3. $SSE = \sum (y_i - \hat y_i)^2 = (n-2)\hat\sigma^2$, with $\frac{SSE}{\sigma^2} \sim \chi^2_{n-2}$
4. $SST = SSR + SSE$
5. $F = \frac{SSR/1}{SSE/(n-2)} \sim F_{1,n-2}$ under $H_0$

Correlation

1. $r = \frac{L_{xy}}{\sqrt{L_{xx}L_{yy}}} = \hat\beta_1 \sqrt{\frac{L_{xx}}{L_{yy}}}$
2. $r^2 = \frac{SSR}{SST} = \frac{L_{xy}^2}{L_{xx}L_{yy}}$
3. $R^2 = 1 - \frac{SSE}{SST}$
4. $t = \frac{\sqrt{n-2}\, r}{\sqrt{1-r^2}} \sim t_{n-2}$

Residual

1. $Ee_i = 0$
2. $\mathrm{Var}(e_i) = \mathrm{Var}(y_i - \hat y_i) = \left(1 - \frac1n - \frac{(x_i-\bar x)^2}{L_{xx}}\right)\sigma^2$
3. Leverage: $h_i = \frac1n + \frac{(x_i-\bar x)^2}{L_{xx}}$
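A sketch tying the t, F, and $R^2$ formulas together (scipy.stats is assumed available for the reference distributions; the data are simulated, not from the notes):

```python
import numpy as np
from scipy import stats

# Hypothetical data; compute the t and F statistics from the formulas above.
rng = np.random.default_rng(2)
x = np.linspace(0, 10, 40)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, size=x.size)
n = x.size

x_bar, y_bar = x.mean(), y.mean()
Lxx = np.sum((x - x_bar) ** 2)
b1 = np.sum((x - x_bar) * (y - y_bar)) / Lxx
b0 = y_bar - b1 * x_bar
y_hat = b0 + b1 * x

SST = np.sum((y - y_bar) ** 2)           # = L_yy
SSR = np.sum((y_hat - y_bar) ** 2)
SSE = np.sum((y - y_hat) ** 2)
sigma2_hat = SSE / (n - 2)               # unbiased estimate of sigma^2

t = b1 / np.sqrt(sigma2_hat / Lxx)       # t = beta1_hat * sqrt(L_xx) / sigma_hat
F = (SSR / 1) / (SSE / (n - 2))
R2 = 1 - SSE / SST

p_t = 2 * stats.t.sf(abs(t), df=n - 2)   # two-sided p-value
p_F = stats.f.sf(F, 1, n - 2)
print(t, F, t**2, R2, p_t, p_F)
```

In simple regression $F = t^2$, which the printed output confirms, so the two tests of $H_0: \beta_1 = 0$ are equivalent.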
Confidence Interval

1. $\hat\beta_1 \sim N\left(\beta_1, \frac{\sigma^2}{L_{xx}}\right)$
2. $\hat\beta_0 \sim N\left(\beta_0, \left(\frac1n + \frac{\bar x^2}{L_{xx}}\right)\sigma^2\right)$
3. $\hat y \sim N\left(\beta_0 + \beta_1 x, \left(\frac1n + \frac{(x-\bar x)^2}{L_{xx}}\right)\sigma^2\right)$
4. For example: $t = \frac{\hat\beta_1 - \beta_1}{\sqrt{\hat\sigma^2 / L_{xx}}} \sim t_{n-2}$. Thus $P\left(\left|\frac{\hat\beta_1 - \beta_1}{\sqrt{\hat\sigma^2 / L_{xx}}}\right| \le t_{\alpha/2}(n-2)\right) = 1-\alpha$, so the $(1-\alpha)$ CI is $\left(\hat\beta_1 - \frac{\hat\sigma}{\sqrt{L_{xx}}}\, t_{\alpha/2},\; \hat\beta_1 + \frac{\hat\sigma}{\sqrt{L_{xx}}}\, t_{\alpha/2}\right)$

3 Multiple Regression

Basic Concepts

1. $\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} 1 & x_{11} & \dots & x_{1p} \\ 1 & x_{21} & \dots & x_{2p} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \dots & x_{np} \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_p \end{pmatrix} + \begin{pmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{pmatrix}$
2. $Y = X\beta + \epsilon$
3. $\epsilon_i \sim N(0, \sigma^2)$, iid
4. $\epsilon \sim N(0, \sigma^2 I_n)$

Estimation: OLS

1. Note that $X^T e = 0$, i.e. $X^T(Y - X\hat\beta) = 0$. We have $X^T Y = X^T X \hat\beta$. When $\det(X^T X) \neq 0$, $\hat\beta = (X^T X)^{-1} X^T Y$.
2. $\hat Y = X\hat\beta = X(X^T X)^{-1} X^T Y = HY$
3. Hat matrix: $H^2 = H$, $\mathrm{tr}(H) = \sum h_{ii} = p+1$
4. $e = Y - \hat Y = (I_n - H)Y$
5. $\mathrm{Var}(e) = \sigma^2(I_n - H)$, so $\mathrm{Var}(e_i) = \sigma^2(1 - h_{ii})$
6. $\hat\sigma^2 = \frac{1}{n-p-1}\sum e_i^2$
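A matrix-form sketch of the OLS results above (numpy assumed; the design matrix and true coefficients are hypothetical):

```python
import numpy as np

# Matrix-form OLS for Y = X beta + eps, with p = 2 predictors plus an intercept.
rng = np.random.default_rng(3)
n, p = 50, 2
Z = rng.normal(size=(n, p))
X = np.column_stack([np.ones(n), Z])    # design matrix with intercept column
beta_true = np.array([1.0, 0.5, -0.3])  # hypothetical coefficients
Y = X @ beta_true + rng.normal(0.0, 1.0, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ Y            # beta_hat = (X'X)^{-1} X'Y
H = X @ XtX_inv @ X.T                   # hat matrix
Y_hat = H @ Y
e = Y - Y_hat                           # residuals, e = (I_n - H) Y

print(np.allclose(H @ H, H))            # idempotent: H^2 = H
print(np.trace(H), p + 1)               # tr(H) = p + 1
sigma2_hat = (e @ e) / (n - p - 1)      # sigma_hat^2 = SSE / (n - p - 1)
print(beta_hat, sigma2_hat)
```

For numerical stability one would normally use np.linalg.lstsq rather than forming $(X^TX)^{-1}$ explicitly; the explicit inverse is kept here only to mirror the formulas in the notes.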
Estimation: MLE

1. $Y \sim N(X\beta, \sigma^2 I_n)$
2. $L = (2\pi)^{-\frac n2}(\sigma^2)^{-\frac n2}\exp\left\{-\frac{1}{2\sigma^2}(Y - X\beta)^T(Y - X\beta)\right\}$

Propositions

1. $\hat\beta = (X^T X)^{-1} X^T Y$
2. $E(\hat\beta) = \beta$
3. $\mathrm{Var}(\hat\beta) = \sigma^2 (X^T X)^{-1}$
4. Gauss-Markov: $E(Y) = X\beta$, $\mathrm{Var}(Y) = \sigma^2 I_n$
5. $\mathrm{Cov}(\hat\beta, e) = 0$
6. $Y \sim N(X\beta, \sigma^2 I_n)$
7. $\hat\beta \sim N(\beta, \sigma^2 (X^T X)^{-1})$
8. $\frac{SSE}{\sigma^2} \sim \chi^2_{n-p-1}$

Test

1. $H_0: \beta_1 = \dots = \beta_p = 0$, $F = \frac{SSR/p}{SSE/(n-p-1)} \sim F(p, n-p-1)$
2. $H_0: \beta_j = 0$, $t_j = \frac{\hat\beta_j}{\sqrt{c_{jj}}\,\hat\sigma} \sim t_{n-p-1}$, where $c_{jj}$ is the $j$-th diagonal element of $(X^T X)^{-1}$
3. $F_j = t_j^2$
4. $(1-\alpha)$ CI of $\beta_j$: $\left(\hat\beta_j - t_{\alpha/2}\sqrt{c_{jj}}\,\hat\sigma,\; \hat\beta_j + t_{\alpha/2}\sqrt{c_{jj}}\,\hat\sigma\right)$

Standardization

1. $x_{ij}^* = \frac{x_{ij} - \bar x_j}{\sqrt{L_{jj}}}$
2. $y_i^* = \frac{y_i - \bar y}{\sqrt{L_{yy}}}$
3. $\hat\beta_j^* = \frac{\sqrt{L_{jj}}}{\sqrt{L_{yy}}}\,\hat\beta_j$

Correlation

1. Correlation matrix of the predictors: $r = \begin{pmatrix} 1 & r_{12} & \dots & r_{1p} \\ r_{21} & 1 & \dots & r_{2p} \\ \vdots & \vdots & & \vdots \\ r_{p1} & r_{p2} & \dots & 1 \end{pmatrix}$
2. $r^2_{y1;2} = \frac{SSE(x_2) - SSE(x_1, x_2)}{SSE(x_2)}$
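A sketch of the per-coefficient t tests and confidence intervals, using $c_{jj}$ from the diagonal of $(X^TX)^{-1}$ (hypothetical simulated data; scipy.stats assumed available):

```python
import numpy as np
from scipy import stats

# Per-coefficient t statistics and 95% CIs, continuing the matrix-OLS setup.
rng = np.random.default_rng(4)
n, p = 60, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
Y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)  # beta_2 = 0 on purpose

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ Y
e = Y - X @ beta_hat
df = n - p - 1
sigma_hat = np.sqrt(e @ e / df)

c = np.diag(XtX_inv)                                 # c_jj
t_stats = beta_hat / (np.sqrt(c) * sigma_hat)        # t_j = beta_j_hat / (sqrt(c_jj) sigma_hat)
p_vals = 2 * stats.t.sf(np.abs(t_stats), df=df)
half = stats.t.ppf(0.975, df=df) * np.sqrt(c) * sigma_hat
print(t_stats, p_vals)
print(np.column_stack([beta_hat - half, beta_hat + half]))  # 95% CIs for each beta_j
```

The CI for the coefficient that is truly zero should usually contain 0, while the others should not.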
3. $r_{12;3\dots p} = \frac{-\Delta_{12}}{\sqrt{\Delta_{11}\Delta_{22}}}$
4. $r_{12;3} = \frac{r_{12} - r_{13} r_{23}}{\sqrt{(1-r_{13}^2)(1-r_{23}^2)}}$

4 Violation of Regression Assumptions

Heteroscedasticity

1. Spearman rank correlation: $r_s = 1 - \frac{6}{n(n^2-1)}\sum d_i^2$
2. $t = \frac{\sqrt{n-2}\, r_s}{\sqrt{1-r_s^2}}$

Weighted Least Squares

1. Loss function: $Q = \sum w_i e_i^2 = \sum w_i (y_i - \beta_0 - \beta_1 x_i)^2$
2. $\hat\beta_w = (X^T W X)^{-1} X^T W Y$

Box-Cox

1. $\lambda \neq 0$: $Y^{(\lambda)} = \frac{Y^\lambda - 1}{\lambda}$
2. $\lambda = 0$: $Y^{(\lambda)} = \log Y$

Autocorrelation

1. $\rho = \sum_{t=2}^n \epsilon_t \epsilon_{t-1} \Big/ \sqrt{\sum_{t=2}^n \epsilon_t^2}\sqrt{\sum_{t=2}^n \epsilon_{t-1}^2}$
2. $DW = \sum_{t=2}^n (e_t - e_{t-1})^2 \Big/ \sum_{t=2}^n e_t^2 \approx 2(1-\hat\rho) \in [0, 4]$

Outlier, High Leverage Point, and Influential Point

1. Outlier: large $|e_i|$, i.e. extreme $y_i$
2. High leverage point: extreme $x_i$
3. Influential point: deleting it yields a noticeably different regression equation
4. Extreme $X$: Cook's distance $D_i = \frac{e_i^2}{(p+1)\hat\sigma^2} \cdot \frac{h_{ii}}{(1-h_{ii})^2}$
5. Extreme $Y$: deleted residual $e_{(i)} = \frac{e_i}{1-h_{ii}}$

5 Variable Selection

1. Full model and selected model
2. Criteria: $R_a^2$, AIC, $C_p$
3. Forward, backward, and stepwise selection
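Finally, a sketch computing the Durbin-Watson statistic and the influence diagnostics above from the residuals and the hat-matrix diagonal (hypothetical simple-regression data; numpy assumed):

```python
import numpy as np

# Durbin-Watson statistic, Cook's distance, and deleted residuals.
rng = np.random.default_rng(5)
n = 40
x = np.linspace(0, 10, n)
X = np.column_stack([np.ones(n), x])
Y = 2.0 + 0.5 * x + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
H = X @ XtX_inv @ X.T                    # hat matrix
e = (np.eye(n) - H) @ Y                  # residuals
h = np.diag(H)                           # leverages h_ii
p = 1                                    # one predictor in this sketch
sigma2_hat = (e @ e) / (n - p - 1)

# DW ~ 2(1 - rho_hat), in [0, 4]; the denominator here sums over all t,
# while the notes start the sum at t = 2 -- the difference is negligible.
DW = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

D = e**2 / ((p + 1) * sigma2_hat) * h / (1 - h) ** 2  # Cook's distance D_i
e_del = e / (1 - h)                                   # deleted residuals e_(i)
print(DW, D.max(), np.abs(e_del).max())
```

Values of DW near 2 indicate no first-order autocorrelation; large $D_i$ flags influential points in $X$, and large deleted residuals flag influential points in $Y$.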