Chapter 7 Multiple Regression: Estimation and Hypothesis Testing
Multiple regression model: a regression model with more than one explanatory variable. It is "multiple" because multiple influences (i.e., variables) affect the dependent variable.
7.1 The Three-variable Linear Regression Model
Three-variable PRF:
Nonstochastic form: $E(Y_t) = B_1 + B_2 X_{2t} + B_3 X_{3t}$ (7.1)
Stochastic form: $Y_t = B_1 + B_2 X_{2t} + B_3 X_{3t} + u_t = E(Y_t) + u_t$ (7.2)
$B_2$, $B_3$: partial regression coefficients (partial slope coefficients).
$B_2$: the change in the mean value of Y, $E(Y)$, per unit change in $X_2$, holding the value of $X_3$ constant.
$B_3$: the change in the mean value of Y per unit change in $X_3$, holding the value of $X_2$ constant.
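The "holding constant" reading of $B_2$ and $B_3$ can be made concrete with a minimal sketch. The coefficient values (1, 2, 3), the X values, and the function name `mean_y` are made up for illustration and do not come from the text.

```python
# Hypothetical PRF coefficients, chosen only for illustration.
B1, B2, B3 = 1.0, 2.0, 3.0

def mean_y(x2, x3):
    """Nonstochastic PRF (7.1): E(Y) = B1 + B2*X2 + B3*X3."""
    return B1 + B2 * x2 + B3 * x3

# Raise X2 by one unit while X3 stays fixed at 4: E(Y) changes by B2.
change_x2 = mean_y(6.0, 4.0) - mean_y(5.0, 4.0)

# Raise X3 by one unit while X2 stays fixed at 5: E(Y) changes by B3.
change_x3 = mean_y(5.0, 5.0) - mean_y(5.0, 4.0)

print(change_x2, change_x3)
```

Whatever fixed level is chosen for the other variable, the two differences equal $B_2$ and $B_3$, which is what "partial" means here.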
7.2 Assumptions of Multiple Linear Regression Model
A7.1. $X_2$ and $X_3$ are uncorrelated with the disturbance term u.
A7.2. The error term u has a zero mean value: $E(u_i) = 0$ (7.7)
A7.3. Homoscedasticity, that is, the variance of u is constant: $\mathrm{var}(u_i) = \sigma^2$ (7.8)
A7.4. No autocorrelation exists between the error terms $u_i$ and $u_j$: $\mathrm{cov}(u_i, u_j) = 0$, $i \neq j$ (7.9)
A7.5. No exact collinearity exists between $X_2$ and $X_3$; that is, there is no exact linear relationship between the two explanatory variables. This is the assumption of no collinearity, or no multicollinearity. Two cases should be distinguished: (1) an exact linear relationship, which the assumption rules out; (2) high or near perfect collinearity.
A7.6. For hypothesis testing, the error term u follows the normal distribution with mean zero and (homoscedastic) variance $\sigma^2$. That is, $u_i \sim N(0, \sigma^2)$ (7.10)
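Why A7.5 rules out exact collinearity can be seen numerically. The quantity $(\sum x_{2t}^2)(\sum x_{3t}^2) - (\sum x_{2t}x_{3t})^2$, which is the denominator of the OLS slope estimators (7.21) and (7.22), becomes zero when $X_3$ is an exact linear function of $X_2$. The sketch below uses made-up data, and the helper names `deviations` and `slope_denominator` are hypothetical.

```python
def deviations(values):
    """Deviations from the sample mean (the lowercase x's of the text)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def slope_denominator(X2, X3):
    """(sum x2^2)(sum x3^2) - (sum x2*x3)^2 in deviation form."""
    x2, x3 = deviations(X2), deviations(X3)
    s_22 = sum(a * a for a in x2)
    s_33 = sum(b * b for b in x3)
    s_23 = sum(a * b for a, b in zip(x2, x3))
    return s_22 * s_33 - s_23 ** 2

# Made-up data for illustration only.
X2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X3_exact = [2.0 * v for v in X2]            # exact collinearity: X3 = 2*X2
X3_other = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]   # correlated with X2, but not exactly

d_exact = slope_denominator(X2, X3_exact)   # zero: the slopes cannot be computed
d_other = slope_denominator(X2, X3_other)   # positive: the slopes are defined
print(d_exact, d_other)
```

With exact collinearity the slope formulas would divide by zero, so the separate influences of $X_2$ and $X_3$ cannot be estimated. In the second case the denominator is positive and estimation goes through.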
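7.3 Estimation of Parameters of Multiple Regression
7.3.1 Ordinary Least Squares (OLS) Estimators
SRF:
Stochastic form: $Y_t = b_1 + b_2 X_{2t} + b_3 X_{3t} + e_t$ (7.13)
Nonstochastic form: $\hat{Y}_t = b_1 + b_2 X_{2t} + b_3 X_{3t}$ (7.14)
Residual: $e_t = Y_t - \hat{Y}_t = Y_t - b_1 - b_2 X_{2t} - b_3 X_{3t}$ (7.15)
RSS: $\sum e_t^2 = \sum (Y_t - b_1 - b_2 X_{2t} - b_3 X_{3t})^2$ (7.16)
OLS estimators (lowercase letters denote deviations from sample means, e.g. $y_t = Y_t - \bar{Y}$):
$b_1 = \bar{Y} - b_2 \bar{X}_2 - b_3 \bar{X}_3$ (7.20)
$b_2 = \frac{(\sum y_t x_{2t})(\sum x_{3t}^2) - (\sum y_t x_{3t})(\sum x_{2t} x_{3t})}{(\sum x_{2t}^2)(\sum x_{3t}^2) - (\sum x_{2t} x_{3t})^2}$ (7.21)
$b_3 = \frac{(\sum y_t x_{3t})(\sum x_{2t}^2) - (\sum y_t x_{2t})(\sum x_{2t} x_{3t})}{(\sum x_{2t}^2)(\sum x_{3t}^2) - (\sum x_{2t} x_{3t})^2}$ (7.22)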
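The estimators (7.20) to (7.22) can be computed directly from sums of deviations. The following is a minimal sketch, not a production routine. The function name `ols_three_variable` is hypothetical, and the six observations are made up: they were constructed as Y = 1 + 2·X2 + 3·X3 + e, with e chosen to sum to zero and to be orthogonal to X2 and X3, so OLS should recover the coefficients 1, 2 and 3.

```python
def ols_three_variable(Y, X2, X3):
    """Return (b1, b2, b3) from the deviation-form formulas (7.20)-(7.22)."""
    n = len(Y)
    y_bar, x2_bar, x3_bar = sum(Y) / n, sum(X2) / n, sum(X3) / n
    y = [v - y_bar for v in Y]
    x2 = [v - x2_bar for v in X2]
    x3 = [v - x3_bar for v in X3]

    s_yx2 = sum(a * b for a, b in zip(y, x2))
    s_yx3 = sum(a * b for a, b in zip(y, x3))
    s_22 = sum(a * a for a in x2)
    s_33 = sum(a * a for a in x3)
    s_23 = sum(a * b for a, b in zip(x2, x3))

    denom = s_22 * s_33 - s_23 ** 2   # zero under exact collinearity (A7.5)
    b2 = (s_yx2 * s_33 - s_yx3 * s_23) / denom   # (7.21)
    b3 = (s_yx3 * s_22 - s_yx2 * s_23) / denom   # (7.22)
    b1 = y_bar - b2 * x2_bar - b3 * x3_bar       # (7.20)
    return b1, b2, b3

# Made-up data for illustration only (see the lead-in for its construction).
X2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X3 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
Y = [10.0, 9.0, 17.0, 16.0, 30.0, 29.0]

b1, b2, b3 = ols_three_variable(Y, X2, X3)

# Residuals (7.15) and the residual sum of squares (7.16).
residuals = [yt - b1 - b2 * a - b3 * b for yt, a, b in zip(Y, X2, X3)]
rss = sum(e * e for e in residuals)
print(b1, b2, b3, rss)
```

The residuals sum to zero, as they must for any OLS fit that includes an intercept.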
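7.3.2 Variance and Standard Errors of OLS Estimators
We need the standard errors for two main purposes: (1) to establish confidence intervals for the true parameter values; (2) to test statistical hypotheses.
1. $u_i \sim N(0, \sigma^2)$ implies $b_1 \sim N(B_1, \mathrm{var}(b_1))$, $b_2 \sim N(B_2, \mathrm{var}(b_2))$, $b_3 \sim N(B_3, \mathrm{var}(b_3))$, where
$\mathrm{var}(b_1) = \left[ \frac{1}{n} + \frac{\bar{X}_2^2 \sum x_{3t}^2 + \bar{X}_3^2 \sum x_{2t}^2 - 2 \bar{X}_2 \bar{X}_3 \sum x_{2t} x_{3t}}{(\sum x_{2t}^2)(\sum x_{3t}^2) - (\sum x_{2t} x_{3t})^2} \right] \sigma^2$ (7.23)
$\mathrm{se}(b_1) = \sqrt{\mathrm{var}(b_1)}$ (7.24)
$\mathrm{var}(b_2) = \frac{\sum x_{3t}^2}{(\sum x_{2t}^2)(\sum x_{3t}^2) - (\sum x_{2t} x_{3t})^2} \, \sigma^2$ (7.25)
$\mathrm{se}(b_2) = \sqrt{\mathrm{var}(b_2)}$ (7.26)
$\mathrm{var}(b_3) = \frac{\sum x_{2t}^2}{(\sum x_{2t}^2)(\sum x_{3t}^2) - (\sum x_{2t} x_{3t})^2} \, \sigma^2$ (7.27)
$\mathrm{se}(b_3) = \sqrt{\mathrm{var}(b_3)}$ (7.28)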
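2. In practice, $\sigma^2$ is unknown, so we use its estimator $\hat{\sigma}^2$. Then $b_1$, $b_2$, $b_3$, each standardized as $(b_j - B_j)/\mathrm{se}(b_j)$, follow the t distribution with $n - k$ d.f. (here $k = 3$).
$\hat{\sigma}^2 = \frac{\sum e_t^2}{n - 3}$ (7.29)
$\hat{\sigma} = \sqrt{\hat{\sigma}^2}$ (7.30)
7.3.3 Properties of OLS Estimators of Multiple Regression: BLUE (best linear unbiased estimators)
7.4 An Illustrative Example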
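As a small numerical illustration of (7.23) to (7.30), the sketch below estimates the three coefficients, plugs $\hat{\sigma}^2$ in for the unknown $\sigma^2$, and reports the standard errors. The data are made up (they are not the textbook's example), and the function name `ols_with_standard_errors` is hypothetical.

```python
import math

def ols_with_standard_errors(Y, X2, X3):
    """OLS estimates plus estimated variances, following (7.20)-(7.30)."""
    n = len(Y)
    y_bar, x2_bar, x3_bar = sum(Y) / n, sum(X2) / n, sum(X3) / n
    y = [v - y_bar for v in Y]
    x2 = [v - x2_bar for v in X2]
    x3 = [v - x3_bar for v in X3]
    s_yx2 = sum(a * b for a, b in zip(y, x2))
    s_yx3 = sum(a * b for a, b in zip(y, x3))
    s_22 = sum(a * a for a in x2)
    s_33 = sum(a * a for a in x3)
    s_23 = sum(a * b for a, b in zip(x2, x3))
    denom = s_22 * s_33 - s_23 ** 2

    b2 = (s_yx2 * s_33 - s_yx3 * s_23) / denom
    b3 = (s_yx3 * s_22 - s_yx2 * s_23) / denom
    b1 = y_bar - b2 * x2_bar - b3 * x3_bar

    # sigma^2 is unknown, so use its estimator (7.29): RSS / (n - 3).
    rss = sum((yt - b1 - b2 * a - b3 * b) ** 2 for yt, a, b in zip(Y, X2, X3))
    sigma2_hat = rss / (n - 3)

    var_b1 = (1 / n + (x2_bar ** 2 * s_33 + x3_bar ** 2 * s_22
                       - 2 * x2_bar * x3_bar * s_23) / denom) * sigma2_hat  # (7.23)
    var_b2 = s_33 / denom * sigma2_hat                                      # (7.25)
    var_b3 = s_22 / denom * sigma2_hat                                      # (7.27)
    return {
        "b": (b1, b2, b3),
        "sigma2_hat": sigma2_hat,
        "var": (var_b1, var_b2, var_b3),
        "se": tuple(math.sqrt(v) for v in (var_b1, var_b2, var_b3)),
    }

# Made-up data for illustration only.
X2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X3 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
Y = [10.0, 9.0, 17.0, 16.0, 30.0, 29.0]

fit = ols_with_standard_errors(Y, X2, X3)
print(fit["b"], fit["sigma2_hat"], fit["se"])
```

Note that the divisor in $\hat{\sigma}^2$ is $n - 3$ rather than $n$, because three parameters are estimated before the residuals can be formed.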
7.5 Goodness of Fit of Estimated Multiple Regression: Multiple Coefficient of Determination, $R^2$
• Multiple coefficient of determination, $R^2$:
TSS = ESS + RSS (7.33)
$R^2 = \frac{ESS}{TSS}$ (7.34)
$R^2 = \frac{b_2 \sum y_t x_{2t} + b_3 \sum y_t x_{3t}}{\sum y_t^2}$ (7.36)
$R^2$ also lies between 0 and 1 (just as $r^2$).
R: coefficient of multiple correlation, the degree of linear association between Y and all the X variables jointly. R is always taken to be positive (r can be positive or negative).
7.6 Hypothesis Testing: General Comments
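Before any testing is done, the two expressions for $R^2$ in 7.5, (7.34) and (7.36), can be checked against the decomposition (7.33). The sketch below is illustrative only: the data are made up and the function name `goodness_of_fit` is hypothetical.

```python
import math

def goodness_of_fit(Y, X2, X3):
    """Return (TSS, ESS, RSS, R^2) for the three-variable model."""
    n = len(Y)
    y_bar, x2_bar, x3_bar = sum(Y) / n, sum(X2) / n, sum(X3) / n
    y = [v - y_bar for v in Y]
    x2 = [v - x2_bar for v in X2]
    x3 = [v - x3_bar for v in X3]
    s_yx2 = sum(a * b for a, b in zip(y, x2))
    s_yx3 = sum(a * b for a, b in zip(y, x3))
    s_22 = sum(a * a for a in x2)
    s_33 = sum(a * a for a in x3)
    s_23 = sum(a * b for a, b in zip(x2, x3))
    denom = s_22 * s_33 - s_23 ** 2

    b2 = (s_yx2 * s_33 - s_yx3 * s_23) / denom
    b3 = (s_yx3 * s_22 - s_yx2 * s_23) / denom
    b1 = y_bar - b2 * x2_bar - b3 * x3_bar

    tss = sum(a * a for a in y)            # total sum of squares
    ess = b2 * s_yx2 + b3 * s_yx3          # numerator of (7.36)
    rss = sum((yt - b1 - b2 * a - b3 * b) ** 2 for yt, a, b in zip(Y, X2, X3))
    return tss, ess, rss, ess / tss        # R^2 as in (7.34)

# Made-up data for illustration only.
X2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X3 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
Y = [10.0, 9.0, 17.0, 16.0, 30.0, 29.0]

tss, ess, rss, r2 = goodness_of_fit(Y, X2, X3)
multiple_r = math.sqrt(r2)   # coefficient of multiple correlation, taken positive
print(tss, ess, rss, r2, multiple_r)
```

Here RSS is computed from the residuals rather than as TSS minus ESS, so the agreement of the two sides of (7.33) is a genuine check and not an identity built into the code.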
7.7 Individual Hypothesis Testing
$H_0: B_2 = 0$, or $H_0: B_3 = 0$
Hypothesis testing: t test with d.f. = $n - k$, where k is the number of parameters estimated (including the intercept).
• 7.7.1 The Test of Significance Approach
• 7.7.2 The Confidence Interval Approach
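Both approaches listed under 7.7 can be sketched in a few lines. The estimates, standard errors and sample size below are assumed, illustrative values, and the function name `t_test` is hypothetical. The critical value 3.182 is the tabulated two-tailed 5% value of the t distribution with 3 d.f.; it is supplied by hand here rather than computed.

```python
def t_test(b, se, t_crit, b_null=0.0):
    """Test H0: B = b_null using both approaches of section 7.7.

    Returns the t statistic, the reject/do-not-reject decision from the
    test of significance approach, and the confidence interval.
    """
    t_stat = (b - b_null) / se
    reject = abs(t_stat) > t_crit                 # test of significance
    ci = (b - t_crit * se, b + t_crit * se)       # confidence interval
    return t_stat, reject, ci

# Assumed, illustrative inputs (not taken from the text).
n, k = 6, 3
df = n - k          # degrees of freedom for the t test
T_CRIT = 3.182      # tabulated two-tailed 5% critical value for 3 d.f.

t2, reject_b2, ci_b2 = t_test(b=2.0, se=0.854, t_crit=T_CRIT)
t3, reject_b3, ci_b3 = t_test(b=3.0, se=0.854, t_crit=T_CRIT)

print("b2:", round(t2, 3), reject_b2, ci_b2)
print("b3:", round(t3, 3), reject_b3, ci_b3)
```

The two approaches always give the same decision: $H_0$ is rejected by the t statistic exactly when the hypothesized value (zero here) falls outside the confidence interval.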
7.8 Joint Hypothesis Testing: Testing the Joint Hypothesis That $B_2 = B_3 = 0$ or $R^2 = 0$
1. Null hypothesis:
$H_0: B_2 = B_3 = 0$ (7.45)
$H_0: R^2 = 0$ (7.46)
This means that the two explanatory variables together have no influence on Y; equivalently, the two explanatory variables explain zero percent of the variation in the dependent variable.
Why this test? In practice, in a multiple regression one or more variables may individually have no statistically significant effect on the dependent variable, yet collectively they have a significant impact on it.
2. How to test: analysis of variance (ANOVA), a study of the two components of TSS
TSS = ESS + RSS (7.33)
Under the assumptions of the CLRM:
(1) $H_0: B_2 = B_3 = 0$
(2) Get an F statistic:
$F = \frac{ESS/\mathrm{d.f.}}{RSS/\mathrm{d.f.}} = \frac{ESS/(k-1)}{RSS/(n-k)}$ (7.48)
$= \frac{\text{variance explained by } X_2 \text{ and } X_3}{\text{unexplained variance}}$
$= \frac{(b_2 \sum y_t x_{2t} + b_3 \sum y_t x_{3t})/2}{\sum e_t^2/(n-3)}$ (7.49)
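The F statistic in (7.48) is a ratio of two sums of squares, each divided by its degrees of freedom. A minimal sketch follows, with assumed, illustrative values for ESS, RSS and n, and a hypothetical function name `f_statistic`. The critical value 9.55 is the tabulated 5% value of the F distribution with (2, 3) d.f., supplied by hand. The second computation uses the equivalent form in terms of $R^2$, a standard algebraic identity that is not stated in the text above.

```python
def f_statistic(ess, rss, n, k):
    """F = [ESS/(k-1)] / [RSS/(n-k)], as in (7.48)."""
    return (ess / (k - 1)) / (rss / (n - k))

# Assumed, illustrative inputs (not taken from the text).
ESS, RSS = 401.5, 12.0
n, k = 6, 3            # six observations; three parameters including the intercept
F_CRIT = 9.55          # tabulated 5% critical value of F with (2, 3) d.f.

F = f_statistic(ESS, RSS, n, k)

# Equivalent form via R^2 = ESS/TSS, using TSS = ESS + RSS from (7.33).
r2 = ESS / (ESS + RSS)
F_from_r2 = (r2 / (k - 1)) / ((1 - r2) / (n - k))

reject_joint_null = F > F_CRIT   # reject H0: B2 = B3 = 0 (equivalently R^2 = 0)
print(F, F_from_r2, reject_joint_null)
```

Because F can be written in terms of $R^2$ alone, testing $B_2 = B_3 = 0$ and testing $R^2 = 0$ are the same test, which is why (7.45) and (7.46) are stated as one null hypothesis.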