Chapter 2 The Classical Multiple Linear Regression Model

2.1 Linear Regression Model

Notation:

$y_i$: dependent variable, regressand
$x_{i1}, \dots, x_{iK}$: independent variables, regressors
$i$: index for time, individuals, etc.

We want to explain $y_i$ using $x_{i1}, \dots, x_{iK}$. The multiple linear regression model takes the form

\[ y_i = \beta_1 x_{i1} + \cdots + \beta_K x_{iK} + \varepsilon_i . \]

Here $\varepsilon_i$ (random disturbance, error term) represents measurement error and omitted regressors. By convention, the index $i$ is used for cross-section data and $t$ for time series data.

Example 1 Earnings and education

\[ \text{earnings}_i = \beta_1 + \beta_2\,\text{education}_i + \varepsilon_i \]

$\text{earnings}_i$: the $i$-th individual's annual earnings
$\text{education}_i$: the $i$-th individual's number of years in school

Any problems in the model? It omits variables such as job experience, job experience squared, sex, marital status, etc. Including them gives

\[ \text{earnings}_i = \beta_1 + \beta_2\,\text{education}_i + \beta_3\,\text{job experience}_i + \beta_4\,\text{job experience}_i^2 + \beta_5\,\text{sex}_i + \beta_6\,\text{marital status}_i + \varepsilon_i \]

For sex and marital status, use dummy variables. That is,

\[ \text{sex}_i = \begin{cases} 1 & \text{if male} \\ 0 & \text{if female} \end{cases} \qquad \text{marital status}_i = \begin{cases} 1 & \text{if married} \\ 0 & \text{if single} \end{cases} \]

See, for example, Ashenfelter and Krueger, American Economic Review, 1974, 73-85.
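As a rough illustration of how the regressors in the extended earnings equation might be assembled in practice, the sketch below builds the matrix of regressors from a handful of records. The data values and the variable names (`education`, `experience`, `sex_dummy`, `married_dummy`) are hypothetical and chosen only for this example; the dummy coding follows the definitions given in the text.

```python
import numpy as np

# Hypothetical records for four individuals (illustrative values, not real data).
education = np.array([12.0, 16.0, 12.0, 18.0])   # years in school
experience = np.array([10.0, 5.0, 20.0, 2.0])    # years of job experience
sex = np.array(["male", "female", "female", "male"])
marital_status = np.array(["married", "single", "married", "single"])

# Dummy variables as defined in the text: 1 if male / married, 0 otherwise.
sex_dummy = (sex == "male").astype(float)
married_dummy = (marital_status == "married").astype(float)

# Columns follow the order beta_1, ..., beta_6 in the extended earnings equation.
X = np.column_stack([
    np.ones_like(education),  # constant term (beta_1)
    education,                # beta_2
    experience,               # beta_3
    experience ** 2,          # beta_4: job experience squared
    sex_dummy,                # beta_5
    married_dummy,            # beta_6
])

print(X.shape)  # (4, 6): four observations, six regressors
```

Each row of `X` holds the regressors for one individual, with a column of ones for the constant term.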
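Example 2 Class attendance and test scores

\[ \text{score}_i = \beta_1 + \beta_2\,(\text{fraction of lectures attended})_i + \beta_3\,(\text{fraction of problem sets completed})_i + \varepsilon_i \]

See Romer, Journal of Economic Perspectives, 1993.

2.2 Classical Assumptions

1. Linearity

\[ y_i = \beta_1 x_{i1} + \cdots + \beta_K x_{iK} + \varepsilon_i = x_i'\beta + \varepsilon_i \qquad (i = 1, \dots, n) \]

where

\[ x_i = \begin{pmatrix} x_{i1} \\ \vdots \\ x_{iK} \end{pmatrix}, \qquad \beta = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_K \end{pmatrix}, \]

or

\[ y = x_1 \beta_1 + \cdots + x_K \beta_K + \varepsilon \]

where $x_l$ now denotes the $l$-th column of regressors (all $n$ observations on regressor $l$):

\[ y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}, \qquad x_l = \begin{pmatrix} x_{1l} \\ \vdots \\ x_{nl} \end{pmatrix} \ (l = 1, \dots, K), \qquad \varepsilon = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix}, \]

or

\[ y = X\beta + \varepsilon \]

where

\[ X = \begin{pmatrix} x_{11} & \cdots & x_{1K} \\ \vdots & & \vdots \\ x_{n1} & \cdots & x_{nK} \end{pmatrix}, \qquad \beta = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_K \end{pmatrix}. \]

Loglinear model:

\[ \ln y = \beta_1 + \beta_2 \ln x_2 + \cdots + \beta_K \ln x_K + \varepsilon \]

\[ \frac{\partial \ln y}{\partial \ln x_k} = \beta_k \quad \text{(constant elasticity)} \]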
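To see what constant elasticity means numerically, the sketch below takes the deterministic part of a loglinear model with a single regressor and approximates the derivative of ln y with respect to ln x at several points. The coefficient values and the helper names `log_y` and `elasticity` are hypothetical, introduced only for this illustration.

```python
import numpy as np

# Hypothetical coefficients for ln y = beta_1 + beta_2 * ln x (disturbance set to 0).
beta_1, beta_2 = 0.5, 0.8

def log_y(x):
    """Deterministic part of the loglinear model, evaluated at x > 0."""
    return beta_1 + beta_2 * np.log(x)

def elasticity(x, rel_step=1e-6):
    """Finite-difference approximation of d ln y / d ln x at x."""
    x_new = x * (1.0 + rel_step)
    return (log_y(x_new) - log_y(x)) / (np.log(x_new) - np.log(x))

# The approximation stays at beta_2 regardless of where it is evaluated.
for x in (1.0, 10.0, 250.0):
    print(x, elasticity(x))
```

Whatever the level of x, a 1% change in x is associated with roughly a `beta_2`% change in y, which is the sense in which the elasticity is constant.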
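Semilog model:

\[ \ln y_t = X_t'\beta + \delta t + \varepsilon_t \]

The per-period growth rate of $y_t$ not explained by $X_t$ is

\[ \frac{d \ln y_t}{dt} = \delta . \]

2. Full rank

$X$ is an $n \times K$ matrix with rank $K$.

• The columns of $X$ are linearly independent.
• Obviously, we should have $n \ge K$.
• If this assumption is violated, $X$ contains redundant information.

What if this assumption is violated? Suppose that

\[ y = \beta_1 + \beta_2 X_1 + \beta_3 X_2 + \varepsilon \quad \text{and} \quad X_1 = \alpha X_2 . \]

Substituting gives $y = \beta_1 + (\alpha\beta_2 + \beta_3) X_2 + \varepsilon$, so only the combination $\alpha\beta_2 + \beta_3$ can be determined from the data, not $\beta_2$ and $\beta_3$ separately. Then, the ordinary least squares estimator does not exist. When $X$ is of deficient rank, we say that there is a (perfect) multicollinearity problem.

3. Zero conditional mean of the disturbance

\[ E(\varepsilon \mid X) = \begin{pmatrix} E(\varepsilon_1 \mid X) \\ \vdots \\ E(\varepsilon_n \mid X) \end{pmatrix} = 0 . \]

No observation on $X$ conveys information about the expected value of the disturbance. The assumption implies $E(\varepsilon_i) = 0$ and $Cov(\varepsilon_i, X) = 0$.

4. Spherical disturbances

\[ Var(\varepsilon_i \mid X) = \sigma^2 \quad \text{for all } i = 1, 2, \dots, n \]

\[ Cov(\varepsilon_i, \varepsilon_j \mid X) = 0 \quad \text{for all } i \neq j \]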
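These imply

\[ E(\varepsilon\varepsilon' \mid X) = \begin{pmatrix} E(\varepsilon_1^2 \mid X) & E(\varepsilon_1\varepsilon_2 \mid X) & \cdots & E(\varepsilon_1\varepsilon_n \mid X) \\ \vdots & \vdots & & \vdots \\ E(\varepsilon_n\varepsilon_1 \mid X) & E(\varepsilon_n\varepsilon_2 \mid X) & \cdots & E(\varepsilon_n^2 \mid X) \end{pmatrix} = \sigma^2 I \]

and, because $E(\varepsilon \mid X) = 0$, also unconditionally $Var[\varepsilon] = \sigma^2 I$. The assumption of common variance for $\varepsilon_i$ is called homoskedasticity.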
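Returning to the full-rank assumption (assumption 2), the following sketch illustrates numerically what goes wrong when X1 = αX2. It is only an illustration under assumed values: the sample size, the value of `alpha`, the simulated regressor, and the two coefficient vectors `beta_a` and `beta_b` are hypothetical and not taken from the notes.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
n, alpha = 50, 2.0              # hypothetical sample size and proportionality factor

x2 = rng.normal(size=n)
x1 = alpha * x2                 # exact linear dependence: X1 = alpha * X2

# Matrix of regressors with a constant column, X1 and X2 (so K = 3).
X = np.column_stack([np.ones(n), x1, x2])

print(np.linalg.matrix_rank(X))  # 2, which is less than K = 3

# Two different coefficient vectors with the same value of alpha*beta_2 + beta_3
# produce identical values of X @ beta, so the data cannot tell them apart.
beta_a = np.array([1.0, 0.5, 1.0])  # alpha*0.5 + 1.0 = 2.0
beta_b = np.array([1.0, 0.0, 2.0])  # alpha*0.0 + 2.0 = 2.0
print(np.allclose(X @ beta_a, X @ beta_b))  # True
```

Because distinct coefficient vectors fit the data equally well, no unique least squares solution can exist in this case, which is the multicollinearity problem described under assumption 2.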