Economics 20 - Prof. Anderson

Multiple Regression Analysis
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$
7. Specification and Data Problems

Functional Form
• We've seen that a linear regression can really fit nonlinear relationships
  - Can use logs on the RHS, LHS, or both
  - Can use quadratic forms of the x's
  - Can use interactions of the x's
• How do we know if we've gotten the right functional form for our model?

Functional Form (continued)
• First, use economic theory to guide you
• Think about the interpretation
• Does it make more sense for x to affect y in percentage terms (use logs) or in absolute terms?
• Does it make more sense for the derivative of y with respect to $x_1$ to vary with $x_1$ (quadratic), to vary with $x_2$ (interactions), or to be fixed?

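To make the last question concrete, the marginal effect of $x_1$ under each functional form can be written out. The particular specifications and coefficient labels below are illustrative assumptions, not models taken from the slides.

```latex
% Quadratic in x_1: the marginal effect varies with x_1
y = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + u
\quad\Rightarrow\quad
\frac{\partial y}{\partial x_1} = \beta_1 + 2\beta_2 x_1

% Interaction with x_2: the marginal effect varies with x_2
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + u
\quad\Rightarrow\quad
\frac{\partial y}{\partial x_1} = \beta_1 + \beta_3 x_2

% Linear in x_1: the marginal effect is fixed
y = \beta_0 + \beta_1 x_1 + u
\quad\Rightarrow\quad
\frac{\partial y}{\partial x_1} = \beta_1

% Log of y: the effect is (approximately) in percentage terms
\ln(y) = \beta_0 + \beta_1 x_1 + u
\quad\Rightarrow\quad
\%\Delta y \approx 100\,\beta_1\,\Delta x_1
```
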
Functional Form (continued)
• We already know how to test joint exclusion restrictions to see if higher-order terms or interactions belong in the model
• It can be tedious to add and test extra terms, and we may find that a squared term matters when using logs would actually be even better
• A test of functional form is Ramsey's regression specification error test (RESET)

Ramsey's RESET
• RESET relies on a trick similar to the special form of the White test
• Instead of adding functions of the x's directly, we add and test functions of $\hat{y}$
• So, estimate $y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + \delta_1 \hat{y}^2 + \delta_2 \hat{y}^3 + \text{error}$ and test $H_0\colon \delta_1 = 0,\ \delta_2 = 0$ using $F \sim F_{2,\,n-k-3}$ or $LM \sim \chi^2_2$

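The RESET steps can be sketched in a few lines of NumPy. This is a minimal illustration on simulated data, not code from the course: the helper names `ols` and `reset_f_stat`, the data-generating processes, and the seed are all assumptions made for the example. It computes only the F statistic, which would then be compared with an $F_{2,\,n-k-3}$ critical value (roughly 3.0 at the 5% level when n is large).

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients, fitted values, and the sum of squared residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    ssr = float(np.sum((y - fitted) ** 2))
    return beta, fitted, ssr

def reset_f_stat(X, y):
    """RESET F statistic for H0: delta1 = delta2 = 0.

    X must already contain a column of ones, so X has k + 1 columns.
    """
    n, p = X.shape
    _, fitted, ssr_restricted = ols(X, y)
    # Augment the original regressors with the squared and cubed fitted values.
    X_aug = np.column_stack([X, fitted ** 2, fitted ** 3])
    _, _, ssr_unrestricted = ols(X_aug, y)
    df_denominator = n - p - 2  # equals n - k - 3
    return ((ssr_restricted - ssr_unrestricted) / 2) / (ssr_unrestricted / df_denominator)

# Simulated data (hypothetical): one outcome is truly linear, one has an omitted square.
rng = np.random.default_rng(0)
n = 500
x1 = rng.uniform(1, 5, n)
x2 = rng.normal(size=n)
u = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])

y_linear = 1 + 2 * x1 - x2 + u
y_quadratic = 1 + 2 * x1 + 1.5 * x1 ** 2 - x2 + u

f_linear = reset_f_stat(X, y_linear)        # linear model is correctly specified
f_quadratic = reset_f_stat(X, y_quadratic)  # linear model omits x1 squared
print(f"RESET F, correct model:      {f_linear:.2f}")
print(f"RESET F, misspecified model: {f_quadratic:.2f}")
```

A large F for the misspecified model signals a functional form problem, but note that RESET does not say which alternative form to use.
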
Nonnested Alternative Tests
• If the models have the same dependent variable but nonnested x's, we could still just make a giant model with the x's from both and test the joint exclusion restrictions that lead to one model or the other
• An alternative, the Davidson-MacKinnon test, uses $\hat{y}$ from one model as a regressor in the second model and tests for its significance

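A rough sketch of the Davidson-MacKinnon idea follows, again on simulated data. The two competing models (levels of the x's versus logs of the x's), the helper `ols`, and all parameter values are assumptions chosen for illustration. Each model is tested by adding the other model's fitted values and looking at the t statistic on that added regressor.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients, t statistics, and fitted values (X includes a constant)."""
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    fitted = X @ beta
    resid = y - fitted
    sigma2 = resid @ resid / (n - p)
    t_stats = beta / np.sqrt(sigma2 * np.diag(xtx_inv))
    return beta, t_stats, fitted

# Simulated data (hypothetical): the true model is linear in the logs of the x's.
rng = np.random.default_rng(1)
n = 500
x1 = rng.uniform(1, 10, n)
x2 = rng.uniform(1, 10, n)
y = 1 + 2 * np.log(x1) + 3 * np.log(x2) + rng.normal(scale=0.5, size=n)

X_levels = np.column_stack([np.ones(n), x1, x2])               # model A: levels
X_logs = np.column_stack([np.ones(n), np.log(x1), np.log(x2)])  # model B: logs

_, _, yhat_levels = ols(X_levels, y)
_, _, yhat_logs = ols(X_logs, y)

# Test model A: add model B's fitted values; a significant t rejects model A.
_, t_a, _ = ols(np.column_stack([X_levels, yhat_logs]), y)
t_levels_model = t_a[-1]

# Test model B: add model A's fitted values; a significant t rejects model B.
_, t_b, _ = ols(np.column_stack([X_logs, yhat_levels]), y)
t_logs_model = t_b[-1]

print(f"t on yhat from logs model, added to levels model: {t_levels_model:.2f}")
print(f"t on yhat from levels model, added to logs model: {t_logs_model:.2f}")
```

In this setup the levels model should be rejected decisively. The test of the logs model can still come out either way in any one sample, so the two tests need not single out one specification.
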
Nonnested Alternatives (cont)
• More difficult if one model uses $y$ and the other uses $\ln(y)$
• Can follow the same basic logic and transform the predicted $\ln(y)$ to get $\hat{y}$ for the second step
• In any case, the Davidson-MacKinnon test may reject neither or both models rather than clearly preferring one specification

Proxy Variables
• What if the model is misspecified because no data are available on an important x variable?
• It may be possible to avoid omitted variable bias by using a proxy variable
• A proxy variable must be related to the unobservable variable, for example: $x_3^* = \delta_0 + \delta_3 x_3 + v_3$, where * denotes an unobserved variable
• Now suppose we just substitute $x_3$ for $x_3^*$

Proxy Variables (continued)
• What do we need for this solution to give us consistent estimates of $\beta_1$ and $\beta_2$?
• $E(x_3^* \mid x_1, x_2, x_3) = E(x_3^* \mid x_3) = \delta_0 + \delta_3 x_3$
• That is, $u$ is uncorrelated with $x_1$, $x_2$ and $x_3^*$, and $v_3$ is uncorrelated with $x_1$, $x_2$ and $x_3$
• So we are really running $y = (\beta_0 + \beta_3\delta_0) + \beta_1 x_1 + \beta_2 x_2 + \beta_3\delta_3 x_3 + (u + \beta_3 v_3)$ and have just redefined the intercept, the error term, and the $x_3$ coefficient

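The substitution behind the last bullet can be written out step by step. The structural equation in the first line is implied by the slides rather than stated on them, so treat it as an assumption of this sketch.

```latex
% Structural model with the unobserved variable x_3^*
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3^* + u

% Proxy relationship
x_3^* = \delta_0 + \delta_3 x_3 + v_3

% Substitute the proxy relationship into the structural model
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3(\delta_0 + \delta_3 x_3 + v_3) + u

% Collect terms: new intercept, new x_3 coefficient, new composite error
y = (\beta_0 + \beta_3\delta_0) + \beta_1 x_1 + \beta_2 x_2 + \beta_3\delta_3 x_3 + (u + \beta_3 v_3)
```

The coefficients on $x_1$ and $x_2$ are unchanged, which is why OLS on the observed variables can estimate them consistently when the composite error $u + \beta_3 v_3$ is uncorrelated with $x_1$, $x_2$ and $x_3$.
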
Proxy Variables (continued)
• Without our assumptions, we can end up with biased estimates
• Say $x_3^* = \delta_0 + \delta_1 x_1 + \delta_2 x_2 + \delta_3 x_3 + v_3$
• Then we are really running $y = (\beta_0 + \beta_3\delta_0) + (\beta_1 + \beta_3\delta_1) x_1 + (\beta_2 + \beta_3\delta_2) x_2 + \beta_3\delta_3 x_3 + (u + \beta_3 v_3)$
• Bias will depend on the signs of $\beta_3$ and $\delta_j$
• This bias may still be smaller than the omitted variable bias, though

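A small simulation can illustrate both proxy-variable cases. Everything here is a hypothetical setup: the parameter values, the function names `ols_coefs` and `simulate`, and the decision to drop $x_2$ for brevity are assumptions of the sketch. The true $\beta_1$ is 0.5 and $\beta_3$ is 1. With `delta1 = 0` the proxy assumption holds. With `delta1 = 0.3` it fails, and the proxy regression's $x_1$ coefficient should converge to $\beta_1 + \beta_3\delta_1 = 0.8$.

```python
import numpy as np

def ols_coefs(y, *regressors):
    """OLS coefficients from regressing y on a constant plus the given regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def simulate(delta1, n=50_000, seed=0):
    """Return the estimated x1 coefficient when x3* is omitted and when x3 proxies for it."""
    rng = np.random.default_rng(seed)
    x3 = rng.normal(size=n)              # observed proxy
    x1 = 0.8 * x3 + rng.normal(size=n)   # x1 is correlated with the proxy
    v3 = 0.5 * rng.normal(size=n)        # proxy error, independent of x1 and x3
    x3_star = 0.5 + delta1 * x1 + 1.0 * x3 + v3  # unobserved variable
    y = 1.0 + 0.5 * x1 + 1.0 * x3_star + rng.normal(size=n)
    b1_omit = ols_coefs(y, x1)[1]        # drop x3* entirely
    b1_proxy = ols_coefs(y, x1, x3)[1]   # plug in the proxy x3
    return b1_omit, b1_proxy

# Case 1: proxy assumption holds (x3* depends on x1 only through x3).
omit_ok, proxy_ok = simulate(delta1=0.0)
# Case 2: proxy assumption fails (x3* still depends on x1 given x3).
omit_bad, proxy_bad = simulate(delta1=0.3)

print(f"true beta1 = 0.5")
print(f"assumption holds: omit x3* -> {omit_ok:.3f}, use proxy -> {proxy_ok:.3f}")
print(f"assumption fails: omit x3* -> {omit_bad:.3f}, use proxy -> {proxy_bad:.3f}")
```

In this design the proxy estimate is still biased in the second case, but by less than the estimate that omits $x_3^*$ altogether.
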