Chapter 13 Model Selection: Criteria and Tests
One CLRM assumption is: The model used in empirical analysis is “correctly specified”.
“Correct specification” of a model means:
(1) No theoretically relevant variable has been excluded from the model.
(2) No unnecessary or irrelevant variables are included in the model.
(3) The functional form of the model is correct.
13.1 The Attributes of a Good Model
Criteria to judge a model:
1. Principle of parsimony: A model should be kept as simple as possible.
2. Identifiability: For a given set of data, the estimated parameters must have unique values.
3. Goodness of fit: The higher the adjusted R2 (R̄2), the better the model is judged to be (see the sketch after this list).
4. Theoretical consistency: In constructing a model we should have some theoretical underpinning.
5. Predictive power: Choose the model whose theoretical predictions are borne out by actual experience.
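As a quick illustration of the goodness-of-fit criterion, the sketch below computes R2 and adjusted R2 = 1 - (1 - R2)(n - 1)/(n - k) for an OLS fit. The simulated data, variable names, and parameter values are illustrative assumptions, not taken from the text.

import numpy as np

def r2_and_adjusted_r2(y, X):
    # X is the n x k regressor matrix, including the constant column.
    n, k = X.shape
    b, *_ = np.linalg.lstsq(X, y, rcond=None)    # OLS coefficients
    resid = y - X @ b
    rss = resid @ resid                          # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)            # total sum of squares
    r2 = 1.0 - rss / tss
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k)
    return r2, adj_r2

# Illustrative data (assumed): two regressors plus a constant.
rng = np.random.default_rng(0)
x2 = rng.normal(size=50)
x3 = rng.normal(size=50)
y = 1.0 + 0.5 * x2 + 0.3 * x3 + rng.normal(scale=0.5, size=50)
X = np.column_stack([np.ones(50), x2, x3])
print(r2_and_adjusted_r2(y, X))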
13.2 Types of Specification Errors
1. Omitting a Relevant Variable: “Underfitting” or “Underspecifying” a Model
True model: Yt = B1 + B2X2t + B3X3t + ut   (13.1)
Misspecified model: Yt = A1 + A2X2t + vt   (13.2)
The consequences of omitting a relevant variable (X3):
(1) If X2 and X3 are correlated:
a1 and a2 are biased, and the bias can be upward or downward:
E(a1) ≠ B1;  E(a1) = B1 + B3(X̄3 - b32X̄2)   (13.4)
E(a2) ≠ B2;  E(a2) = B2 + B3b32
where b32 is the slope coefficient in the regression of X3 on X2, and X̄2, X̄3 are the sample means.
a1 and a2 are also inconsistent.
(2) If X2 and X3 are not correlated:
b32 is zero, so a2 is unbiased and consistent; a1 remains biased unless X̄3 is zero in (13.4).
(3) The error variance estimated from the misspecified model is a biased estimator of the true error variance σ².
The conventionally estimated variance of a2 is a biased estimator of the variance of the true estimator b2:
E[var(a2)] = var(b2) + B3²Σx3i² / [(n - 2)Σx2i²]
where the lowercase x’s denote deviations from sample means. Hence var(a2) will overestimate the true variance of b2; that is, it has a positive bias.
(4) The usual confidence-interval and hypothesis-testing procedures are unreliable; the confidence intervals will be wider.
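The bias formula in (13.4) can be checked numerically. The Monte Carlo sketch below uses assumed values B1 = 1, B2 = 0.5, B3 = 2 and an assumed auxiliary slope of 0.8 (none of these numbers come from the text): the average slope of the misspecified regression (13.2) lands near B2 + B3·b32 rather than B2. Regenerating X3 independently of X2 drives b32, and hence the bias, to zero, matching case (2) above.

import numpy as np

rng = np.random.default_rng(42)
B1, B2, B3 = 1.0, 0.5, 2.0       # assumed true parameters of model (13.1)
n, reps = 200, 2000

# X3 is built to be correlated with X2; 0.8 is the assumed auxiliary slope.
x2 = rng.normal(size=n)
x3 = 0.8 * x2 + rng.normal(scale=0.5, size=n)
b32 = np.polyfit(x2, x3, 1)[0]   # estimated slope of the regression of X3 on X2

a2_draws = []
for _ in range(reps):
    u = rng.normal(size=n)
    y = B1 + B2 * x2 + B3 * x3 + u      # data generated from the true model (13.1)
    a2 = np.polyfit(x2, y, 1)[0]        # slope of the misspecified model (13.2)
    a2_draws.append(a2)

print("average a2:   ", np.mean(a2_draws))
print("B2 + B3 * b32:", B2 + B3 * b32)  # the bias predicted by (13.4)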
2. Inclusion of Irrelevant Variables: “Overfitting” a Model
Inclusion of irrelevant variables will certainly increase R2, which might increase the predictive power of the model.
True model: Yi = B1 + B2X2i + ui   (13.9)
Misspecified model: Yi = A1 + A2X2i + A3X3i + vi   (13.10)
Consequences of including irrelevant variables in a model:
(1) The OLS estimators are unbiased and consistent: E(a1) = B1, E(a2) = B2, E(a3) = 0.
(2) The error variance σ² is still correctly estimated.
(3) The standard confidence-interval and hypothesis-testing procedures based on the t and F tests remain valid.
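For comparison, a parallel sketch (with assumed values B1 = 1, B2 = 0.5, and an X3 that never enters the true model) illustrates consequence (1): across repeated samples the averages of a1, a2 and a3 from the overfitted regression (13.10) settle near B1, B2 and 0.

import numpy as np

rng = np.random.default_rng(7)
B1, B2 = 1.0, 0.5                 # assumed true parameters of model (13.9)
n, reps = 200, 2000

x2 = rng.normal(size=n)
x3 = rng.normal(size=n)           # irrelevant variable: absent from the true model
X = np.column_stack([np.ones(n), x2, x3])   # overfitted design of model (13.10)

coefs = []
for _ in range(reps):
    y = B1 + B2 * x2 + rng.normal(size=n)   # data from the true model (13.9)
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    coefs.append(a)

a1_bar, a2_bar, a3_bar = np.mean(coefs, axis=0)
print(f"average a1 = {a1_bar:.3f}, a2 = {a2_bar:.3f}, a3 = {a3_bar:.3f}")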