Chapter 6 The Two-Variable Model: Hypothesis Testing
The Object of Hypothesis Testing

To answer:
◼ How "good" is the estimated regression line?
◼ How can we be sure that the estimated regression function (i.e., the SRF) is in fact a good estimator of the true PRF?

Yi = B1 + B2Xi + μi
Xi: nonstochastic
μi: stochastic
Yi: stochastic

◼ Before we can tell how good an SRF is as an estimate of the true PRF, we should assume how the stochastic μ terms are generated.
6.1 The Classical Linear Regression Model (CLRM)

CLRM assumptions:
◼ A6.1. The explanatory variable(s) X is uncorrelated with the disturbance term μ.
◼ A6.2. Zero mean value assumption: the expected, or mean, value of the disturbance term μ is zero.
  E(μi) = 0    (6.1)
◼ A6.3. Homoscedasticity assumption: the variance of each μi is constant, or homoscedastic.
  var(μi) = σ²    (6.2)
◼ A6.4. No autocorrelation assumption: there is no correlation between two error terms.
  cov(μi, μj) = 0, i ≠ j    (6.3)
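To make these assumptions concrete, here is a minimal simulation sketch of a data-generating process consistent with A6.1 through A6.4. The parameter values (B1 = 3, B2 = 0.5, σ = 2) and the X values are invented for illustration, not taken from the text:

```python
# A sketch: one way to simulate data consistent with the CLRM assumptions.
# Parameter values and X's below are hypothetical.
import numpy as np

rng = np.random.default_rng(42)

B1, B2, sigma = 3.0, 0.5, 2.0            # hypothetical "true" PRF parameters
X = np.arange(10.0, 60.0, 5.0)           # nonstochastic X, fixed in repeated samples
u = rng.normal(0.0, sigma, size=X.size)  # E(u)=0 (A6.2), constant variance (A6.3),
                                         # independent draws, so no autocorrelation (A6.4)
Y = B1 + B2 * X + u                      # Y inherits its randomness from u
```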
6.2 Variances and Standard Errors of Ordinary Least Squares (OLS) Estimators

◼ Study the sampling variability of the OLS estimators.
◼ The variances and standard errors of the OLS estimators, where $x_i = X_i - \bar{X}$:

$\operatorname{var}(b_1) = \dfrac{\sum X_i^2}{n \sum x_i^2}\,\sigma^2$    (6.4)
$\operatorname{se}(b_1) = \sqrt{\operatorname{var}(b_1)}$    (6.5)
$\operatorname{var}(b_2) = \dfrac{\sigma^2}{\sum x_i^2}$    (6.6)
$\operatorname{se}(b_2) = \sqrt{\operatorname{var}(b_2)}$    (6.7)

◼ $\hat{\sigma}^2$ is an estimator of $\sigma^2$:
$\hat{\sigma}^2 = \dfrac{\sum e_i^2}{n-2}$    (6.8)
$\hat{\sigma} = \sqrt{\hat{\sigma}^2}$    (6.9)

$\sum e_i^2 = \mathrm{RSS}$ (residual sum of squares) $= \sum (Y_i - \hat{Y}_i)^2$
n − 2 …… degrees of freedom
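As a sketch of how these formulas are applied, the following computes b1, b2, σ̂², and both standard errors for a small hypothetical sample (the numbers are invented for illustration):

```python
import numpy as np

# hypothetical sample
X = np.array([10., 15., 20., 25., 30., 35., 40., 45., 50., 55.])
Y = np.array([ 8., 11., 12., 15., 17., 20., 23., 25., 26., 30.])
n = X.size

x = X - X.mean()                                  # deviations from the mean
b2 = (x * (Y - Y.mean())).sum() / (x**2).sum()    # OLS slope estimator
b1 = Y.mean() - b2 * X.mean()                     # OLS intercept estimator

e = Y - (b1 + b2 * X)                             # residuals
sigma2_hat = (e**2).sum() / (n - 2)               # eq. (6.8): RSS / degrees of freedom

var_b1 = (X**2).sum() / (n * (x**2).sum()) * sigma2_hat  # eq. (6.4), sigma^2 replaced by its estimate
var_b2 = sigma2_hat / (x**2).sum()                       # eq. (6.6)
se_b1, se_b2 = np.sqrt(var_b1), np.sqrt(var_b2)          # eqs. (6.5) and (6.7)
print(b1, b2, se_b1, se_b2)
```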
6.3 The Properties of OLS Estimators

Why Ordinary Least Squares (OLS)? The OLS method is widely used because it has some very strong theoretical properties, which are summarized in the Gauss-Markov theorem:

◼ Gauss-Markov theorem: given the assumptions of the classical linear regression model, the OLS estimators, in the class of unbiased linear estimators, have minimum variance; that is, they are BLUE (best linear unbiased estimators).

That is, the OLS estimators b1 and b2 are:
◼ 1. Linear: they are linear functions of the random variable Y.
◼ 2. Unbiased: E(b1) = B1, E(b2) = B2, E(σ̂²) = σ².
◼ 3. Minimum variance: among all linear unbiased estimators, they have the smallest variance.
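Unbiasedness can be illustrated by simulation: draw many samples from the same PRF (with hypothetical parameters) and average the resulting slope estimates, which should settle near the true B2. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
B1, B2, sigma = 3.0, 0.5, 2.0        # hypothetical "true" parameters
X = np.arange(10.0, 60.0, 5.0)       # fixed in repeated samples
x = X - X.mean()

b2_draws = []
for _ in range(10_000):              # repeated samples with the same fixed X
    Y = B1 + B2 * X + rng.normal(0.0, sigma, X.size)
    # sum(x_i) = 0, so sum(x_i * Y_i) equals sum(x_i * (Y_i - Ybar))
    b2_draws.append((x * Y).sum() / (x**2).sum())

print(np.mean(b2_draws))             # close to B2 = 0.5, illustrating E(b2) = B2
```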
6.4 The Sampling, or Probability, Distributions of OLS Estimators

◼ 1. One more assumption of the CLRM is needed:
A6.5. In the PRF Yi = B1 + B2Xi + μi, the error term μi follows the normal distribution with mean zero and variance σ². That is, μi ~ N(0, σ²)    (6.17)

◼ Central limit theorem: if there is a large number of independent and identically distributed random variables, then, with a few exceptions, the distribution of their sum tends to a normal distribution as the number of such variables increases indefinitely.

◼ 2. b1 and b2 follow the normal distribution.
∵ μ follows the normal distribution, and b1 and b2 are linear functions of the normally distributed variable μ,
∴ b1 and b2 are normally distributed:

$b_1 \sim N(B_1, \sigma_{b_1}^2)$    (6.18), where $\sigma_{b_1}^2 = \operatorname{var}(b_1) = \dfrac{\sum X_i^2}{n \sum x_i^2}\,\sigma^2$    (6.4)
$b_2 \sim N(B_2, \sigma_{b_2}^2)$    (6.19), where $\sigma_{b_2}^2 = \operatorname{var}(b_2) = \dfrac{\sigma^2}{\sum x_i^2}$    (6.6)
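A sketch of how (6.19) can be checked by simulation: across repeated samples, the standard deviation of b2 should match σ/√(Σx_i²), and about 95% of the standardized draws should fall within ±1.96 if b2 is normal. Parameter values are again hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
B1, B2, sigma = 3.0, 0.5, 2.0
X = np.arange(10.0, 60.0, 5.0)
x = X - X.mean()

# slope estimate from each of 10,000 simulated samples
b2_draws = np.array([
    (x * (B1 + B2 * X + rng.normal(0.0, sigma, X.size))).sum() / (x**2).sum()
    for _ in range(10_000)
])

sd_theory = sigma / np.sqrt((x**2).sum())  # sd of b2 implied by eq. (6.6)
print(b2_draws.std(), sd_theory)           # empirical sd matches the formula
z = (b2_draws - B2) / sd_theory            # standardized draws: approx N(0, 1)
print(np.mean(np.abs(z) < 1.96))           # close to 0.95 under normality
```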
6.5 Hypothesis Testing

◼ 1. The confidence interval approach
◼ 2. The test of significance approach

◼ 1. t statistic
If $\sigma^2$ is known: $Z = \dfrac{b_2 - B_2}{\operatorname{se}(b_2)} = \dfrac{b_2 - B_2}{\sigma/\sqrt{\sum x_i^2}} \sim N(0,1)$
If $\sigma^2$ is unknown, we can estimate $\sigma^2$ by using $\hat{\sigma}^2$:
$t = \dfrac{b_2 - B_2}{\operatorname{se}(b_2)} = \dfrac{b_2 - B_2}{\hat{\sigma}/\sqrt{\sum x_i^2}} \sim t_{n-2}$    (6.21)

◼ 2. The Confidence Interval Approach
(1) H0: B2 = 0, H1: B2 ≠ 0
(2) Establish a 100(1 − α)% confidence interval for B2:
$P(-t_{\alpha/2} \le t \le t_{\alpha/2}) = 1 - \alpha$
$P\!\left(-t_{\alpha/2} \le \dfrac{b_2 - B_2}{\hat{\sigma}/\sqrt{\sum x_i^2}} \le t_{\alpha/2}\right) = 1 - \alpha$    (6.24)
$P\!\left(b_2 - t_{\alpha/2}\dfrac{\hat{\sigma}}{\sqrt{\sum x_i^2}} \le B_2 \le b_2 + t_{\alpha/2}\dfrac{\hat{\sigma}}{\sqrt{\sum x_i^2}}\right) = 1 - \alpha$    (6.25)
$P\left(b_2 - t_{\alpha/2}\operatorname{se}(b_2) \le B_2 \le b_2 + t_{\alpha/2}\operatorname{se}(b_2)\right) = 1 - \alpha$    (6.26)

(3) Decision:
· If this interval (i.e., the acceptance region) includes the null-hypothesized value of B2, we do not reject the null hypothesis.
· If it lies outside the confidence interval (i.e., it lies in the rejection region), we reject the null hypothesis.

A Cautionary Note: p. 166
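A minimal sketch of the confidence interval approach on the hypothetical sample used earlier: equation (6.26) gives the interval, scipy's t distribution supplies the critical value, and the decision follows step (3):

```python
import numpy as np
from scipy import stats

# hypothetical sample (same invented numbers as the earlier sketch)
X = np.array([10., 15., 20., 25., 30., 35., 40., 45., 50., 55.])
Y = np.array([ 8., 11., 12., 15., 17., 20., 23., 25., 26., 30.])
n = X.size
x = X - X.mean()
b2 = (x * Y).sum() / (x**2).sum()
b1 = Y.mean() - b2 * X.mean()
e = Y - (b1 + b2 * X)
se_b2 = np.sqrt((e**2).sum() / (n - 2) / (x**2).sum())   # eqs. (6.8), (6.6), (6.7)

alpha = 0.05
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)            # t_{alpha/2} with n-2 df
lower, upper = b2 - t_crit * se_b2, b2 + t_crit * se_b2  # eq. (6.26)
print(lower, upper)
# decision rule: reject H0: B2 = 0 if 0 lies outside the interval
print("reject H0" if not (lower <= 0.0 <= upper) else "do not reject H0")
```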
◼ 3. The Test of Significance Approach

〈1〉 t test
(1) A two-tailed test. Assume H0: B2 = 0 and H1: B2 ≠ 0.
Set the t statistic: $t = \dfrac{b_2 - B_2^*}{\operatorname{se}(b_2)}$    (6.29), where $B_2^*$ is the value of B2 under H0.
· Check the t table to get the critical t value; if |t| > critical t value, reject H0.
· Check the p value of the t statistic; if p < level of significance, reject H0.
(2) A one-tailed test. H0: B2 = 0 and H1: B2 < 0.
The test procedure is the same as for the two-tailed test, except that $t_{\alpha/2} \to t_{\alpha}$.
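A sketch of the test-of-significance approach on the same hypothetical data: the t statistic of (6.29) under H0: B2 = 0, its two-tailed p value, and the one-tailed variant for H1: B2 < 0:

```python
import numpy as np
from scipy import stats

# same invented sample as the previous sketch
X = np.array([10., 15., 20., 25., 30., 35., 40., 45., 50., 55.])
Y = np.array([ 8., 11., 12., 15., 17., 20., 23., 25., 26., 30.])
n = X.size
x = X - X.mean()
b2 = (x * Y).sum() / (x**2).sum()
b1 = Y.mean() - b2 * X.mean()
e = Y - (b1 + b2 * X)
se_b2 = np.sqrt((e**2).sum() / (n - 2) / (x**2).sum())

t_stat = (b2 - 0.0) / se_b2                      # eq. (6.29) with B2* = 0 under H0
p_two = 2 * stats.t.sf(abs(t_stat), df=n - 2)    # two-tailed p value
p_one = stats.t.cdf(t_stat, df=n - 2)            # one-tailed p value for H1: B2 < 0
print(t_stat, p_two, p_one)                      # reject H0 if p < chosen significance level
```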
◼ 〈2〉 χ² test
· H0: $\sigma^2 = \sigma_0^2$
· Get the χ² statistic: $\chi^2 = \dfrac{(n-2)\hat{\sigma}^2}{\sigma_0^2} \sim \chi^2_{(n-2)}$    (6.31)
· Check the χ² table to find the critical χ² value; if χ² > critical χ² value, reject H0.
· Find the p value of the χ² statistic; if the p value > level of significance, do not reject H0.
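A sketch of the χ² test of (6.31) with invented numbers: compare the statistic (n − 2)σ̂²/σ0² with the upper critical value of the χ²(n − 2) distribution:

```python
from scipy import stats

n, df = 10, 8          # hypothetical sample size; df = n - 2
sigma2_hat = 1.21      # hypothetical estimate of sigma^2 from a fitted regression
sigma2_0 = 0.81        # hypothesized value under H0: sigma^2 = sigma0^2

chi2_stat = df * sigma2_hat / sigma2_0     # eq. (6.31)
chi2_crit = stats.chi2.ppf(0.95, df=df)    # critical value at the 5% level
p_value = stats.chi2.sf(chi2_stat, df=df)  # upper-tail p value
print(chi2_stat, chi2_crit, p_value)
# reject H0 if chi2_stat > chi2_crit (equivalently, if p_value < 0.05)
```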