第十二章有序分类资料的统计分析的 Stata实现 本章使用的 STATA命令: 列变量有序时的分类资料CMH卡方分析 partch行变量 weight, by(列变量) (见 Stata7附加程序) 双向有序时的 Spearman相关 spearman变量1变量2 例12-2某研究欲观察人参的镇静作用,选取32只同批次的小白鼠,将其 中20只随机分配到人参组:以5%人参浸液对其做腹腔注射,12只分配到对照 组:以等量蒸馏水对其做同样注射。实验结果如表12-2所示。能否说明人参有 镇静作用? 表12-2人参镇静作用的实验结果 镇静等 人参组 4 11 2 1.建立检验假设,确定检验水准 Ho:人参没有镇静作用(样本来自两个相同总体) H1:人参有镇静作用(样本来自两个不同总体) a=005 Stata数据为: b1234512345 2 11 0 Stata命令为
第十二章有序分类资料的统计分析的 Stata 实现 本章使用的 STATA 命令: 列变量有序时的分类资料 CMH 卡方分析 opartchi 行变量 [weight], by(列变量) (见 Stata7 附加程序) 双向有序时的 Spearman 相关 spearman 变量 1 变量 2 例 12-2 某研究欲观察人参的镇静作用,选取 32 只同批次的小白鼠,将其 中 20 只随机分配到人参组:以 5%人参浸液对其做腹腔注射,12 只分配到对照 组:以等量蒸馏水对其做同样注射。实验结果如表 12-2 所示。能否说明人参有 镇静作用? 表 12-2 人参镇静作用的实验结果 镇静等级 人参组 对照组 - 4 11 ± 1 0 + 2 1 ++ 1 0 +++ 12 0 1.建立检验假设,确定检验水准。 H0 :人参没有镇静作用(样本来自两个相同总体) H1 :人参有镇静作用(样本来自两个不同总体) = 0.05 Stata 数据为: a b x 1 1 4 1 2 1 1 3 2 1 4 1 1 5 12 2 1 11 2 2 0 2 3 1 2 4 0 2 5 0 Stata 命令为:
opartchi b [weight=x), by(a) 结果为 Chi-square test Chi-square p-value Independence 4 16.640.0023 Components of independence test Location 15.290.0001 Dispersion 34960.5543 在a=005的水平上,拒绝H0,接受H,认为两总体之间的差别有统计学 意义,可以认为人参组和对照组镇静等级的差别有统计学意义,人参有镇静作用。 例12-3试根据表124的资料,检验针刺不同穴位的镇痛效果有无差别? 表12-4针刺不同穴位的镇痛效果 穴位 镇痛效果 + +++ 38 足三里 16 扶突 33 1.建立检验假设,确定检验水准 Ho:三个穴位的镇痛效果相同 H1:三个穴位的镇痛效果不全相同 Stata数据为: group effect W34 341234 8424398673 19
opartchi b [weight=x],by(a) 结果为: Chi-square tests df Chi-square P-value Independence 4 16.64 0.0023 ------------------------------------------------------- Components of independence test Location 1 15.29 0.0001 Dispersion 1 .3496 0.5543 在 = 0.05 的水平上,拒绝 H0 ,接受 H1,认为两总体之间的差别有统计学 意义,可以认为人参组和对照组镇静等级的差别有统计学意义,人参有镇静作用。 例 12-3 试根据表 12-4 的资料,检验针刺不同穴位的镇痛效果有无差别? 表 12-4 针刺不同穴位的镇痛效果 穴位 镇痛效果 + ++ +++ ++++ 合谷 38 44 12 24 足三里 53 29 28 16 扶突 47 23 19 33 1.建立检验假设,确定检验水准。 H0 :三个穴位的镇痛效果相同 H1 :三个穴位的镇痛效果不全相同 = 0.05 Stata 数据为: group effect w 1 1 38 1 2 44 1 3 12 1 4 24 2 1 53 2 2 29 2 3 28 2 4 16 3 1 47 3 2 23 3 3 19
3 33 Stata命令为: opartchi effect weight=w, by group) 结果为: df Chi-square p-value Independence 6 22.070.0012 Components of independence test Location 2 3.1180.2103 Dispersion 5.9140.0520 有P>005。在a=005的水平上尚不能拒绝H,即根据本例资料尚不能认 为针刺不同穴位的镇痛效果差别有统计学意义 例12-4两名放射科医师对13张肺部X线片各自做出评定,评定方法是将 Ⅹ线片按病情严重程度给出等级,结果如表12-6所示。问他们的评定结果是否 相关 表12-6两名放射科医师对13张肺部X片的评定结果 甲医师X+++ ++++++++++ 乙医师Y±+++ +++++++++++++±++++ 1.建立检验假设,确定检验水准。 H0:p,=0(两名医师的等级评定结果不相关 H1:p,≠0(两名医师的等级评定结果相关) 0.05 Stata数据为: 1 0
3 4 33 Stata 命令为: opartchi effect [weight=w],by(group) 结果为: Chi-square tests df Chi-square P-value Independence 6 22.07 0.0012 ------------------------------------------------------- Components of independence test Location 2 3.118 0.2103 Dispersion 2 5.914 0.0520 有 P 0.05 。在 = 0.05 的水平上尚不能拒绝 H0 ,即根据本例资料尚不能认 为针刺不同穴位的镇痛效果差别有统计学意义。 例 12-4 两名放射科医师对 13 张肺部 X 线片各自做出评定,评定方法是将 X 线片按病情严重程度给出等级,结果如表 12-6 所示。问他们的评定结果是否 相关。 表12-6 两名放射科医师对13张肺部X片的评定结果 X片编号 1 2 3 4 5 6 7 8 9 10 11 12 13 甲医师X + ++ - - + ++ +++ ++ +++ - ++ + 乙医师Y ++ + + - ++ +++ ++ +++ +++ ++ ++ 1.建立检验假设,确定检验水准。 H0 : s =0(两名医师的等级评定结果不相关) H1: s 0 (两名医师的等级评定结果相关) = 0.05 Stata 数据为: i x y 1 2 1 2 3 3 3 0 2 4 1 2 5 0 0
6789 34344 10 2343403 133 13 Stata命令为 结果为: Number of obs Spearman s rho 0.8082 Test of Ho: y and x are independent Prob>t= 0.0008 P<0.05。在a=005水平上,拒绝H,接受H,认为两名射科医师对肺部Ⅹ 片的等级评定结果有正相关关系。 例12-5300名抑郁症患者按其抑郁程度和自杀意向的轻重程度的分类数据 如表12-8所示。问自杀意向的轻重程度和抑郁程度之间是否存在线性变化趋势? 表12-8300名抑郁症患者的分类数据 抑郁程度(Y) 自杀意向(X) 轻度 中等 严重 合计 (=2) (=3) 无 (X=1) 135 73 14 22 想要自杀X=2) 10 曾自杀过(X=3) 合计 300 1、建立检验假设,确定检验水准。 Ho:自杀意向与抑郁症之间不存在线性变化趋势 H1:自杀意向与抑郁症之间存在线性变化趋势 C=0.05 Stata数据为
6 2 3 7 3 4 8 4 3 9 3 4 10 4 4 11 0 1 12 3 3 13 2 3 Stata 命令为: spearman y x 结果为: Number of obs = 13 Spearman's rho = 0.8082 Test of Ho: y and x are independent Prob > |t| = 0.0008 P<0.05。在 = 0.05 水平上,拒绝 H0,接受 H1,认为两名射科医师对肺部 X 片的等级评定结果有正相关关系。 例12-5 300名抑郁症患者按其抑郁程度和自杀意向的轻重程度的分类数据 如表12-8所示。问自杀意向的轻重程度和抑郁程度之间是否存在线性变化趋势? 表12-8 300名抑郁症患者的分类数据 自杀意向(X) 抑郁程度(Y) 轻度 中等 严重 合计 (Y=1) (Y=2) (Y=3) 无 (X=1) 135 73 14 222 想要自杀 (X=2) 5 9 10 24 曾自杀过 (X=3) 8 23 23 54 合计 148 105 47 300 1、建立检验假设,确定检验水准。 H0 :自杀意向与抑郁症之间不存在线性变化趋势 H1 :自杀意向与抑郁症之间存在线性变化趋势 = 0.05 Stata 数据为: y x w
22333 1231231 23 14 23 Stata命令和结果为*: corr y x [fw=w] (obs=300) 11.0000 0.46421.0000 return list scalars r(N)=300 r(rho)=.4641844307421609 display (r(n)-1)*r (rho)2 64.424689 在a=005水平上,拒绝H0,接受H,可以推断自杀意向的轻重程度与抑 郁症之间存在线性变化趋势 *参考文献 Does Stata provide a test for trend? A comparison of different tests for trend Author William Sribney, StataCorp Date Let me make a bunch of comments comparing SAS PROC FREQ, Pearson's correlation, Patrick Royston's ptrend command, linear regression, logit/probit regression, Stata's vwls command, and Stata,s ptrend Tests for trend in 2 xr tables
1 1 135 1 2 5 1 3 8 2 1 73 2 2 9 2 3 23 3 1 14 3 2 10 3 3 23 Stata 命令和结果为*: . corr y x [fw=w] (obs=300) | y x -------------+------------------ y | 1.0000 x | 0.4642 1.0000 . return list scalars: r(N) = 300 r(rho) = .4641844307421609 . display (r(N)-1)*r(rho)^2 64.424689 在 = 0.05 水平上,拒绝 H0,接受 H1,可以推断自杀意向的轻重程度与抑 郁症之间存在线性变化趋势。 *参考文献: Does Stata provide a test for trend? Title A comparison of different tests for trend Author William Sribney, StataCorp Date March 1996 Let me make a bunch of comments comparing SAS PROC FREQ, Pearson’s correlation, Patrick Royston’s ptrend command, linear regression, logit/probit regression, Stata’s vwls command, and Stata’s nptrend command. Tests for trend in 2 x r tables
Let me use Les Kalish s example: Outcome group et ter Best yi|a_1-=1|a_2=2 3=3 y1=0|193167 n11 n13 22|n23 n n+3 PROC FREQ When I used SAS in grad school to analyze these data, we used PROC FREQ DATA=. TABLES GROUP*OUTCOME/ CHISQ CMH SCORES=TABLE The test statistic was shown on the output as Value Prob Mantel-Haenszel Chi-square 4.515 0.034 The test statistic is not a Mantel-Haenszel-at least not according to what I learned a Mantel-Haenszel statistic is(from Gary Koch at UNC--note that any errors here, I should add, are those of this student, not of this great researcher/teacher
Let me use Les Kalish’s example: Outcome ---------------------------- group | Good | Better | Best y_i | a_1=1| a_2=2 | a_3=3 ------+--------+--------+--------- | | | y_1=0| 19 | 31 | 67 | n_11 | n_12 | n_13 | | | y_2=1| 1 | 5 | 21 | n_21 | n_22 | n_23 | | | ------+--------+--------+--------- 20 36 88 n_+1 n_+2 n_+3 PROC FREQ When I used SAS in grad school to analyze these data, we used PR0C FREQ DATA=... TABLES GROUP*OUTCOME / CHISQ CMH SCORES=TABLE The test statistic was shown on the output as DF Value Prob --- ----- ----- Mantel-Haenszel Chi-Square 1 4.515 0.034 The test statistic is not a Mantel–Haenszel—at least not according to what I learned a Mantel–Haenszel statistic is (from Gary Koch at UNC—note that any errors here, I should add, are those of this student, not of this great researcher/teacher)
Dr. Koch called this chi-squared statistic Qs, where s stands for score. Chi-squared statistic for trend Qs Let me express Qs in terms of a simpler statistic, T T=(sum over group i)(sum over outcome j) ni *k aj *k yi The aj are scores; here 1, 2, 3, but there can be other choices for the scores(Ill get to this later) Under the null hypothesis there is no association between group and outcome, so we can consider the permutation(i.e, randomization) distribution of T That is, we fix the margins of the table, just as we do for Fisher's exact test, and then consider all the possible permutations that give these same marginal counts. Under this null hypothesis permutation distribution, it is easy to see that the mean of T is E(T)=N*a bar *k y bar where a bar is the weighted average of aj(using the marginal counts n ij) a bar =(sum over j)n+ *k a/N Similarly, y bar is a weighted average of yi. The variance of T, under the permutation distribution, is(exactly) V(T)=(N-1)*S2*Sy2 where Sa is the stand ard deviation squared for aj Sa =(1/(N-1))*(sum over j)n+*(a,-a bar) We can compute a chi-squared statistic Qs =(T-E(T))/ V(T) f you look at the formula for Qs, you see something interesting. It is simply Qs=(N-1)*r2
Dr. Koch called this chi-squared statistic Qs, where s stands for score. Chi-squared statistic for trend Qs Let me express Qs in terms of a simpler statistic, T: T = (sum over group i)(sum over outcome j) nij * aj * yi The aj are scores; here 1, 2, 3, but there can be other choices for the scores (I’ll get to this later). Under the null hypothesis there is no association between group and outcome, so we can consider the permutation (i.e., randomization) distribution of T. That is, we fix the margins of the table, just as we do for Fisher’s exact test, and then consider all the possible permutatio ns that give these same marginal counts. Under this null hypothesis permutation distribution, it is easy to see that the mean of T is E(T) = N * a_bar * y_bar where a_bar is the weighted average of aj (using the marginal counts n +j): a_bar = (sum over j) n+j * aj / N Similarly, y_bar is a weighted average of yi. The variance of T, under the permutation distribution, is (exactly) V(T) = (N - 1) * Sa 2 * Sy 2 where Sa 2 is the standard deviation squared for aj: Sa 2 = (1/(N-1)) * (sum over j) n+j * (aj - a_bar)2 We can compute a chi-squared statistic: Qs = (T - E(T))2 / V(T) If you look at the formula for Qs, you see something interesting. It is simply Qs = (N - 1) * ray 2
where ray is Pearsons correlation coefficient for a and y Just Pearson's correlation This "test of trend"is nothing more than Pearsons correlation coefficient. Let' s try this list Iy a weight 1.|0 2.|02 3.|03 4.|11 21 corry a [fw=weight] (obs=14) y a 1.0000 0.17771.0000 return list
where ray is Pearson’s correlation coefficient for a and y. Just Pearson’s correlation This “test of trend” is nothing more than Pearson’s correlation coefficient. Let’s try this. . list +----------------+ | y a weight | |----------------| 1. | 0 1 19 | 2. | 0 2 31 | 3. | 0 3 67 | 4. | 1 1 1 | 5. | 1 2 5 | |----------------| 6. | 1 3 21 | +----------------+ . corr y a [fw=weight] (obs=144) | y a -------------+------------------ y | 1.0000 a | 0.1777 1.0000 . return list
scalars r(rho)=.1776868721791401 display (r(N-1)*r(rho)2 4.5148853 PROC FREQ gave chi-squared =4.515 Royston's ptrend and the Cochran-Armitage test Let's now use Patrick Royston's ptrend command [note: Patrick posted his ptrend command on Statalist]. The data must look like the following for this command list a nl 1.|1191 2.|2315 36721 ptrend n1 n2 a prop 19 0.950 0.861 0.761 Trend analysis for proportions
scalars: r(N) = 144 r(rho) = .1776868721791401 . display (r(N)-1)*r(rho)^2 4.5148853 PROC FREQ gave chi-squared = 4.515. Royston’s ptrend and the Cochran–Armitage test Let’s now use Patrick Royston’s ptrend command [note: Patrick posted his ptrend command on Statalist]. The data must look like the following for this command: . list +-------------+ | a n1 n2 | |-------------| 1. | 1 19 1 | 2. | 2 31 5 | 3. | 3 67 21 | +-------------+ . ptrend n1 n2 a n1 n2 _prop a 1. 19 1 0.950 1 2. 31 5 0.861 2 3. 67 21 0.761 3 Trend analysis for proportions
Regression of p nl/(n1+n2)on a Slope=-.09553,std. error=.0448,Z=2.132 Overall chi2(2) 4.551,pr>chi2=0.1027 Chi2 (1)for trend 4.546,pr>chi2=0.0330 Chi2(1) for departure 0.004, pr>chi2 =0. 9467 The"Chi2(1)for trend" is slightly different It's 4.546 rather than 4.515 Well, ptrend is just using N rather thann -1 in the formula Qtrend Chi2(1) for trend =N*r Let's go back to data arranged for the corr computation and show thi quietly corr y a [fw=weight] display r(N)*r(rho)2 4.5464579 slope=0 is given by the Qtrend statistic Well, we all know the relationship between Pearson,s correlation and regression-this is all this is Qdeparture("Chi2 (1)for departure"as Royston's output nicely labels it) is the statistic for the Cochran-Armitage test. But Trend and Dep arture are usually performed at the same time, so lumping them together under the name"Cochran-Armitage"is sometimes loosely done. The null hypothesis for the Cochran-Armitage test is that the trend is linear, and the test is for"departures"from linearity; i.e., it's simply a goodness -of-fit test for the linear model Qs(or equivalently Qtrend ) tests the null hypothesis of no associ ation. Since it's just a Pearson,s correlation we know that it's powerful ag ainst alternative hypotheses of monotonic trend but it's not at all powerful against curvilinear (or other) associations with a0 linear component. Model it Rich Goldstein recommended log istic regression Regression is certainly a better context to understand what you are doing -rather than all these chi-squared tests that are simply Pearson s correlations or goodness -of-fit
Regression of p = n1/(n1+n2) on a: Slope = -.09553, std. error = .0448, Z = 2.132 Overall chi2(2) = 4.551, pr>chi2 = 0.1027 Chi2(1) for trend = 4.546, pr>chi2 = 0.0330 Chi2(1) for departure = 0.004, pr>chi2 = 0.9467 The “Chi2(1) for trend” is slightly different. It’s 4.546 rather than 4.515. Well, ptrend is just using N rather than N − 1 in the formula: Qtrend = Chi2(1) for trend = N * ray 2 Let’s go back to data arranged for the corr computation and show this. . quietly corr y a [fw=weight] . display r(N)*r(rho)^2 4.5464579 Qtrend is just Pearson’s correlation again. A regression is performed here to compute the slope, and the test of slope = 0 is given by the Qtrend statistic. Well, we all know the relationship between Pearson’s correlation and regression—this is all this is. Qdeparture (="Chi2(1) for departure" as Royston’s output nicely labels it) is the statistic for the Cochran–Armitage test. But Qtrend and Qdeparture are usually performed at the same time, so lumping them together under the name “Cochran–Armitage” is sometimes loosely done. The null hypothesis for the Cochran–Armitage test is that the trend is linear, and the test is for “departures” from linearity; i.e., it’s simply a goodness-of-fit test for the linear model. Qs (or equivalently Qtrend) tests the null hypothesis of no association. Since it’s just a Pearson’s correlation, we know that it’s powerful against alternative hypotheses of monotonic trend, but it’s not at all powerful against curvilinear (or other) associations with a 0 linear component. Model it Rich Goldstein recommended logistic regression. Regression is certainly a better context to understand what you are doing—rather than all these chi-squared tests that are simply Pearson’s correlations or goodness -of-fit