Chapter 14. Statistical Description of Data

#include <math.h>

void kstwo(float data1[], unsigned long n1, float data2[], unsigned long n2,
    float *d, float *prob)
/* Given an array data1[1..n1], and an array data2[1..n2], this routine returns the
K-S statistic d, and the significance level prob for the null hypothesis that the
data sets are drawn from the same distribution. The arrays are sorted in place. */
{
    float probks(float alam);
    void sort(unsigned long n, float arr[]);
    unsigned long j1=1,j2=1;
    float d1,d2,dt,en1,en2,en,fn1=0.0,fn2=0.0;

    sort(n1,data1);
    sort(n2,data2);
    en1=n1;
    en2=n2;
    *d=0.0;
    while (j1 <= n1 && j2 <= n2) {                        /* If we are not done... */
        if ((d1=data1[j1]) <= (d2=data2[j2])) fn1=j1++/en1;  /* Next step is in data1. */
        if (d2 <= d1) fn2=j2++/en2;                       /* Next step is in data2. */
        if ((dt=fabs(fn2-fn1)) > *d) *d=dt;
    }
    en=sqrt(en1*en2/(en1+en2));
    *prob=probks((en+0.12+0.11/en)*(*d));                 /* Compute significance. */
}
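As a quick illustration (ours, not the book's), the driver below exercises kstwo on two small made-up samples. It supplies a minimal stand-in for the book's sort routine (Chapter 8) built on the C library's qsort, and assumes the probks listing given below is compiled in; the sample values and the shim are purely hypothetical.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

float probks(float alam);
void kstwo(float data1[], unsigned long n1, float data2[], unsigned long n2,
    float *d, float *prob);

static int cmpf(const void *a, const void *b)
{
    float fa = *(const float *)a, fb = *(const float *)b;
    return (fa > fb) - (fa < fb);
}

void sort(unsigned long n, float arr[])
/* Minimal stand-in for the book's sort routine (Chapter 8): ascending sort of
the unit-offset array arr[1..n], delegated to the C library's qsort. */
{
    qsort(&arr[1], n, sizeof(float), cmpf);
}

int main(void)
{
    /* Unit-offset arrays as used throughout the book: element [0] is unused.
       The sample values are made up purely for illustration. */
    float a[11] = {0.0f, 0.10f,0.25f,0.33f,0.47f,0.52f,0.61f,0.70f,0.81f,0.90f,0.95f};
    float b[11] = {0.0f, 0.05f,0.12f,0.21f,0.30f,0.38f,0.44f,0.53f,0.62f,0.74f,0.88f};
    float d,prob;

    kstwo(a,10,b,10,&d,&prob);
    printf("K-S statistic D = %6.4f, significance = %6.4f\n",d,prob);
    return 0;
}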
Both of the above routines use the following routine for calculating the function Q_KS:

#include <math.h>
#define EPS1 0.001
#define EPS2 1.0e-8

float probks(float alam)
/* Kolmogorov-Smirnov probability function. */
{
    int j;
    float a2,fac=2.0,sum=0.0,term,termbf=0.0;

    a2 = -2.0*alam*alam;
    for (j=1;j<=100;j++) {
        term=fac*exp(a2*j*j);
        sum += term;
        if (fabs(term) <= EPS1*termbf || fabs(term) <= EPS2*sum) return sum;
        fac = -fac;                 /* Alternating signs in sum. */
        termbf=fabs(term);
    }
    return 1.0;                     /* Get here only by failing to converge. */
}

Variants on the K–S Test

The sensitivity of the K–S test to deviations from a cumulative distribution function P(x)
is not independent of x. In fact, the K–S test tends to be most sensitive around the median
value, where P(x) = 0.5, and less sensitive at the extreme ends of the distribution, where
P(x) is near 0 or 1. The reason is that the difference |S_N(x) - P(x)| does not, in the null
hypothesis, have a probability distribution that is independent of x. Rather, its variance is
proportional to P(x)[1 - P(x)], which is largest at P = 0.5. Since the K–S statistic (14.3.5)
is the maximum difference over all x of two cumulative distribution functions, a deviation
that might be statistically significant at its own value of x gets compared to the expected
chance deviation at P = 0.5, and is thus discounted. A result is that, while the K–S test is
good at finding shifts in a probability distribution, especially changes in the median value,
it is not always so good at finding spreads, which more affect the tails of the probability
distribution, and which may leave the median unchanged.

One way of increasing the power of the K–S statistic out on the tails is to replace D
(equation 14.3.5) by a so-called stabilized or weighted statistic [2-4], for example the
Anderson-Darling statistic,

    D^* = \max_{-\infty < x < \infty} \frac{|S_N(x) - P(x)|}{\sqrt{P(x)\,[1 - P(x)]}}        (14.3.11)
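As a concrete sketch of equation (14.3.11) (ours, not the book's), the routine below evaluates the weighted statistic for a single sample against a known cumulative distribution P, supplied as a function pointer. The name dstar, the zero-offset arrays, and the uniform test CDF are our own conventions. The discrepancy is checked just below and just above each data point, which is where the unweighted difference jumps; strictly, the weighted supremum between data points need not fall exactly at a sample, so this is a practical approximation. Points where P(x) is 0 or 1 are skipped to avoid dividing by zero.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

static int cmpflt(const void *a, const void *b)
{
    float fa = *(const float *)a, fb = *(const float *)b;
    return (fa > fb) - (fa < fb);
}

float dstar(float x[], int n, float (*P)(float))
/* Weighted (Anderson-Darling-type) statistic of equation (14.3.11), evaluated
at the data points of the zero-offset sample x[0..n-1]. Illustrative sketch. */
{
    int i;
    float dst=0.0;

    qsort(x, n, sizeof(float), cmpflt);
    for (i=0; i<n; i++) {
        float p = (*P)(x[i]);
        float w = p*(1.0f-p);
        float lo = fabs(p - i/(float)n);            /* S_N just below x[i] */
        float hi = fabs((i+1)/(float)n - p);        /* S_N at and above x[i] */
        float dt;
        if (w <= 0.0f) continue;                    /* skip P = 0 or 1 */
        dt = (lo > hi ? lo : hi)/sqrt(w);
        if (dt > dst) dst = dt;
    }
    return dst;
}

static float unifcdf(float x)                       /* P(x) for Uniform(0,1) */
{
    if (x <= 0.0f) return 0.0f;
    if (x >= 1.0f) return 1.0f;
    return x;
}

int main(void)
{
    float x[8] = {0.91f,0.04f,0.63f,0.78f,0.12f,0.55f,0.34f,0.97f};
    printf("D* = %6.4f\n", dstar(x, 8, unifcdf));
    return 0;
}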