14.6 Nonparametric or Rank Correlation

Source: teaching reference material for the course "Digital Signal Processing" (Numerical Recipes in C: The Art of Scientific Computing, Second Edition), Chapter 14.6


Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5). Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copying of machine-readable files (including this one) to any server computer, is strictly prohibited.
To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America).

        sxy += xt*yt;
    }
    *r=sxy/(sqrt(sxx*syy)+TINY);
    *z=0.5*log((1.0+(*r)+TINY)/(1.0-(*r)+TINY));        /* Fisher's z transformation. */
    df=n-2;
    t=(*r)*sqrt(df/((1.0-(*r)+TINY)*(1.0+(*r)+TINY)));  /* Equation (14.5.5). */
    *prob=betai(0.5*df,0.5,df/(df+t*t));                /* Student's t probability. */
    /* *prob=erfcc(fabs((*z)*sqrt(n-1.0))/1.4142136) */
    /* For large n, this easier computation of prob, using the short routine
       erfcc, would give approximately the same value. */
}

CITED REFERENCES AND FURTHER READING:

Dunn, O.J., and Clark, V.A. 1974, Applied Statistics: Analysis of Variance and Regression (New York: Wiley).
Hoel, P.G. 1971, Introduction to Mathematical Statistics, 4th ed. (New York: Wiley), Chapter 7.
von Mises, R. 1964, Mathematical Theory of Probability and Statistics (New York: Academic Press), Chapters IX(A) and IX(B).
Korn, G.A., and Korn, T.M. 1968, Mathematical Handbook for Scientists and Engineers, 2nd ed. (New York: McGraw-Hill), §19.7.
Norusis, M.J. 1982, SPSS Introductory Guide: Basic Statistics and Operations; and 1985, SPSS-X Advanced Statistics Guide (New York: McGraw-Hill).

14.6 Nonparametric or Rank Correlation

It is precisely the uncertainty in interpreting the significance of the linear correlation coefficient $r$ that leads us to the important concepts of nonparametric or rank correlation. As before, we are given $N$ pairs of measurements $(x_i, y_i)$. Before, difficulties arose because we did not necessarily know the probability distribution function from which the $x_i$'s or $y_i$'s were drawn.
The key concept of nonparametric correlation is this: If we replace the value of each $x_i$ by the value of its rank among all the other $x_i$'s in the sample, that is, $1, 2, 3, \ldots, N$, then the resulting list of numbers will be drawn from a perfectly known distribution function, namely uniformly from the integers between 1 and $N$, inclusive. Better than uniformly, in fact, since if the $x_i$'s are all distinct, then each integer will occur precisely once. If some of the $x_i$'s have identical values, it is conventional to assign to all these "ties" the mean of the ranks that they would have had if their values had been slightly different. This midrank will sometimes be an integer, sometimes a half-integer. In all cases the sum of all assigned ranks will be the same as the sum of the integers from 1 to $N$, namely $\frac{1}{2}N(N+1)$.

Of course we do exactly the same procedure for the $y_i$'s, replacing each value by its rank among the other $y_i$'s in the sample.

Now we are free to invent statistics for detecting correlation between uniform sets of integers between 1 and $N$, keeping in mind the possibility of ties in the ranks. There is, of course, some loss of information in replacing the original numbers by ranks. We could construct some rather artificial examples where a correlation could be detected parametrically (e.g., in the linear correlation coefficient $r$), but could not
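The midrank convention described above can be sketched in C as follows. This is a minimal illustration and not the book's own routine: the names `midranks`, `Keyed`, and `compare_keyed` are hypothetical, and the sketch sorts a copy of the data with the standard library's `qsort` rather than with any sorting routine from the book.

```c
#include <assert.h>
#include <stdlib.h>

/* Pairs a value with its original position so ranks can be written back. */
typedef struct {
    double value;
    size_t index;
} Keyed;

/* qsort comparator: ascending order of value. */
static int compare_keyed(const void *a, const void *b)
{
    double va = ((const Keyed *)a)->value;
    double vb = ((const Keyed *)b)->value;
    return (va > vb) - (va < vb);
}

/* Writes the midrank of each x[i] into rank[i].
   Returns 0 on success, -1 if memory cannot be allocated. */
int midranks(const double *x, double *rank, size_t n)
{
    Keyed *sorted;
    size_t i, j, k;
    double mid;

    assert(x != NULL && rank != NULL && n > 0);
    sorted = malloc(n * sizeof *sorted);
    if (sorted == NULL) return -1;
    for (i = 0; i < n; i++) {
        sorted[i].value = x[i];
        sorted[i].index = i;
    }
    qsort(sorted, n, sizeof *sorted, compare_keyed);

    i = 0;
    while (i < n) {
        /* Find the run [i, j) of tied values in sorted order. */
        j = i + 1;
        while (j < n && sorted[j].value == sorted[i].value) j++;
        /* The run would occupy ranks i+1 .. j; their mean is (i + 1 + j) / 2. */
        mid = 0.5 * (double)(i + 1 + j);
        for (k = i; k < j; k++) rank[sorted[k].index] = mid;
        i = j;
    }
    free(sorted);
    return 0;
}
```

For the sample {10, 20, 20, 30, 20}, the three tied values each receive rank 3 (the mean of 2, 3, and 4), giving ranks {1, 3, 3, 5, 3}, whose sum is 15, equal to $\frac{1}{2}N(N+1)$ for $N = 5$. A tie between two adjacent ranks, as in {3, 1, 3, 2}, yields the half-integer midrank 3.5.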

