Chapter 15. Modeling of Data

estimate of the model parameters is obtained by minimizing the quantity

$$
\chi^2 \equiv \sum_{i=1}^{N}\left(\frac{y_i - y(x_i;\,a_1 \ldots a_M)}{\sigma_i}\right)^2
\qquad (15.1.5)
$$

called the “chi-square.”

To whatever extent the measurement errors actually are normally distributed, the quantity χ² is correspondingly a sum of N squares of normally distributed quantities, each normalized to unit variance. Once we have adjusted the a1 … aM to minimize the value of χ², the terms in the sum are not all statistically independent. For models that are linear in the a’s, however, it turns out that the probability distribution for different values of χ² at its minimum can nevertheless be derived analytically, and is the chi-square distribution for N − M degrees of freedom. We learned how to compute this probability function using the incomplete gamma function gammq in §6.2. In particular, equation (6.2.18) gives the probability Q that the chi-square should exceed a particular value χ² by chance, where ν = N − M is the number of degrees of freedom. The quantity Q, or its complement P ≡ 1 − Q, is frequently tabulated in appendices to statistics books, but we generally find it easier to use gammq and compute our own values: Q = gammq(0.5ν, 0.5χ²). It is quite common, and usually not too wrong, to assume that the chi-square distribution holds even for models that are not strictly linear in the a’s.

This computed probability gives a quantitative measure for the goodness-of-fit of the model. If Q is a very small probability for some particular data set, then the apparent discrepancies are unlikely to be chance fluctuations. Much more probably either (i) the model is wrong and can be statistically rejected, or (ii) someone has lied to you about the size of the measurement errors σi; they are really larger than stated.

It is an important point that the chi-square probability Q does not directly measure the credibility of the assumption that the measurement errors are normally distributed. It assumes they are. In most, but not all, cases, however, the effect of nonnormal errors is to create an abundance of outlier points. These decrease the probability Q, so that we can add another possible, though less definitive, conclusion to the above list: (iii) the measurement errors may not be normally distributed.

Possibility (iii) is fairly common, and also fairly benign. It is for this reason that reasonable experimenters are often rather tolerant of low probabilities Q. It is not uncommon to deem acceptable on equal terms any models with, say, Q > 0.001. This is not as sloppy as it sounds: Truly wrong models will often be rejected with vastly smaller values of Q, 10⁻¹⁸, say. However, if day-in and day-out you find yourself accepting models with Q ∼ 10⁻³, you really should track down the cause.
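Putting the last few paragraphs together in code, the following sketch accumulates equation (15.1.5) over a set of measurements and converts it to the goodness-of-fit probability Q. It assumes the incomplete gamma function routine gammq of §6.2; the function name chisq_prob and the arrays ymod[] (model values evaluated at the xi) are hypothetical placeholders for illustration, not routines from the book.

```c
/* A minimal sketch: goodness-of-fit probability Q for a chi-square
   fit, assuming the Numerical Recipes routine gammq(a,x) of §6.2.
   The arrays y[], ymod[], sig[] and the function name are
   illustrative placeholders. */

float gammq(float a, float x);   /* incomplete gamma function Q(a,x), §6.2 */

float chisq_prob(float y[], float ymod[], float sig[], int ndata, int mfit)
{
    float chisq = 0.0;
    int i;
    for (i = 0; i < ndata; i++) {            /* accumulate equation (15.1.5) */
        float t = (y[i] - ymod[i]) / sig[i];
        chisq += t * t;
    }
    /* nu = N - M degrees of freedom; equation (6.2.18): Q = gammq(nu/2, chisq/2) */
    return gammq(0.5f * (ndata - mfit), 0.5f * chisq);
}
```

A returned value below a chosen threshold, say Q < 0.001, would then flag the fit for the scrutiny described above.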
If you happen to know the actual distribution law of your measurement errors, then you might wish to Monte Carlo simulate some data sets drawn from a particular model, cf. §7.2–§7.3. You can then subject these synthetic data sets to your actual fitting procedure, so as to determine both the probability distribution of the χ² statistic, and also the accuracy with which your model parameters are reproduced by the fit (a sketch of this idea appears at the end of this section). We discuss this further in §15.6. The technique is very general, but it can also be very expensive.

At the opposite extreme, it sometimes happens that the probability Q is too large, too near to 1, literally too good to be true! Nonnormal measurement errors cannot in general produce this disease, since the normal distribution is about as “compact” as a distribution can be.
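Returning to the Monte Carlo idea above, here is a minimal sketch of one way it might be organized: draw synthetic data sets from a known model with normal errors, push each through the fitting procedure, and record the resulting chi-square values. It assumes the Numerical Recipes routine gasdev of §7.2 for normal deviates; the model function f() and the fitting routine fit_model() are hypothetical placeholders standing in for your actual model and fitting procedure.

```c
/* A sketch of the Monte Carlo procedure of §7.2-§7.3, under stated
   assumptions: gasdev(&idum) is the NR normal-deviate routine of
   §7.2; f() and fit_model() are user-supplied placeholders, with
   fit_model() returning the chi-square of its fit. */

#define NDATA 100     /* points per synthetic data set (illustrative) */
#define NSETS 1000    /* number of synthetic data sets (illustrative) */
#define MA    5       /* number of model parameters (illustrative)   */

float gasdev(long *idum);                    /* Gaussian deviate, §7.2 */
float f(float x, float a[]);                 /* hypothetical "true" model */
float fit_model(float x[], float y[], float sig[],
                int ndata, float afit[]);    /* hypothetical fit; returns chi-square */

void monte_carlo(float x[], float sig[], float atrue[], float chisq[])
{
    float y[NDATA], afit[MA];
    long idum = -13;                         /* negative seed initializes gasdev */
    int i, k;
    for (k = 0; k < NSETS; k++) {
        for (i = 0; i < NDATA; i++)          /* synthetic data: model plus noise */
            y[i] = f(x[i], atrue) + sig[i] * gasdev(&idum);
        chisq[k] = fit_model(x, y, sig, NDATA, afit);
        /* afit[] can also be compared with atrue[] to gauge how well
           the parameters are reproduced by the fit */
    }
}
```

The histogram of chisq[] then estimates the actual distribution of the χ² statistic for your (possibly nonlinear, possibly nonnormal) problem, at the cost of NSETS complete fits.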