15.1 Least Squares as a Maximum Likelihood Estimator

… on mathematical statistics.) This infatuation tended to focus interest away from the fact that, for real data, the normal distribution is often rather poorly realized, if it is realized at all. We are often taught, rather casually, that, on average, measurements will fall within ±σ of the true value 68 percent of the time, within ±2σ 95 percent of the time, and within ±3σ 99.7 percent of the time. Extending this, one would expect a measurement to be off by ±20σ only one time out of 2 × 10^88. We all know that "glitches" are much more likely than that!

In some instances, the deviations from a normal distribution are easy to understand and quantify. For example, in measurements obtained by counting events, the measurement errors are usually distributed as a Poisson distribution, whose cumulative probability function was already discussed in §6.2. When the number of counts going into one data point is large, the Poisson distribution converges towards a Gaussian. However, the convergence is not uniform when measured in fractional accuracy. The more standard deviations out on the tail of the distribution, the larger the number of counts must be before a value close to the Gaussian is realized. The sign of the effect is always the same: The Gaussian predicts that "tail" events are much less likely than they actually (by Poisson) are. This causes such events, when they occur, to skew a least-squares fit much more than they ought.
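To put numbers on this effect, one can compare the exact Poisson upper-tail probability against the Gaussian prediction at the same number of standard deviations out on the tail. The short C program below is our own illustrative sketch, not a routine from this book; it assumes a mean of mu = 100 counts (so σ = 10) and uses only the standard C99 math functions lgamma and erfc.

/* Compare Poisson and Gaussian upper-tail probabilities for counting data.
 * Illustrative sketch only. Compile with: cc poistail.c -lm */
#include <stdio.h>
#include <math.h>

/* P(X >= n) for X ~ Poisson(mu), with terms computed in log space */
static double poisson_tail(double mu, int n)
{
    double sum = 0.0;
    for (int j = n; j < n + 1000; j++)       /* 1000 terms is ample here */
        sum += exp(j * log(mu) - mu - lgamma(j + 1.0));
    return sum;
}

/* Gaussian prediction P(Z >= k) for a standard normal Z */
static double gauss_tail(double k)
{
    return 0.5 * erfc(k / sqrt(2.0));
}

int main(void)
{
    double mu = 100.0;                       /* mean counts; sigma = 10 */
    printf("k-sigma    Poisson tail   Gaussian tail    ratio\n");
    for (int k = 1; k <= 5; k++) {
        int n = (int)(mu + k * sqrt(mu));    /* threshold mu + k*sigma */
        double p = poisson_tail(mu, n);
        double g = gauss_tail((double)k);
        printf("%7d    %12.3e   %13.3e   %6.2f\n", k, p, g, p / g);
    }
    return 0;
}

The ratio in the last column is always greater than one and grows steadily with k: the Gaussian underestimates the Poisson tail more and more severely the farther out one looks.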
Other times, the deviations from a normal distribution are not so easy to understand in detail. Experimental points are occasionally just way off. Perhaps the power flickered during a point's measurement, or someone kicked the apparatus, or someone wrote down a wrong number. Points like this are called outliers. They can easily turn a least-squares fit on otherwise adequate data into nonsense. Their probability of occurrence in the assumed Gaussian model is so small that the maximum likelihood estimator is willing to distort the whole curve to try to bring them, mistakenly, into line.
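A toy calculation makes the danger concrete. The sketch below (again our own invented data, not a routine from this book) fits a straight line y = a + b*x by unweighted least squares, then corrupts a single point and refits.

/* How one outlier drags an unweighted least-squares line fit.
 * Illustrative sketch with made-up data. */
#include <stdio.h>

/* Ordinary least-squares fit of y = a + b*x; assumes n >= 2 */
static void fit_line(const double *x, const double *y, int n,
                     double *a, double *b)
{
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (int i = 0; i < n; i++) {
        sx  += x[i];         sy  += y[i];
        sxx += x[i] * x[i];  sxy += x[i] * y[i];
    }
    double d = n * sxx - sx * sx;
    *b = (n * sxy - sx * sy) / d;
    *a = (sy - (*b) * sx) / n;
}

int main(void)
{
    /* six points lying almost exactly on y = 1 + 2x */
    double x[6] = { 0.0, 1.0, 2.0, 3.0, 4.0, 5.0 };
    double y[6] = { 1.0, 3.1, 4.9, 7.0, 9.1, 11.0 };
    double a, b;

    fit_line(x, y, 6, &a, &b);
    printf("clean data:   a = %6.3f   b = %6.3f\n", a, b);

    y[5] = 30.0;            /* someone wrote down a wrong number */
    fit_line(x, y, 6, &a, &b);
    printf("with outlier: a = %6.3f   b = %6.3f\n", a, b);
    return 0;
}

The clean fit recovers a ≈ 1.01 and b ≈ 2.00; with the single corrupted point the slope more than doubles and the intercept changes sign, exactly the distortion described above.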
The subject of robust statistics deals with cases where the normal or Gaussian model is a bad approximation, or cases where outliers are important. We will discuss robust methods briefly in §15.7. All the sections between this one and that one assume, one way or the other, a Gaussian model for the measurement errors in the data. It is quite important that you keep the limitations of that model in mind, even as you use the very useful methods that follow from assuming it.

Finally, note that our discussion of measurement errors has been limited to statistical errors, the kind that will average away if we only take enough data. Measurements are also susceptible to systematic errors that will not go away with any amount of averaging. For example, the calibration of a metal meter stick might depend on its temperature. If we take all our measurements at the same wrong temperature, then no amount of averaging or numerical processing will correct for this unrecognized systematic error.

Chi-Square Fitting

We considered the chi-square statistic once before, in §14.3. Here it arises in a slightly different context.

If each data point (xi, yi) has its own, known standard deviation σi, then equation (15.1.3) is modified only by putting a subscript i on the symbol σ. That subscript also propagates docilely into (15.1.4), so that the maximum likelihood estimate of the model parameters is obtained by minimizing the resulting chi-square statistic.
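Although equations (15.1.3) and (15.1.4) are not reproduced on this page, the quantity whose minimization they lead to is the familiar chi-square merit function, χ² = Σi [(yi − y(xi; a1...aM)) / σi]². As a minimal sketch of its evaluation, with a hypothetical straight-line model and invented data standing in for y(x; a1...aM), consider:

/* Chi-square merit function with per-point standard deviations sigma_i:
 *     chi2 = sum over i of [ (y_i - y(x_i; a)) / sigma_i ]^2
 * The model and data below are illustrative stand-ins. */
#include <stdio.h>

/* hypothetical model: straight line y(x) = a[0] + a[1]*x */
static double model(double x, const double *a)
{
    return a[0] + a[1] * x;
}

static double chisq(const double *x, const double *y, const double *sig,
                    int n, const double *a)
{
    double chi2 = 0.0;
    for (int i = 0; i < n; i++) {
        double r = (y[i] - model(x[i], a)) / sig[i];  /* weighted residual */
        chi2 += r * r;
    }
    return chi2;
}

int main(void)
{
    double x[4]   = { 0.0, 1.0, 2.0, 3.0 };
    double y[4]   = { 1.1, 2.9, 5.2, 6.8 };
    double sig[4] = { 0.1, 0.1, 0.3, 0.3 };   /* each point's known sigma_i */
    double a[2]   = { 1.0, 2.0 };             /* trial parameter values */

    printf("chi2 = %g\n", chisq(x, y, sig, 4, a));
    return 0;
}

Points with small σi are trusted more: a unit residual at σi = 0.1 contributes one hundred times as much chi-square as the same residual at σi = 1, which is precisely how the per-point standard deviations enter the maximum likelihood estimate.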