666 Chapter 15. Modeling of Data

Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5). Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copying of machine-readable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America).

    *chi2=0.0;                                            Calculate χ².
    *q=1.0;
    if (mwt == 0) {                                       For unweighted data evaluate typ-
        for (i=1;i<=ndata;i++)                                ical sig using chi2, and adjust
            *chi2 += SQR(y[i]-(*a)-(*b)*x[i]);                the standard deviations.
        sigdat=sqrt((*chi2)/(ndata-2));
        *siga *= sigdat;
        *sigb *= sigdat;
    } else {
        for (i=1;i<=ndata;i++)
            *chi2 += SQR((y[i]-(*a)-(*b)*x[i])/sig[i]);
        if (ndata>2) *q=gammq(0.5*(ndata-2),0.5*(*chi2));     Equation (15.2.12).
    }
}

CITED REFERENCES AND FURTHER READING:
Bevington, P.R. 1969, Data Reduction and Error Analysis for the Physical Sciences (New York: McGraw-Hill), Chapter 6.

15.3 Straight-Line Data with Errors in Both Coordinates

If experimental data are subject to measurement error not only in the yi's, but also in the xi's, then the task of fitting a straight-line model

    y(x) = a + bx    (15.3.1)

is considerably harder. It is straightforward to write down the χ² merit function for this case,

    \chi^2(a,b) = \sum_{i=1}^{N} \frac{(y_i - a - b x_i)^2}{\sigma_{y\,i}^2 + b^2 \sigma_{x\,i}^2}    (15.3.2)

where σxi and σyi are, respectively, the x and y standard deviations for the ith point. The weighted sum of variances in the denominator of equation (15.3.2) can be understood both as the variance in the direction of the smallest χ² between each data point and the line with slope b, and also as the variance of the linear combination yi − a − bxi of two random variables xi and yi,

    \mathrm{Var}(y_i - a - b x_i) = \mathrm{Var}(y_i) + b^2\,\mathrm{Var}(x_i) = \sigma_{y\,i}^2 + b^2 \sigma_{x\,i}^2 \equiv 1/w_i    (15.3.3)

The sum of the squares of N random variables, each normalized by its variance, is thus χ²-distributed.

We want to minimize equation (15.3.2) with respect to a and b.
Unfortunately, the occurrence of b in the denominator of equation (15.3.2) makes the resulting equation for the slope ∂χ²/∂b = 0 nonlinear. However, the corresponding condition for the intercept, ∂χ²/∂a = 0, is still linear and yields

    a = \frac{\sum_i w_i (y_i - b x_i)}{\sum_i w_i}    (15.3.4)

where the wi's are defined by equation (15.3.3). A reasonable strategy, now, is to use the machinery of Chapter 10 (e.g., the routine brent) for minimizing a general one-dimensional function to minimize equation (15.3.2) with respect to b, while using equation (15.3.4) at each stage to ensure that the minimum with respect to b is also minimized with respect to a.
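The reduction from two parameters to one can be sketched directly. The following self-contained C example is a toy illustration, not the book's routines: the data values are hypothetical, and a crude golden-section search stands in for brent. It computes a(b) from equation (15.3.4), evaluates equation (15.3.2), and minimizes over the single variable b:

```c
#include <assert.h>
#include <math.h>

/* Toy data (hypothetical): five points near y = 1 + 2x, with errors in both coordinates. */
#define NPTS 5
static const double xs[NPTS]  = {0.0, 1.0, 2.0, 3.0, 4.0};
static const double ys[NPTS]  = {1.1, 2.9, 5.2, 6.9, 9.1};
static const double sgx[NPTS] = {0.1, 0.1, 0.1, 0.1, 0.1};
static const double sgy[NPTS] = {0.2, 0.2, 0.2, 0.2, 0.2};

/* Intercept that minimizes chi^2 at fixed slope b -- equation (15.3.4). */
double intercept_for_slope(double b) {
    double num = 0.0, den = 0.0;
    for (int i = 0; i < NPTS; i++) {
        double w = 1.0/(sgy[i]*sgy[i] + b*b*sgx[i]*sgx[i]);  /* w_i of eq. (15.3.3) */
        num += w*(ys[i] - b*xs[i]);
        den += w;
    }
    return num/den;
}

/* chi^2 of equation (15.3.2), with the intercept a already minimized out. */
double chisq_of_slope(double b) {
    double a = intercept_for_slope(b), sum = 0.0;
    for (int i = 0; i < NPTS; i++) {
        double r = ys[i] - a - b*xs[i];
        sum += r*r/(sgy[i]*sgy[i] + b*b*sgx[i]*sgx[i]);
    }
    return sum;
}

/* Crude golden-section search in b; a stand-in for the book's brent. */
double best_slope(double lo, double hi) {
    const double g = 0.6180339887498949;
    for (int it = 0; it < 200; it++) {
        double m1 = hi - g*(hi - lo), m2 = lo + g*(hi - lo);
        if (chisq_of_slope(m1) < chisq_of_slope(m2)) hi = m2; else lo = m1;
    }
    return 0.5*(lo + hi);
}
```

For these numbers the fit lands close to slope 2 and intercept 1, shifted very slightly from the ordinary least-squares answer because the denominator of (15.3.2) also depends on b.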
15.3 Straight-Line Data with Errors in Both Coordinates 667

Figure 15.3.1. Standard errors for the parameters a and b. The point B can be found by varying the slope b while simultaneously minimizing the intercept a. This gives the standard error σb, and also the value s. The standard error σa can then be found by the geometric relation σa² = s² + r².

Because of the finite error bars on the xi's, the minimum χ² as a function of b will be finite, though usually large, when b equals infinity (line of infinite slope). The angle θ ≡ arctan b is thus more suitable as a parametrization of slope than b itself. The value of χ² will then be periodic in θ with period π (not 2π!). If any data points have very small σy's but moderate or large σx's, then it is also possible to have a maximum in χ² near zero slope, θ ≈ 0. In that case, there can sometimes be two χ² minima, one at positive slope and the other at negative. Only one of these is the correct global minimum. It is therefore important to have a good starting guess for b (or θ). Our strategy, implemented below, is to scale the yi's so as to have variance equal to the xi's, then to do a conventional (as in §15.2) linear fit with weights derived from the (scaled) sum σ²yi + σ²xi.
This yields a good starting guess for b if the data are even plausibly related to a straight-line model.

Finding the standard errors σa and σb on the parameters a and b is more complicated. We will see in §15.6 that, in appropriate circumstances, the standard errors in a and b are the respective projections onto the a and b axes of the "confidence region boundary" where χ² takes on a value one greater than its minimum, ∆χ² = 1. In the linear case of §15.2, these projections follow from the Taylor series expansion

    \Delta\chi^2 \approx \frac{1}{2}\frac{\partial^2\chi^2}{\partial a^2}(\Delta a)^2 + \frac{1}{2}\frac{\partial^2\chi^2}{\partial b^2}(\Delta b)^2 + \frac{\partial^2\chi^2}{\partial a\,\partial b}\,\Delta a\,\Delta b    (15.3.5)

Because of the present nonlinearity in b, however, analytic formulas for the second derivatives are quite unwieldy; more important, the lowest-order term frequently gives a poor approximation to ∆χ². Our strategy is therefore to find the roots of ∆χ² = 1 numerically, by adjusting the value of the slope b away from the minimum. In the program below the general root finder zbrent is used. It may occur that there are no roots at all — for example, if all error bars are so large that all the data points are compatible with each other. It is important, therefore, to make some effort at bracketing a putative root before refining it (cf. §9.1).

Because a is minimized at each stage of varying b, successful numerical root-finding leads to a value of ∆a that minimizes χ² for the value of ∆b that gives ∆χ² = 1. This (see Figure 15.3.1) directly gives the tangent projection of the confidence region onto the b axis, and thus σb. It does not, however, give the tangent projection of the confidence region onto the a axis. In the figure, we have found the point labeled B; to find σa we need to find the
point A. Geometry to the rescue: To the extent that the confidence region is approximated by an ellipse, then you can prove (see figure) that σa² = r² + s². The value of s is known from having found the point B. The value of r follows from equations (15.3.2) and (15.3.3) applied at the χ² minimum (point O in the figure), giving

    r^2 = \frac{1}{\sum_i w_i}    (15.3.6)

Actually, since b can go through infinity, this whole procedure makes more sense in (a, θ) space than in (a, b) space. That is in fact how the following program works. Since it is conventional, however, to return standard errors for a and b, not a and θ, we finally use the relation

    \sigma_b = \sigma_\theta / \cos^2\theta    (15.3.7)

We caution that if b and its standard error are both large, so that the confidence region actually includes infinite slope, then the standard error σb is not very meaningful. The function chixy is normally called only by the routine fitexy. However, if you want, you can yourself explore the confidence region by making repeated calls to chixy (whose argument is an angle θ, not a slope b), after a single initializing call to fitexy.

A final caution, repeated from §15.0, is that if the goodness-of-fit is not acceptable (returned probability is too small), the standard errors σa and σb are surely not believable.
In dire circumstances, you might try scaling all your x and y error bars by a constant factor until the probability is acceptable (0.5, say), to get more plausible values for σa and σb.

#include <math.h>
#include "nrutil.h"
#define POTN 1.571000
#define BIG 1.0e30
#define PI 3.14159265
#define ACC 1.0e-3

int nn;                                   Global variables communicate with
float *xx,*yy,*sx,*sy,*ww,aa,offs;            chixy.

void fitexy(float x[], float y[], int ndat, float sigx[], float sigy[],
    float *a, float *b, float *siga, float *sigb, float *chi2, float *q)
Straight-line fit to input data x[1..ndat] and y[1..ndat] with errors in both x and y, the re-
spective standard deviations being the input quantities sigx[1..ndat] and sigy[1..ndat].
Output quantities are a and b such that y = a + bx minimizes χ², whose value is returned
as chi2. The χ² probability is returned as q, a small value indicating a poor fit (sometimes
indicating underestimated errors). Standard errors on a and b are returned as siga and sigb.
These are not meaningful if either (i) the fit is poor, or (ii) b is so large that the data are
consistent with a vertical (infinite b) line. If siga and sigb are returned as BIG, then the data
are consistent with all values of b.
{
    void avevar(float data[], unsigned long n, float *ave, float *var);
    float brent(float ax, float bx, float cx,
        float (*f)(float), float tol, float *xmin);
    float chixy(float bang);
    void fit(float x[], float y[], int ndata, float sig[], int mwt,
        float *a, float *b, float *siga, float *sigb, float *chi2, float *q);
    float gammq(float a, float x);
    void mnbrak(float *ax, float *bx, float *cx, float *fa, float *fb,
        float *fc, float (*func)(float));
    float zbrent(float (*func)(float), float x1, float x2, float tol);
    int j;
    float swap,amx,amn,varx,vary,ang[7],ch[7],scale,bmn,bmx,d1,d2,r2,
        dum1,dum2,dum3,dum4,dum5;

    xx=vector(1,ndat);
    yy=vector(1,ndat);
    sx=vector(1,ndat);
    sy=vector(1,ndat);
    ww=vector(1,ndat);
    avevar(x,ndat,&dum1,&varx);           Find the x and y variances, and scale
    avevar(y,ndat,&dum1,&vary);               the data into the global variables
    scale=sqrt(varx/vary);                    for communication with the func-
    nn=ndat;                                  tion chixy.
    for (j=1;j<=ndat;j++) {
        xx[j]=x[j];
        yy[j]=y[j]*scale;
        sx[j]=sigx[j];
        sy[j]=sigy[j]*scale;
        ww[j]=sqrt(SQR(sx[j])+SQR(sy[j]));    Use both x and y weights in first
    }                                             trial fit.
    fit(xx,yy,nn,ww,1,&dum1,b,&dum2,&dum3,&dum4,&dum5);    Trial fit for b.
    offs=ang[1]=0.0;                      Construct several angles for reference
    ang[2]=atan(*b);                          points, and make b an angle.
    ang[4]=0.0;
    ang[5]=ang[2];
    ang[6]=POTN;
    for (j=4;j<=6;j++) ch[j]=chixy(ang[j]);
    mnbrak(&ang[1],&ang[2],&ang[3],&ch[1],&ch[2],&ch[3],chixy);
    *chi2=brent(ang[1],ang[2],ang[3],chixy,ACC,b);    Bracket the χ²-minimizing angle
    *chi2=chixy(*b);                                      and refine it with brent.
    *a=aa;
    *q=gammq(0.5*(nn-2),*chi2*0.5);       Compute χ² probability.
    for (r2=0.0,j=1;j<=nn;j++) r2 += ww[j];    Save the inverse sum of weights at
    r2=1.0/r2;                                     the minimum.
    bmx=bmn=BIG;                          Now, find standard errors for b as
    offs=(*chi2)+1.0;                         points where ∆χ² = 1.
    for (j=1;j<=6;j++) {                  Go through saved values to bracket
        if (ch[j] > offs) {                   the desired roots. Note period-
            d1=fabs(ang[j]-(*b));             icity in slope angles.
            while (d1 >= PI) d1 -= PI;
            d2=PI-d1;
            if (ang[j] < *b) {
                swap=d1;
                d1=d2;
                d2=swap;
            }
            if (d1 < bmx) bmx=d1;
            if (d2 < bmn) bmn=d2;
        }
    }
    if (bmx < BIG) {                      Call zbrent to find the roots.
        bmx=zbrent(chixy,*b,*b+bmx,ACC)-(*b);
        amx=aa-(*a);
        bmn=zbrent(chixy,*b,*b-bmn,ACC)-(*b);
        amn=aa-(*a);
        *sigb=sqrt(0.5*(bmx*bmx+bmn*bmn))/(scale*SQR(cos(*b)));
        *siga=sqrt(0.5*(amx*amx+amn*amn)+r2)/scale;    Error in a has additional
    } else (*sigb)=(*siga)=BIG;                            piece r2.
    *a /= scale;                          Unscale the answers.
    *b=tan(*b)/scale;
    free_vector(ww,1,ndat);
    free_vector(sy,1,ndat);
    free_vector(sx,1,ndat);
    free_vector(yy,1,ndat);
    free_vector(xx,1,ndat);
}
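The ∆χ² = 1 search that fitexy performs can be sketched in miniature. The following self-contained example is a hypothetical toy, not the book's code: the data are invented, plain bisection stands in for zbrent, and it works directly in the slope b, ignoring the θ parametrization. It brackets and then refines the point where χ², with the intercept minimized out, rises one unit above its minimum:

```c
#include <assert.h>
#include <math.h>

/* Toy data (hypothetical): five points near y = 1 + 2x, errors in both coordinates. */
#define NPTS 5
static const double xs[NPTS]  = {0.0, 1.0, 2.0, 3.0, 4.0};
static const double ys[NPTS]  = {1.1, 2.9, 5.2, 6.9, 9.1};
static const double sgx[NPTS] = {0.1, 0.1, 0.1, 0.1, 0.1};
static const double sgy[NPTS] = {0.2, 0.2, 0.2, 0.2, 0.2};

/* chi^2 of eq. (15.3.2) with the intercept a minimized out via eq. (15.3.4). */
double chisq_of_slope(double b) {
    double num = 0.0, den = 0.0, sum = 0.0;
    for (int i = 0; i < NPTS; i++) {
        double w = 1.0/(sgy[i]*sgy[i] + b*b*sgx[i]*sgx[i]);
        num += w*(ys[i] - b*xs[i]);
        den += w;
    }
    double a = num/den;
    for (int i = 0; i < NPTS; i++) {
        double r = ys[i] - a - b*xs[i];
        sum += r*r/(sgy[i]*sgy[i] + b*b*sgx[i]*sgx[i]);
    }
    return sum;
}

/* Walk outward from the minimum until chi^2 exceeds target, then bisect.
   target should be chi^2_min + 1, per the Delta-chi^2 = 1 criterion. */
double slope_where_chisq_hits(double bmin, double target) {
    double lo = bmin, hi = bmin + 0.01;
    while (chisq_of_slope(hi) < target) {   /* bracket the root first (cf. 9.1) */
        lo = hi;
        hi = bmin + 2.0*(hi - bmin);
    }
    for (int it = 0; it < 100; it++) {      /* bisection stands in for zbrent */
        double mid = 0.5*(lo + hi);
        if (chisq_of_slope(mid) < target) lo = mid; else hi = mid;
    }
    return 0.5*(lo + hi);
}
```

The upper and lower crossings of χ²min + 1, together with the corresponding shifts in the intercept, are exactly the quantities fitexy combines into sigb and siga.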
#include <math.h>
#include "nrutil.h"
#define BIG 1.0e30

extern int nn;
extern float *xx,*yy,*sx,*sy,*ww,aa,offs;

float chixy(float bang)
Captive function of fitexy, returns the value of (χ² − offs) for the slope b=tan(bang).
Scaled data and offs are communicated via the global variables.
{
    int j;
    float ans,avex=0.0,avey=0.0,sumw=0.0,b;

    b=tan(bang);
    for (j=1;j<=nn;j++) {
        ww[j] = SQR(b*sx[j])+SQR(sy[j]);
        sumw += (ww[j] = (ww[j] < 1.0/BIG ? BIG : 1.0/ww[j]));
        avex += ww[j]*xx[j];
        avey += ww[j]*yy[j];
    }
    avex /= sumw;
    avey /= sumw;
    aa=avey-b*avex;
    for (ans = -offs,j=1;j<=nn;j++)
        ans += ww[j]*SQR(yy[j]-aa-b*xx[j]);
    return ans;
}

Be aware that the literature on the seemingly straightforward subject of this section is generally confusing and sometimes plain wrong. Deming's [1] early treatment is sound, but its reliance on Taylor expansions gives inaccurate error estimates. References [2-4] are reliable, more recent, general treatments with critiques of earlier work. York [5] and Reed [6] usefully discuss the simple case of a straight line as treated here, but the latter paper has some errors, corrected in [7]. All this commotion has attracted the Bayesians [8-10], who have still different points of view.

CITED REFERENCES AND FURTHER READING:
Deming, W.E.
1943, Statistical Adjustment of Data (New York: Wiley), reprinted 1964 (New York: Dover). [1]
Jefferys, W.H. 1980, Astronomical Journal, vol. 85, pp. 177–181; see also vol. 95, p. 1299 (1988). [2]
Jefferys, W.H. 1981, Astronomical Journal, vol. 86, pp. 149–155; see also vol. 95, p. 1300 (1988). [3]
Lybanon, M. 1984, American Journal of Physics, vol. 52, pp. 22–26. [4]
York, D. 1966, Canadian Journal of Physics, vol. 44, pp. 1079–1086. [5]
Reed, B.C. 1989, American Journal of Physics, vol. 57, pp. 642–646; see also vol. 58, p. 189, and vol. 58, p. 1209. [6]
Reed, B.C. 1992, American Journal of Physics, vol. 60, pp. 59–62. [7]
Zellner, A. 1971, An Introduction to Bayesian Inference in Econometrics (New York: Wiley); reprinted 1987 (Malabar, FL: R.E. Krieger Pub. Co.). [8]
Gull, S.F. 1989, in Maximum Entropy and Bayesian Methods, J. Skilling, ed. (Boston: Kluwer). [9]
Jaynes, E.T. 1991, in Maximum-Entropy and Bayesian Methods, Proc. 10th Int. Workshop, W.T. Grandy, Jr., and L.H. Schick, eds. (Boston: Kluwer). [10]
Macdonald, J.R., and Thompson, W.J. 1992, American Journal of Physics, vol. 60, pp. 66–73.
15.4 General Linear Least Squares

An immediate generalization of §15.2 is to fit a set of data points (xi, yi) to a model that is not just a linear combination of 1 and x (namely a + bx), but rather a linear combination of any M specified functions of x. For example, the functions could be 1, x, x², ..., x^{M−1}, in which case their general linear combination,

    y(x) = a_1 + a_2 x + a_3 x^2 + \cdots + a_M x^{M-1}    (15.4.1)

is a polynomial of degree M − 1. Or, the functions could be sines and cosines, in which case their general linear combination is a harmonic series.

The general form of this kind of model is

    y(x) = \sum_{k=1}^{M} a_k X_k(x)    (15.4.2)

where X_1(x), ..., X_M(x) are arbitrary fixed functions of x, called the basis functions. Note that the functions X_k(x) can be wildly nonlinear functions of x. In this discussion "linear" refers only to the model's dependence on its parameters a_k.

For these linear models we generalize the discussion of the previous section by defining a merit function

    \chi^2 = \sum_{i=1}^{N} \left[ \frac{y_i - \sum_{k=1}^{M} a_k X_k(x_i)}{\sigma_i} \right]^2    (15.4.3)

As before, σi is the measurement error (standard deviation) of the ith data point, presumed to be known. If the measurement errors are not known, they may all (as discussed at the end of §15.1) be set to the constant value σ = 1.

Once again, we will pick as best parameters those that minimize χ².
There are several different techniques available for finding this minimum. Two are particularly useful, and we will discuss both in this section. To introduce them and elucidate their relationship, we need some notation.

Let A be a matrix whose N × M components are constructed from the M basis functions evaluated at the N abscissas xi, and from the N measurement errors σi, by the prescription

    A_{ij} = \frac{X_j(x_i)}{\sigma_i}    (15.4.4)

The matrix A is called the design matrix of the fitting problem. Notice that in general A has more rows than columns, N ≥ M, since there must be more data points than model parameters to be solved for. (You can fit a straight line to two points, but not a very meaningful quintic!) The design matrix is shown schematically in Figure 15.4.1. Also define a vector b of length N by

    b_i = \frac{y_i}{\sigma_i}    (15.4.5)

and denote the M vector whose components are the parameters to be fitted, a_1, ..., a_M, by a.
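As a concrete, if naive, illustration of this notation, the sketch below builds A_ij = X_j(x_i)/σ_i and b_i = y_i/σ_i for a quadratic polynomial basis and solves the normal equations (AᵀA)a = Aᵀb by unpivoted Gaussian elimination. The data, the basis choice, and the toy solver are illustrative assumptions, not the routines developed later in the chapter:

```c
#include <assert.h>
#include <math.h>

#define N 5   /* number of data points */
#define M 3   /* basis functions: 1, x, x^2 */

/* Basis function X_k(x) = x^k for a quadratic model, cf. eq. (15.4.1). */
static double basis(int k, double x) {
    double p = 1.0;
    for (int j = 0; j < k; j++) p *= x;
    return p;
}

/* Accumulate A^T A and A^T b from the design matrix of eq. (15.4.4) and the
   vector of eq. (15.4.5), then solve by naive Gaussian elimination. */
void lsq_fit(const double x[], const double y[], const double sig[], double a[M]) {
    double ata[M][M] = {{0}}, atb[M] = {0};
    for (int i = 0; i < N; i++)
        for (int j = 0; j < M; j++) {
            double Aij = basis(j, x[i]) / sig[i];   /* A_ij = X_j(x_i)/sigma_i */
            atb[j] += Aij * (y[i] / sig[i]);        /* b_i  = y_i/sigma_i     */
            for (int k = 0; k < M; k++)
                ata[j][k] += Aij * basis(k, x[i]) / sig[i];
        }
    for (int c = 0; c < M; c++)                     /* forward elimination,   */
        for (int r = c + 1; r < M; r++) {           /* no pivoting            */
            double f = ata[r][c] / ata[c][c];
            for (int k = c; k < M; k++) ata[r][k] -= f * ata[c][k];
            atb[r] -= f * atb[c];
        }
    for (int r = M - 1; r >= 0; r--) {              /* back substitution      */
        a[r] = atb[r];
        for (int k = r + 1; k < M; k++) a[r] -= ata[r][k] * a[k];
        a[r] /= ata[r][r];
    }
}
```

Forming AᵀA explicitly squares the condition number of the problem, which is why the chapter goes on to develop more robust alternatives; the sketch is only meant to make the roles of A and b concrete.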