as a distribution can be. Almost always, the cause of too good a chi-square fit is that the experimenter, in a "fit" of conservatism, has overestimated his or her measurement errors. Very rarely, too good a chi-square signals actual fraud, data that has been "fudged" to fit the model.

A rule of thumb is that a "typical" value of χ² for a "moderately" good fit is χ² ≈ ν. More precise is the statement that the χ² statistic has a mean ν and a standard deviation √(2ν), and, asymptotically for large ν, becomes normally distributed.

In some cases the uncertainties associated with a set of measurements are not known in advance, and considerations related to χ² fitting are used to derive a value for σ. If we assume that all measurements have the same standard deviation, σi = σ, and that the model does fit well, then we can proceed by first assigning an arbitrary constant σ to all points, next fitting for the model parameters by minimizing χ², and finally recomputing

    \sigma^2 = \sum_{i=1}^{N} [y_i - y(x_i)]^2 / (N - M)        (15.1.6)

Obviously, this approach prohibits an independent assessment of goodness-of-fit, a fact occasionally missed by its adherents. When, however, the measurement error is not known, this approach at least allows some kind of error bar to be assigned to the points.
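For concreteness, here is a minimal sketch of that final recomputation step. It is our own illustration, not a Numerical Recipes routine, and it uses zero-based arrays rather than the book's unit-offset convention; it assumes the model values y(x_i) from the provisional fit are already in hand:

#include <math.h>

/* Post-assign a common error bar to unweighted data from the residuals
   of an already-fitted model, per equation (15.1.6). ymod[i] holds the
   fitted model value y(x_i); m is the number of fitted parameters.
   Returns sigma itself, not sigma^2. */
float sigest(int n, float y[], float ymod[], int m)
{
    int i;
    float sum = 0.0;

    for (i = 0; i < n; i++)              /* accumulate squared residuals */
        sum += (y[i] - ymod[i]) * (y[i] - ymod[i]);
    return sqrt(sum / (n - m));
}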
If we take the derivative of equation (15.1.5) with respect to the parameters ak, we obtain equations that must hold at the chi-square minimum,

    0 = \sum_{i=1}^{N} \left( \frac{y_i - y(x_i)}{\sigma_i^2} \right) \frac{\partial y(x_i; \ldots a_k \ldots)}{\partial a_k} \qquad k = 1, \ldots, M        (15.1.7)

Equation (15.1.7) is, in general, a set of M nonlinear equations for the M unknown ak. Various of the procedures described subsequently in this chapter derive from (15.1.7) and its specializations.

CITED REFERENCES AND FURTHER READING:
Bevington, P.R. 1969, Data Reduction and Error Analysis for the Physical Sciences (New York: McGraw-Hill), Chapters 1-4.
von Mises, R. 1964, Mathematical Theory of Probability and Statistics (New York: Academic Press), §VI.C. [1]

15.2 Fitting Data to a Straight Line

A concrete example will make the considerations of the previous section more meaningful. We consider the problem of fitting a set of N data points (xi, yi) to a straight-line model

    y(x) = y(x; a, b) = a + bx        (15.2.1)

This problem is often called linear regression, a terminology that originated, long ago, in the social sciences. We assume that the uncertainty σi associated with each measurement yi is known, and that the xi's (values of the independent variable) are known exactly.

To measure how well the model agrees with the data, we use the chi-square merit function (15.1.5), which in this case is

    \chi^2(a, b) = \sum_{i=1}^{N} \left( \frac{y_i - a - b x_i}{\sigma_i} \right)^2        (15.2.2)

If the measurement errors are normally distributed, then this merit function will give maximum likelihood parameter estimations of a and b; if the errors are not normally distributed, then the estimations are not maximum likelihood, but may still be useful in a practical sense. In §15.7, we will treat the case where outlier points are so numerous as to render the χ² merit function useless.

Equation (15.2.2) is minimized to determine a and b. At its minimum, derivatives of χ²(a, b) with respect to a, b vanish,

    0 = \frac{\partial \chi^2}{\partial a} = -2 \sum_{i=1}^{N} \frac{y_i - a - b x_i}{\sigma_i^2}
    0 = \frac{\partial \chi^2}{\partial b} = -2 \sum_{i=1}^{N} \frac{x_i (y_i - a - b x_i)}{\sigma_i^2}        (15.2.3)

These conditions can be rewritten in a convenient form if we define the following sums:

    S \equiv \sum_{i=1}^{N} \frac{1}{\sigma_i^2} \qquad S_x \equiv \sum_{i=1}^{N} \frac{x_i}{\sigma_i^2} \qquad S_y \equiv \sum_{i=1}^{N} \frac{y_i}{\sigma_i^2}
    S_{xx} \equiv \sum_{i=1}^{N} \frac{x_i^2}{\sigma_i^2} \qquad S_{xy} \equiv \sum_{i=1}^{N} \frac{x_i y_i}{\sigma_i^2}        (15.2.4)

With these definitions (15.2.3) becomes

    a S + b S_x = S_y
    a S_x + b S_{xx} = S_{xy}        (15.2.5)

The solution of these two equations in two unknowns is calculated as

    \Delta \equiv S S_{xx} - (S_x)^2
    a = \frac{S_{xx} S_y - S_x S_{xy}}{\Delta}
    b = \frac{S S_{xy} - S_x S_y}{\Delta}        (15.2.6)

Equation (15.2.6) gives the solution for the best-fit model parameters a and b.
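The algebra of equations (15.2.4)-(15.2.6) translates directly into code. The fragment below is a minimal sketch of that transcription; the function name and the zero-based arrays are our own choices for illustration. The routine fit, given later in this section, implements the same algebra in a rearranged form that is more robust against roundoff error.

#include <math.h>

/* Direct transcription of equations (15.2.4)-(15.2.6): weighted
   least-squares fit of a straight line y = a + bx. Arrays are
   zero-based, unlike the book's unit-offset convention. */
void linfit_naive(int n, float x[], float y[], float sig[],
                  float *a, float *b)
{
    int i;
    float S=0.0, Sx=0.0, Sy=0.0, Sxx=0.0, Sxy=0.0, w, D;

    for (i = 0; i < n; i++) {
        w = 1.0/(sig[i]*sig[i]);       /* weight 1/sigma_i^2 */
        S   += w;
        Sx  += w*x[i];
        Sy  += w*y[i];
        Sxx += w*x[i]*x[i];
        Sxy += w*x[i]*y[i];
    }
    D  = S*Sxx - Sx*Sx;                /* Delta of equation (15.2.6) */
    *a = (Sxx*Sy - Sx*Sxy)/D;
    *b = (S*Sxy - Sx*Sy)/D;
}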
We are not done, however. We must estimate the probable uncertainties in the estimates of a and b, since obviously the measurement errors in the data must introduce some uncertainty in the determination of those parameters. If the data are independent, then each contributes its own bit of uncertainty to the parameters. Consideration of propagation of errors shows that the variance σf² in the value of any function will be

    \sigma_f^2 = \sum_{i=1}^{N} \sigma_i^2 \left( \frac{\partial f}{\partial y_i} \right)^2        (15.2.7)

For the straight line, the derivatives of a and b with respect to yi can be directly evaluated from the solution:

    \frac{\partial a}{\partial y_i} = \frac{S_{xx} - S_x x_i}{\sigma_i^2 \Delta}
    \frac{\partial b}{\partial y_i} = \frac{S x_i - S_x}{\sigma_i^2 \Delta}        (15.2.8)

Summing over the points as in (15.2.7), we get

    \sigma_a^2 = S_{xx}/\Delta
    \sigma_b^2 = S/\Delta        (15.2.9)

which are the variances in the estimates of a and b, respectively. We will see in §15.6 that an additional number is also needed to characterize properly the probable uncertainty of the parameter estimation. That number is the covariance of a and b, and (as we will see below) is given by

    \mathrm{Cov}(a, b) = -S_x/\Delta        (15.2.10)

The coefficient of correlation between the uncertainty in a and the uncertainty in b, which is a number between −1 and 1, follows from (15.2.10) (compare equation 14.5.1),

    r_{ab} = \frac{-S_x}{\sqrt{S S_{xx}}}        (15.2.11)

A positive value of rab indicates that the errors in a and b are likely to have the same sign, while a negative value indicates the errors are anticorrelated, likely to have opposite signs.

We are still not done. We must estimate the goodness-of-fit of the data to the model. Absent this estimate, we have not the slightest indication that the parameters a and b in the model have any meaning at all! The probability Q that a value of chi-square as poor as the value (15.2.2) should occur by chance is

    Q = \mathrm{gammq}\!\left( \frac{N-2}{2}, \frac{\chi^2}{2} \right)        (15.2.12)

Here gammq is our routine for the incomplete gamma function Q(a, x), §6.2. If Q is larger than, say, 0.1, then the goodness-of-fit is believable. If it is larger than, say, 0.001, then the fit may be acceptable if the errors are nonnormal or have been moderately underestimated. If Q is less than 0.001 then the model and/or estimation procedure can rightly be called into question. In this latter case, turn to §15.7 to proceed further.
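Continuing the illustrative sketch above under the same assumptions (zero-based arrays, the sums of equation 15.2.4 already accumulated, and a and b already fitted), the uncertainties and the goodness-of-fit of equations (15.2.9)-(15.2.12) could be evaluated along the following lines. The function name is again ours; gammq is the book's incomplete gamma routine from §6.2:

#include <math.h>

float gammq(float a, float x);   /* the book's incomplete gamma Q(a,x), section 6.2 */

/* Error estimates and goodness-of-fit per equations (15.2.9)-(15.2.12),
   given the sums of (15.2.4) and the fitted a, b. Zero-based arrays. */
void linfit_errors(int n, float x[], float y[], float sig[],
                   float a, float b, float S, float Sx, float Sxx,
                   float *siga, float *sigb, float *covab, float *rab,
                   float *chi2, float *q)
{
    int i;
    float D = S*Sxx - Sx*Sx, t;

    *siga  = sqrt(Sxx/D);                /* equation (15.2.9) */
    *sigb  = sqrt(S/D);
    *covab = -Sx/D;                      /* equation (15.2.10) */
    *rab   = -Sx/sqrt(S*Sxx);            /* equation (15.2.11) */

    *chi2 = 0.0;                         /* chi-square (15.2.2) at the minimum */
    for (i = 0; i < n; i++) {
        t = (y[i] - a - b*x[i])/sig[i];
        *chi2 += t*t;
    }
    *q = gammq(0.5*(n-2), 0.5*(*chi2));  /* equation (15.2.12) */
}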
If you do not know the individual measurement errors of the points σi, and are proceeding (dangerously) to use equation (15.1.6) for estimating these errors, then here is the procedure for estimating the probable uncertainties of the parameters a and b: Set σi ≡ 1 in all equations through (15.2.6), and multiply σa and σb, as obtained from equation (15.2.9), by the additional factor √(χ²/(N − 2)), where χ² is computed by (15.2.2) using the fitted parameters a and b. As discussed above, this procedure is equivalent to assuming a good fit, so you get no independent goodness-of-fit probability Q.

In §14.5 we promised a relation between the linear correlation coefficient r (equation 14.5.1) and a goodness-of-fit measure, χ² (equation 15.2.2). For unweighted data (all σi = 1), that relation is

    \chi^2 = (1 - r^2)\, N \mathrm{Var}(y_1 \ldots y_N)        (15.2.13)

where

    N \mathrm{Var}(y_1 \ldots y_N) \equiv \sum_{i=1}^{N} (y_i - \bar{y})^2        (15.2.14)

For data with varying weights σi, the above equations remain valid if the sums in equation (14.5.1) are weighted by 1/σi².

The following function, fit, carries out exactly the operations that we have discussed. When the weights σ are known in advance, the calculations exactly correspond to the formulas above. However, when weights σ are unavailable, the routine assumes equal values of σ for each point and assumes a good fit, as discussed in §15.1.

The formulas (15.2.6) are susceptible to roundoff error. Accordingly, we rewrite them as follows: Define

    t_i = \frac{1}{\sigma_i}\left( x_i - \frac{S_x}{S} \right), \qquad i = 1, 2, \ldots, N        (15.2.15)

and

    S_{tt} = \sum_{i=1}^{N} t_i^2        (15.2.16)

Then, as you can verify by direct substitution,

    b = \frac{1}{S_{tt}} \sum_{i=1}^{N} \frac{t_i y_i}{\sigma_i}        (15.2.17)

    a = \frac{S_y - S_x b}{S}        (15.2.18)

    \sigma_a^2 = \frac{1}{S}\left( 1 + \frac{S_x^2}{S S_{tt}} \right)        (15.2.19)

    \sigma_b^2 = \frac{1}{S_{tt}}        (15.2.20)

    \mathrm{Cov}(a, b) = -\frac{S_x}{S S_{tt}}        (15.2.21)

    r_{ab} = \frac{\mathrm{Cov}(a, b)}{\sigma_a \sigma_b}        (15.2.22)
#include <math.h>
#include "nrutil.h"

void fit(float x[], float y[], int ndata, float sig[], int mwt, float *a,
    float *b, float *siga, float *sigb, float *chi2, float *q)
/* Given a set of data points x[1..ndata],y[1..ndata] with individual standard
deviations sig[1..ndata], fit them to a straight line y = a + bx by minimizing
chi-square. Returned are a,b and their respective probable uncertainties siga
and sigb, the chi-square chi2, and the goodness-of-fit probability q (that the
fit would have chi-square this large or larger). If mwt=0 on input, then the
standard deviations are assumed to be unavailable: q is returned as 1.0 and the
normalization of chi2 is to unit standard deviation on all points. */
{
    float gammq(float a, float x);
    int i;
    float wt,t,sxoss,sx=0.0,sy=0.0,st2=0.0,ss,sigdat;

    *b=0.0;
    if (mwt) {                          /* Accumulate sums ... */
        ss=0.0;
        for (i=1;i<=ndata;i++) {        /* ...with weights */
            wt=1.0/SQR(sig[i]);
            ss += wt;
            sx += x[i]*wt;
            sy += y[i]*wt;
        }
    } else {
        for (i=1;i<=ndata;i++) {        /* ...or without weights. */
            sx += x[i];
            sy += y[i];
        }
        ss=ndata;
    }
    sxoss=sx/ss;
    if (mwt) {
        for (i=1;i<=ndata;i++) {
            t=(x[i]-sxoss)/sig[i];
            st2 += t*t;
            *b += t*y[i]/sig[i];
        }
    } else {
        for (i=1;i<=ndata;i++) {
            t=x[i]-sxoss;
            st2 += t*t;
            *b += t*y[i];
        }
    }
    *b /= st2;                          /* Solve for a, b, sigma_a, and sigma_b. */
    *a=(sy-sx*(*b))/ss;
    *siga=sqrt((1.0+sx*sx/(ss*st2))/ss);
    *sigb=sqrt(1.0/st2);
    *chi2=0.0;                          /* Calculate chi-square. */
    *q=1.0;
    if (mwt == 0) {
        for (i=1;i<=ndata;i++)
            *chi2 += SQR(y[i]-(*a)-(*b)*x[i]);
        sigdat=sqrt((*chi2)/(ndata-2)); /* For unweighted data evaluate typical
                                           sig using chi2, and adjust the
                                           standard deviations. */
        *siga *= sigdat;
        *sigb *= sigdat;
    } else {
        for (i=1;i<=ndata;i++)
            *chi2 += SQR((y[i]-(*a)-(*b)*x[i])/sig[i]);
        if (ndata>2) *q=gammq(0.5*(ndata-2),0.5*(*chi2));  /* Equation (15.2.12). */
    }
}
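As a usage illustration (our own, with made-up data chosen only to exercise the interface), fit might be driven as follows, using the unit-offset array allocators vector and free_vector from nrutil:

#include <stdio.h>
#include "nrutil.h"

void fit(float x[], float y[], int ndata, float sig[], int mwt, float *a,
    float *b, float *siga, float *sigb, float *chi2, float *q);

int main(void)
{
    int i,ndata=5;
    float a,b,siga,sigb,chi2,q;
    float *x=vector(1,ndata),*y=vector(1,ndata),*sig=vector(1,ndata);

    for (i=1;i<=ndata;i++) {        /* made-up data on the line y = 2 + 0.5x */
        x[i]=(float)i;
        y[i]=2.0+0.5*i;
        sig[i]=0.1;                 /* claimed measurement error on each y */
    }
    y[3] += 0.05;                   /* perturb one point by half an error bar */

    fit(x,y,ndata,sig,1,&a,&b,&siga,&sigb,&chi2,&q);   /* mwt=1: sig[] is used */
    printf("a = %g +/- %g, b = %g +/- %g, chi2 = %g, q = %g\n",
        a,siga,b,sigb,chi2,q);

    free_vector(sig,1,ndata); free_vector(y,1,ndata); free_vector(x,1,ndata);
    return 0;
}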
CITED REFERENCES AND FURTHER READING:
Bevington, P.R. 1969, Data Reduction and Error Analysis for the Physical Sciences (New York: McGraw-Hill), Chapter 6.

15.3 Straight-Line Data with Errors in Both Coordinates

If experimental data are subject to measurement error not only in the yi's, but also in the xi's, then the task of fitting a straight-line model

    y(x) = a + bx        (15.3.1)

is considerably harder. It is straightforward to write down the χ² merit function for this case,

    \chi^2(a, b) = \sum_{i=1}^{N} \frac{(y_i - a - b x_i)^2}{\sigma_{y\,i}^2 + b^2 \sigma_{x\,i}^2}        (15.3.2)

where σ_{x i} and σ_{y i} are, respectively, the x and y standard deviations for the ith point. The weighted sum of variances in the denominator of equation (15.3.2) can be understood both as the variance in the direction of the smallest χ² between each data point and the line with slope b, and also as the variance of the linear combination yi − a − bxi of two random variables xi and yi,

    \mathrm{Var}(y_i - a - b x_i) = \mathrm{Var}(y_i) + b^2 \mathrm{Var}(x_i) = \sigma_{y\,i}^2 + b^2 \sigma_{x\,i}^2 \equiv 1/w_i        (15.3.3)

The sum of the squares of N random variables, each normalized by its variance, is thus χ²-distributed.

We want to minimize equation (15.3.2) with respect to a and b. Unfortunately, the occurrence of b in the denominator of equation (15.3.2) makes the resulting equation for the slope ∂χ²/∂b = 0 nonlinear. However, the corresponding condition for the intercept, ∂χ²/∂a = 0, is still linear and yields

    a = \frac{\sum_i w_i (y_i - b x_i)}{\sum_i w_i}        (15.3.4)

where the wi's are defined by equation (15.3.3). A reasonable strategy, now, is to use the machinery of Chapter 10 (e.g., the routine brent) for minimizing a general one-dimensional function to minimize with respect to b, while using equation (15.3.4) at each stage to ensure that the minimum with respect to b is also minimized with respect to a.
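A minimal sketch of that strategy follows. The function evaluates χ² of equation (15.3.2) as a function of the slope b alone, computing the optimal intercept from equation (15.3.4) as it goes, so that a general one-dimensional minimizer can be applied to it. The file-scope variables and the function name are our own scaffolding for illustration; the book develops its own routine for this problem later in the section.

#include <math.h>

/* File-scope data for the one-dimensional minimization: our own scaffolding,
   assuming zero-based arrays xx,yy with standard deviations sx,sy, loaded by
   the caller before minimizing. */
static int nn;
static float *xx,*yy,*sx,*sy;

/* chi-square of equation (15.3.2) as a function of the slope b alone. The
   optimal intercept a follows from equation (15.3.4), so minimizing this
   function over b minimizes over both parameters at once. */
float chi2_of_b(float b)
{
    int i;
    float w,sumw=0.0,suma=0.0,a,r,chi2=0.0;

    for (i=0;i<nn;i++) {                 /* w_i of equation (15.3.3) */
        w=1.0/(sy[i]*sy[i]+b*b*sx[i]*sx[i]);
        sumw += w;
        suma += w*(yy[i]-b*xx[i]);
    }
    a=suma/sumw;                         /* equation (15.3.4) */
    for (i=0;i<nn;i++) {                 /* accumulate equation (15.3.2) */
        r=yy[i]-a-b*xx[i];
        chi2 += r*r/(sy[i]*sy[i]+b*b*sx[i]*sx[i]);
    }
    return chi2;
}

Bracketing a minimum in b with mnbrak and then handing chi2_of_b to brent (both from Chapter 10) would return the best-fit slope; the intercept then follows from (15.3.4).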