9.7 Globally Convergent Methods for Nonlinear Systems of Equations 383 F.they often succeed where a direct attack via Newton's method alone fails.The next section deals with these methods. CITED REFERENCES AND FURTHER READING: Acton,F.S.1970,Numerica/Methods That Work;1990,corrected edition (Washington:Mathe- matical Association of America),Chapter 14.[1] Ostrowski,A.M.1966,Solutions of Equations and Systems of Equations,2nd ed.(New York: Academic Press). Ortega,J.,and Rheinboldt,W.1970,Iterative Solution of Nonlinear Equations in Several Vari- ables (New York:Academic Press). 9.7 Globally Convergent Methods for Nonlinear Systems of Equations 品三 2 We have seen that Newton's method for solving nonlinear equations has an unfortunate tendency to wander off into the wild blue yonder if the initial guess is not sufficiently close to the root.A global method is one that converges to a solution Press. from almost any starting point.In this section we will develop an algorithm that combines the rapid local convergence of Newton's method with a globally convergent strategy that will guarantee some progress towards the solution at each iteration. 兰a The algorithm is closely related to the quasi-Newton method of minimization which we will describe in 810.7. Recall our discussion of 89.6:the Newton step for the set of equations 6 F(x)=0 (9.7.1) IS xaew=Xold十x (9.7.2) where 6x=-J-1.F (9.7.3) Numerica 10.621 Here J is the Jacobian matrix.How do we decide whether to accept the Newton step 43106 6x?A reasonable strategy is to require that the step decrease F2=F.F.This is the same requirement we would impose if we were trying to minimize f.F (9.7.4) (Theis for later convenience.)Every solution to (9.7.1)minimizes(9.7.4),but there may be local minima of(9.7.4)that are not solutions to (9.7.1).Thus,as already mentioned,simply applying one of our minimum finding algorithms from Chapter 10 to (9.7.4)is not a good idea. To develop a better strategy,note that the Newton step (9.7.3)is a descent direction for f: 7f.6x=(F.J)·(-J-1.F)=-F.F<0 (9.7.5)
9.7 Globally Convergent Methods for Nonlinear Systems of Equations 383 Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). F, they often succeed where a direct attack via Newton’s method alone fails. The next section deals with these methods. CITED REFERENCES AND FURTHER READING: Acton, F.S. 1970, Numerical Methods That Work; 1990, corrected edition (Washington: Mathematical Association of America), Chapter 14. [1] Ostrowski, A.M. 1966, Solutions of Equations and Systems of Equations, 2nd ed. (New York: Academic Press). Ortega, J., and Rheinboldt, W. 1970, Iterative Solution of Nonlinear Equations in Several Variables (New York: Academic Press). 9.7 Globally Convergent Methods for Nonlinear Systems of Equations We have seen that Newton’s method for solving nonlinear equations has an unfortunate tendency to wander off into the wild blue yonder if the initial guess is not sufficiently close to the root. A global method is one that converges to a solution from almost any starting point. In this section we will develop an algorithm that combines the rapid local convergence of Newton’s method with a globally convergent strategy that will guarantee some progress towards the solution at each iteration. The algorithm is closely related to the quasi-Newton method of minimization which we will describe in §10.7. Recall our discussion of §9.6: the Newton step for the set of equations F(x)=0 (9.7.1) is xnew = xold + δx (9.7.2) where δx = −J−1 · F (9.7.3) Here J is the Jacobian matrix. How do we decide whether to accept the Newton step δx? A reasonable strategy is to require that the step decrease |F| 2 = F · F. This is the same requirement we would impose if we were trying to minimize f = 1 2 F · F (9.7.4) (The 1 2 is for later convenience.) Every solution to (9.7.1) minimizes (9.7.4), but there may be local minima of (9.7.4) that are not solutions to (9.7.1). Thus, as already mentioned, simply applying one of our minimum finding algorithms from Chapter 10 to (9.7.4) is not a good idea. To develop a better strategy, note that the Newton step (9.7.3) is a descent direction for f: ∇f · δx = (F · J) · (−J−1 · F) = −F · F < 0 (9.7.5)
384 Chapter 9.Root Finding and Nonlinear Sets of Equations Thus our strategy is quite simple:We always first try the full Newton step, because once we are close enough to the solution we will get quadratic convergence. However,we check at each iteration that the proposed step reduces f.If not,we backtrack along the Newton direction until we have an acceptable step.Because the Newton step is a descent direction for f,we are guaranteed to find an acceptable step by backtracking.We will discuss the backtracking algorithm in more detail below. Note that this method essentially minimizes f by taking Newton steps designed to bring F to zero.This is not equivalent to minimizing f directly by taking Newton steps designed to bring Vf to zero.While the method can still occasionally fail by landing on a local minimum of f,this is quite rare in practice.The routine newt 81 below will warn you if this happens.The remedy is to try a new starting point. Line Searches and Backtracking When we are not close enough to the minimum of f,taking the full Newton step p =6x need not decrease the function;we may move too far for the quadratic approximation to be valid.All we are guaranteed is that initially f decreases as we move in the Newton direction. RECIPES I So the goal is to move to a new point xnew along the direction of the Newton step p,but 令 not necessarily all the way: xnew=xold+入p,0<入≤1 (9.7.6) Press. The aim is to find A so that f(xo+Ap)has decreased sufficiently.Until the early 1970s, standard practice was to choose Aso that xew exactly minimizes f in the direction p.However, we now know that it is extremely wasteful of function evaluations to do so.A better strategy is as follows:Since p is always the Newton direction in our algorithms,we first try A =1,the full Newton step.This will lead to quadratic convergence when x is sufficiently close to the solution.However,if f(x)does not meet our acceptance criteria,we backtrack along the SCIENTIFIC Newton direction,trying a smaller value of A,until we find a suitable point.Since the Newton direction is a descent direction,we are guaranteed to decrease f for sufficiently small A. 6 What should the criterion for accepting a step be?It is not sufficient to require merely that f(xew)<f(xold).This criterion can fail to converge to a minimum of f in one of two ways.First,it is possible to construct a sequence of steps satisfying this criterion with f decreasing too slowly relative to the step lengths.Second,one can have a sequence where the step lengths are too small relative to the initial rate of decrease of f.(For examples of such sequences,see [1],p.117.) 10621 A simple way to fix the first problem is to require the average rate of decrease of f to Numerica be at least some fraction o of the initial rate of decrease Vf.p: 43106 f(new)≤f(old)+aVf·(Kaew-xold) (9.7.7) Here the parameter a satisfies 0<a<1.We can get away with quite small values of o;10-4 is a good choice. The second problem can be fixed by requiring the rate of decrease of f at xew to be Software. greater than some fraction B of the rate of decrease of f at xold.In practice,we will not need to impose this second constraint because our backtracking algorithm will have a built-in cutoff to avoid taking steps that are too small. Here is the strategy for a practical backtracking routine:Define g(入)三f(xold+入p) (9.7.8) so that g(入)=7f.p (9.7.9) If we need to backtrack,then we model g with the most current information we have and choose A to minimize the model.We start with g(0)and g(0)available.The first step is
384 Chapter 9. Root Finding and Nonlinear Sets of Equations Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). Thus our strategy is quite simple: We always first try the full Newton step, because once we are close enough to the solution we will get quadratic convergence. However, we check at each iteration that the proposed step reduces f. If not, we backtrack along the Newton direction until we have an acceptable step. Because the Newton step is a descent direction for f, we are guaranteed to find an acceptable step by backtracking. We will discuss the backtracking algorithm in more detail below. Note that this method essentially minimizes f by taking Newton steps designed to bring F to zero. This is not equivalent to minimizing f directly by taking Newton steps designed to bring ∇f to zero. While the method can still occasionally fail by landing on a local minimum of f, this is quite rare in practice. The routine newt below will warn you if this happens. The remedy is to try a new starting point. Line Searches and Backtracking When we are not close enough to the minimum of f, taking the full Newton step p = δx need not decrease the function; we may move too far for the quadratic approximation to be valid. All we are guaranteed is that initially f decreases as we move in the Newton direction. So the goal is to move to a new point xnew along the direction of the Newton step p, but not necessarily all the way: xnew = xold + λp, 0 < λ ≤ 1 (9.7.6) The aim is to find λ so that f(xold + λp) has decreased sufficiently. Until the early 1970s, standard practice was to choose λ so that xnew exactly minimizes f in the direction p. However, we now know that it is extremely wasteful of function evaluations to do so. A better strategy is as follows: Since p is always the Newton direction in our algorithms, we first try λ = 1, the full Newton step. This will lead to quadratic convergence when x is sufficiently close to the solution. However, if f(xnew) does not meet our acceptance criteria, we backtrack along the Newton direction, trying a smaller value of λ, until we find a suitable point. Since the Newton direction is a descent direction, we are guaranteed to decrease f for sufficiently small λ. What should the criterion for accepting a step be? It is not sufficient to require merely that f(xnew) < f(xold). This criterion can fail to converge to a minimum of f in one of two ways. First, it is possible to construct a sequence of steps satisfying this criterion with f decreasing too slowly relative to the step lengths. Second, one can have a sequence where the step lengths are too small relative to the initial rate of decrease of f. (For examples of such sequences, see [1], p. 117.) A simple way to fix the first problem is to require the average rate of decrease of f to be at least some fraction α of the initial rate of decrease ∇f · p: f(xnew) ≤ f(xold) + α∇f · (xnew − xold) (9.7.7) Here the parameter α satisfies 0 <α< 1. We can get away with quite small values of α; α = 10−4 is a good choice. The second problem can be fixed by requiring the rate of decrease of f at xnew to be greater than some fraction β of the rate of decrease of f at xold. In practice, we will not need to impose this second constraint because our backtracking algorithm will have a built-in cutoff to avoid taking steps that are too small. Here is the strategy for a practical backtracking routine: Define g(λ) ≡ f(xold + λp) (9.7.8) so that g (λ) = ∇f · p (9.7.9) If we need to backtrack, then we model g with the most current information we have and choose λ to minimize the model. We start with g(0) and g (0) available. The first step is
9.7 Globally Convergent Methods for Nonlinear Systems of Equations 385 always the Newton step,A=1.If this step is not acceptable,we have available g(1)as well. We can therefore model g()as a quadratic: g(A)≈[g(1)-g(0)-g'(0]A2+g'(O)入+g(0) (9.7.10) Taking the derivative of this quadratic,we find that it is a minimum when 入 g'(0) 2[g(1)-g(0)-g(0] (9.7.11) Since the Newton step failed,we can show that for small a.We need to guard against too small a value of入,however.We set入min=O.l. On second and subsequent backtracks,we model g as a cubic in A,using the previous 81 value g(1)and the second most recent value g(2): g(A)=aA3+bA2+g(0)入+g(0) (9.7.12) nted for Requiring this expression to give the correct values of g at A and A2 gives two equations that can be solved for the coefficients a and b: [=[] a 1/八足-1/A3 「g(A1)-g(0)A1-g(0)1 (9.7.13) 19(A2)-g(0)A2-g(0)」 The minimum of the cubic (9.7.12)is at λ= -b+V62-3ag'(0) (9.7.14) 3a America Press. We enforce that入lie between入max=0.5入1 and Amin=0.lAi. The routine has two additional features,a minimum step length alamin and a maximum step length stpmax.Insrch will also be used in the quasi-Newton minimization routine Programs dfpmin in the next section. SCIENTIFIC #include #include "nrutil.h" #define ALF 1.0e-4 Ensures sufficient decrease in function value to dir #define TOLX 1.0e-7 Convergence criterion on Ax. void Insrch(int n,float xold[],float fold,float g[],float p[],float x[], float *f,float stpmax,int *check,float (*func)(float []) 1920 COMPUTING(ISBN Given an n-dimensional point xold[1..n],the value of the function and gradient there,fold 包 and g[1..n],and a direction p[1..n],finds a new point x[1..n]along the direction p from xold where the function func has decreased "sufficiently."The new function value is returned in f.stpmax is an input quantity that limits the length of the steps so that you do not try to evaluate the function in regions where it is undefined or subject to overflow.p is usually the Newton direction.The output quantity check is false(0)on a normal exit.It is true (1)when idge.org Numerical Recipes 10-621 -43108 x is too close to xold.In a minimization algorithm,this usually signals convergence and can be ignored.However,in a zero-finding algorithm the calling program should check whether the convergence is spurious.Some "difficult"problems may require double precision in this routine. (outside int i: North Software. float a,alam,alam2,alamin,b,disc,f2,rhs1,rhs2,slope,sum,temp, test,tmplam; *check=0; for(sum=0.0,1=1;10.0)nrerror("Roundoff problem in Insrch."); test=0.0; Compute入min for(i=1;1<n;i+)[
9.7 Globally Convergent Methods for Nonlinear Systems of Equations 385 Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). always the Newton step, λ = 1. If this step is not acceptable, we have available g(1) as well. We can therefore model g(λ) as a quadratic: g(λ) ≈ [g(1) − g(0) − g (0)]λ2 + g (0)λ + g(0) (9.7.10) Taking the derivative of this quadratic, we find that it is a minimum when λ = − g (0) 2[g(1) − g(0) − g (0)] (9.7.11) Since the Newton step failed, we can show that λ #include "nrutil.h" #define ALF 1.0e-4 Ensures sufficient decrease in function value. #define TOLX 1.0e-7 Convergence criterion on ∆x. void lnsrch(int n, float xold[], float fold, float g[], float p[], float x[], float *f, float stpmax, int *check, float (*func)(float [])) Given an n-dimensional point xold[1..n], the value of the function and gradient there, fold and g[1..n], and a direction p[1..n], finds a new point x[1..n] along the direction p from xold where the function func has decreased “sufficiently.” The new function value is returned in f. stpmax is an input quantity that limits the length of the steps so that you do not try to evaluate the function in regions where it is undefined or subject to overflow. p is usually the Newton direction. The output quantity check is false (0) on a normal exit. It is true (1) when x is too close to xold. In a minimization algorithm, this usually signals convergence and can be ignored. However, in a zero-finding algorithm the calling program should check whether the convergence is spurious. Some “difficult” problems may require double precision in this routine. { int i; float a,alam,alam2,alamin,b,disc,f2,rhs1,rhs2,slope,sum,temp, test,tmplam; *check=0; for (sum=0.0,i=1;i stpmax) for (i=1;i= 0.0) nrerror("Roundoff problem in lnsrch."); test=0.0; Compute λmin. for (i=1;i<=n;i++) {
386 Chapter 9.Root Finding and Nonlinear Sets of Equations temp=fabs(p[i])/FMAX(fabs(xold[i]),1.0); if (temp test)test=temp; alamin=TOLX/test; alam=1.0: Always try full Newton step first. for(;;){ Start of iteration loop. for (i=1;i0.5*alam) tmplam=0.5*alam; 入≤0.5λ1 Programs alam2=alam: for their f2=*f; alam=FMAX(tmplam,0.1*alam); λ20.1入1- Try again. to dir Here now is the globally convergent Newton routine newt that uses Insrch.A feature 1788-1982 SCIENTIFIC COMPUTING(ISBN of newt is that you need not supply the Jacobian matrix analytically;the routine will attempt to compute the necessary partial derivatives of F by finite differences in the routine fdjac.This routine uses some of the techniques described in $5.7 for computing numerical derivatives.Of course,you can always replace fdjac with a routine that calculates the Jacobian analytically 10-621 if this is easy for you to do. #include Numerical Recipes -43108 #include "nrutil.h" #define MAXITS 200 #define TOLF 1.0e-4 (outside #define TOLMIN 1.0e-6 Software. #define TOLX 1.0e-7 North #define STPMX 100.0 Here MAXITS is the maximum number of iterations;TOLF sets the convergence criterion on function values:TOLMIN sets the criterion for deciding whether spurious convergence to a minimum of fmin has occurred;TOLX is the convergence criterion on ox;STPMX is the scaled maximum step length allowed in line searches. int nni Global variables to communicate with fmin. float *fvec; void (*nrfuncv)(int n,float v[],float f[]); #define FREERETURN {free_vector(fvec,1,n);free_vector(xold,1,n);\ free_vector(p,1,n);free_vector(g,1,n);free_matrix(fjac,1,n,1,n);\ free_ivector(indx,1,n);return;
386 Chapter 9. Root Finding and Nonlinear Sets of Equations Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). temp=fabs(p[i])/FMAX(fabs(xold[i]),1.0); if (temp > test) test=temp; } alamin=TOLX/test; alam=1.0; Always try full Newton step first. for (;;) { Start of iteration loop. for (i=1;i 0.5*alam) tmplam=0.5*alam; λ ≤ 0.5λ1. } } alam2=alam; f2 = *f; alam=FMAX(tmplam,0.1*alam); λ ≥ 0.1λ1. } Try again. } Here now is the globally convergent Newton routine newt that uses lnsrch. A feature of newt is that you need not supply the Jacobian matrix analytically; the routine will attempt to compute the necessary partial derivatives of F by finite differences in the routine fdjac. This routine uses some of the techniques described in §5.7 for computing numerical derivatives. Of course, you can always replace fdjac with a routine that calculates the Jacobian analytically if this is easy for you to do. #include #include "nrutil.h" #define MAXITS 200 #define TOLF 1.0e-4 #define TOLMIN 1.0e-6 #define TOLX 1.0e-7 #define STPMX 100.0 Here MAXITS is the maximum number of iterations; TOLF sets the convergence criterion on function values; TOLMIN sets the criterion for deciding whether spurious convergence to a minimum of fmin has occurred; TOLX is the convergence criterion on δx; STPMX is the scaled maximum step length allowed in line searches. int nn; Global variables to communicate with fmin. float *fvec; void (*nrfuncv)(int n, float v[], float f[]); #define FREERETURN {free_vector(fvec,1,n);free_vector(xold,1,n);\ free_vector(p,1,n);free_vector(g,1,n);free_matrix(fjac,1,n,1,n);\ free_ivector(indx,1,n);return;}
9.7 Globally Convergent Methods for Nonlinear Systems of Equations 387 void newt(float x[],int n,int *check, void(*vecfunc)(int,float☐,float0)) Given an initial guess x [1..n]for a root in n dimensions,find the root by a globally convergent Newton's method.The vector of functions to be zeroed,called fvec [1..n]in the routine below,is returned by the user-supplied routine vecfunc(n,x,fvec).The output quantity check is false (0)on a normal return and true (1)if the routine has converged to a local minimum of the function fmin defined below.In this case try restarting from a different initial guess. void fdjac(intn,f1oatx☐,float fvec☐,float**df, void (*vecfunc)(int,float []float [])) f1 oat fmin(f1oatx☐); void Insrch(int n,float xold[],float fold,float g[],float p[],float x[], float *f,float stpmax,int *check,float (*func)(float [])) 8: void lubksb(float **a,int n,int *indx,float b); void ludcmp(float **a,int n,int *indx,float *d); nted for 1881992 int i,its,j,*indx; float d,den,f,fold,stpmax,sum,temp,test,**fjac,*g,*p,*xold; 1-800 indx=ivector(1,n); fjac=matrix(1,n,1,n); NUMERICAL RECIPES g=vector(1,n); p=vector(1,n): server 可 xold=vector(1,n); 2 fvec=vector(1,n); Define global variables. (Nort nn=n; nrfuncv=vecfunc; America computer, Press. THE f=fmin(x); fvec is also computed by this call. ART test=0.0; Test for initial guess being a root.Use for(i=1;1test)test=fabs(fvec[i]); Programs if (test 0.01*TOLF){ 米check=0: FREERETURN for(sum=0.0,1=1;1test)test=fabs(fvec[i]); if (test TOLF){ *check=0; FREERETURN if (*check){ Check for gradient of f zero,i.e.spuri- test=0.0: ous convergence. den=FMAX(f,0.5*n); for(i=1;i<=n;i++){ temp=fabs(g[i])*FMAX(fabs(x[i]),1.0)/den;
9.7 Globally Convergent Methods for Nonlinear Systems of Equations 387 Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). void newt(float x[], int n, int *check, void (*vecfunc)(int, float [], float [])) Given an initial guess x[1..n] for a root in n dimensions, find the root by a globally convergent Newton’s method. The vector of functions to be zeroed, called fvec[1..n] in the routine below, is returned by the user-supplied routine vecfunc(n,x,fvec). The output quantity check is false (0) on a normal return and true (1) if the routine has converged to a local minimum of the function fmin defined below. In this case try restarting from a different initial guess. { void fdjac(int n, float x[], float fvec[], float **df, void (*vecfunc)(int, float [], float [])); float fmin(float x[]); void lnsrch(int n, float xold[], float fold, float g[], float p[], float x[], float *f, float stpmax, int *check, float (*func)(float [])); void lubksb(float **a, int n, int *indx, float b[]); void ludcmp(float **a, int n, int *indx, float *d); int i,its,j,*indx; float d,den,f,fold,stpmax,sum,temp,test,**fjac,*g,*p,*xold; indx=ivector(1,n); fjac=matrix(1,n,1,n); g=vector(1,n); p=vector(1,n); xold=vector(1,n); fvec=vector(1,n); Define global variables. nn=n; nrfuncv=vecfunc; f=fmin(x); fvec is also computed by this call. test=0.0; Test for initial guess being a root. Use for (i=1;i test) test=fabs(fvec[i]); if (test test) test=fabs(fvec[i]); if (test < TOLF) { *check=0; FREERETURN } if (*check) { Check for gradient of f zero, i.e., spuritest=0.0; ous convergence. den=FMAX(f,0.5*n); for (i=1;i<=n;i++) { temp=fabs(g[i])*FMAX(fabs(x[i]),1.0)/den;
388 Chapter 9.Root Finding and Nonlinear Sets of Equations if (temp test)test-temp; *check=(test TOLMIN 1 0); FREERETURN test=0.0; Test for convergence on 6x. for(i=1;1 #include "nrutil.h" 11-600 #define EPS 1.0e-4 Approximate square root of the machine precision. void fdjac(int n,float x[,float fvec[],float **df, /Cambridge void(*vecfunc)(int,float☐,f1oat☐)) from NUMERICAL RECIPES IN 19881992 Computes forward-difference approximation to Jacobian.On input,x[1..n]is the point at which the Jacobian is to be evaluated,fvec[1..n]is the vector of function values at the point,and vecfunc(n,x,f)is a user-supplied routine that returns the vector of functions at (Nort server users to make x.On output,df [1..n][1..n]is the Jacobian array. Ameri computer, one paper UnN电.t THE int i,j; ART float h,temp,*f; 是 f=vector(1,n); for (j=1;j<=n;j++) temp=x[j]; h=EPS*fabs(temp); st st Programs 1f(h=0.0)h=EPS: x[j]=temp+h; Trick to reduce finite precision error. h=x[j]-temp; to dir (*vecfunc)(n,x,f); x[j]=temp; rectcustser 18881920 OF SCIENTIFIC COMPUTING(ISBN for (i=1;i<=n;i++)df[i][j]=(f[i]-fvec[i])/h; Forward difference for- mula. free_vector(f,1,n); v@cam 10-621 #include "nrutil.h" idge.org .Further reproduction, Numerical Recipes -43108 extern int nn; extern float *fvec; (outside extern void (*nrfuncv)(int n,float v[],float f[]); Software. float fmin(float x[]) Returns f=F.F at x.The global pointer *nrfuncv points to a routine that returns the Ame vector of functions at x.It is set to point to a user-supplied routine in the calling program. Global variables also communicate the function values back to the calling program. 1nt1: float sum; (*nrfuncv)(nn,x,fvec); for (sum=0.0,i=1;i<=nn;i++)sum +SQR(fvec [i]); return 0.5*sum;
388 Chapter 9. Root Finding and Nonlinear Sets of Equations Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). if (temp > test) test=temp; } *check=(test test) test=temp; } if (test #include "nrutil.h" #define EPS 1.0e-4 Approximate square root of the machine precision. void fdjac(int n, float x[], float fvec[], float **df, void (*vecfunc)(int, float [], float [])) Computes forward-difference approximation to Jacobian. On input, x[1..n] is the point at which the Jacobian is to be evaluated, fvec[1..n] is the vector of function values at the point, and vecfunc(n,x,f) is a user-supplied routine that returns the vector of functions at x. On output, df[1..n][1..n] is the Jacobian array. { int i,j; float h,temp,*f; f=vector(1,n); for (j=1;j<=n;j++) { temp=x[j]; h=EPS*fabs(temp); if (h == 0.0) h=EPS; x[j]=temp+h; Trick to reduce finite precision error. h=x[j]-temp; (*vecfunc)(n,x,f); x[j]=temp; for (i=1;i<=n;i++) df[i][j]=(f[i]-fvec[i])/h; Forward difference for- } mula. free_vector(f,1,n); } #include "nrutil.h" extern int nn; extern float *fvec; extern void (*nrfuncv)(int n, float v[], float f[]); float fmin(float x[]) Returns f = 1 2 F · F at x. The global pointer *nrfuncv points to a routine that returns the vector of functions at x. It is set to point to a user-supplied routine in the calling program. Global variables also communicate the function values back to the calling program. { int i; float sum; (*nrfuncv)(nn,x,fvec); for (sum=0.0,i=1;i<=nn;i++) sum += SQR(fvec[i]); return 0.5*sum; }
9.7 Globally Convergent Methods for Nonlinear Systems of Equations 389 The routine newt assumes that typical values of all components of x and of F are of order unity,and it can fail if this assumption is badly violated.You should rescale the variables by their typical values before invoking newt if this problem occurs. Multidimensional Secant Methods:Broyden's Method Newton's method as implemented above is quite powerful,but it still has several disadvantages. One drawback is that the Jacobian matrix is needed.In many problems analytic derivatives are unavailable.If function evaluation is expensive,then the cost of finite-difference determination of the Jacobian can be prohibitive. Just as the quasi-Newton methods to be discussed in $10.7 provide cheap approximations for the Hessian matrix in minimization algorithms,there are quasi-Newton methods that provide cheap approximations to the Jacobian for zero finding.These methods are often called secant methods,since they reduce to the secant method (89.2)in one dimension (see,e.g.,[1]). The best of these methods still seems to be the first one introduced,Broyden's method[2]. Let us denote the approximate Jacobian by B.Then the ith quasi-Newton step x ICAL is the solution of B:·6x:=-F (9.7.15) where x;=x+1-x(cf.equation 9.7.3).The quasi-Newton or secant condition is that B+1 satisfy 0A7 B:+1·6x1=6F: (9.7.16) where 6F:=F:+1-F:.This is the generalization of the one-dimensional secant approxima- tion to the derivative,6F/6x.However,equation (9.7.16)does not determine B+1 uniquely 9 in more than one dimension. Many different auxiliary conditions to pin down B+1 have been explored,but the best-performing algorithm in practice results from Broyden's formula.This formula is based on the idea of getting B+by making the least change to B.consistent with the secant equation (9.7.16).Broyden showed that the resulting formula is B+1=B,+(6E,-B·x)© (9.7.17) 6xi·dxi You can easily check that B;+satisfies (9.7.16). Early implementations of Broyden's method used the Sherman-Morrison formula, equation (2.7.2),to invert equation (9.7.17)analytically, B=B1+-B·,)8B Numer (9.7.18) 6x1·B.6F -43126 Then instead of solving equation(9.7.3)by e.g.,LU decomposition,one determined 6x:=-B,1·F (9.7.19) by matrix multiplication in O(N2)operations.The disadvantage of this method is that it cannot easily be embedded in a globally convergent strategy,for which the gradient of equation(9.7.4)requires B,not B-1, V(F.F)≥BT.F (9.7.20) Accordingly,we implement the update formula in the form (9.7.17). However,we can still preserve the O(N)solution of(9.7.3)by using QR decomposition (82.10)instead of LU decomposition.The reason is that because ofthe special form ofequation (9.7.17),the QR decomposition of B:can be updated into the QR decomposition of B+in O(N2)operations (82.10).All we need is an initial approximation Bo to start the ball rolling. It is often acceptable to start simply with the identity matrix,and then allow O(N)updates to produce a reasonable approximation to the Jacobian.We prefer to spend the first N function evaluations on a finite-difference approximation to initialize B via a call to fdjac
9.7 Globally Convergent Methods for Nonlinear Systems of Equations 389 Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). The routine newt assumes that typical values of all components of x and of F are of order unity, and it can fail if this assumption is badly violated. You should rescale the variables by their typical values before invoking newt if this problem occurs. Multidimensional Secant Methods: Broyden’s Method Newton’s method as implemented above is quite powerful, but it still has several disadvantages. One drawback is that the Jacobian matrix is needed. In many problems analytic derivatives are unavailable. If function evaluation is expensive, then the cost of finite-difference determination of the Jacobian can be prohibitive. Just as the quasi-Newton methods to be discussed in §10.7 provide cheap approximations for the Hessian matrix in minimization algorithms, there are quasi-Newton methods that provide cheap approximations to the Jacobian for zero finding. These methods are often called secant methods, since they reduce to the secant method (§9.2) in one dimension (see, e.g., [1]). The best of these methods still seems to be the first one introduced, Broyden’s method [2]. Let us denote the approximate Jacobian by B. Then the ith quasi-Newton step δxi is the solution of Bi · δxi = −Fi (9.7.15) where δxi = xi+1 − xi (cf. equation 9.7.3). The quasi-Newton or secant condition is that Bi+1 satisfy Bi+1 · δxi = δFi (9.7.16) where δFi = Fi+1 − Fi. This is the generalization of the one-dimensional secant approximation to the derivative, δF/δx. However, equation (9.7.16) does not determine Bi+1 uniquely in more than one dimension. Many different auxiliary conditions to pin down Bi+1 have been explored, but the best-performing algorithm in practice results from Broyden’s formula. This formula is based on the idea of getting Bi+1 by making the least change to Bi consistent with the secant equation (9.7.16). Broyden showed that the resulting formula is Bi+1 = Bi + (δFi − Bi · δxi) ⊗ δxi δxi · δxi (9.7.17) You can easily check that Bi+1 satisfies (9.7.16). Early implementations of Broyden’s method used the Sherman-Morrison formula, equation (2.7.2), to invert equation (9.7.17) analytically, B−1 i+1 = B−1 i + (δxi − B−1 i · δFi) ⊗ δxi · B−1 i δxi · B−1 i · δFi (9.7.18) Then instead of solving equation (9.7.3) by e.g., LU decomposition, one determined δxi = −B−1 i · Fi (9.7.19) by matrix multiplication in O(N2) operations. The disadvantage of this method is that it cannot easily be embedded in a globally convergent strategy, for which the gradient of equation (9.7.4) requires B, not B−1, ∇( 1 2 F · F) BT · F (9.7.20) Accordingly, we implement the update formula in the form (9.7.17). However, we can still preserve the O(N2)solution of (9.7.3) by using QR decomposition (§2.10) instead ofLU decomposition. The reason is that because of the special form of equation (9.7.17), the QR decomposition of Bi can be updated into the QR decomposition of Bi+1 in O(N2) operations (§2.10). All we need is an initial approximation B0 to start the ball rolling. It is often acceptable to start simply with the identity matrix, and then allow O(N) updates to produce a reasonable approximation to the Jacobian. We prefer to spend the first N function evaluations on a finite-difference approximation to initialize B via a call to fdjac
390 Chapter 9.Root Finding and Nonlinear Sets of Equations Since B is not the exact Jacobian,we are not guaranteed that &x is a descent direction for f =F.F(cf.equation 9.7.5).Thus the line search algorithm can fail to return a suitable step if B wanders far from the true Jacobian.In this case,we reinitialize B by another call to fdjac Like the secant method in one dimension,Broyden's method converges superlinearly once you get close enough to the root.Embedded in a global strategy,it is almost as robust as Newton's method,and often needs far fewer function evaluations to determine a zero. Note that the final value of B is not always close to the true Jacobian at the root,even when the method converges. The routine broydn given below is very similar to newt in organization.The principal differences are the use of QR decomposition instead of LU,and the updating formula instead of directly determining the Jacobian.The remarks at the end of newt about scaling the variables apply equally to broydn. #include granted for #include "nrutil.h" #define MAXITS 200 1-800 #define EPS 1.0e-7 #define TOLF 1.0e-4 #define TOLX EPS from NUMERICAL RECIPESI 18881892 #define STPMX 100.0 #define TOLMIN 1.0e-6 Here MAXITS is the maximum number of iterations;EPS is a number close to the machine (Nort server precision:TOLF is the convergence criterion on function values;TOLX is the convergence criterion on 6x;STPMX is the scaled maximum step length allowed in line searches;TOLMIN is used to decide whether spurious convergence to a minimum of fmin has occurred. computer Americ Press. #define FREERETURN {free_vector(fvec,1,n);free_vector(xold,1,n);\ ART free_vector(w,1,n);free_vector(t,1,n);free_vector(s,1,n);\ free_matrix(r,1,n,1,n);free_matrix(qt,1,n,1,n);free_vector(p,1,n);\ 9 Programs free_vector(g,1,n);free_vector(fvcold,1,n);free_vector(d,1,n);\ free_vector(c,1,n);return;} int nn; Global variables to communicate with fmin. float *fvec; void (*nrfuncv)(int n,float v[],float f[]); to dir void broydn(float x[],int n,int *check, void(*vecfunc)(int,f1oat☐,f1oat☐)) SCIENTIFIC COMPUTING(ISBN Given an initial guess x[1..n]for a root in n dimensions,find the root by Broyden's method 19881992 embedded in a globally convergent strategy.The vector of functions to be zeroed,called fvec [1..n]in the routine below,is returned by the user-supplied routine vecfunc(n,x,fvec) The routine fdjac and the function fmin from newt are used.The output quantity check is false (0)on a normal return and true (1)if the routine has converged to a local minimum of the function fmin or if Broyden's method can make no further progress.In this case try Numerical Recipes 10-621 -43108 restarting from a different initial guess. f void fdjac(intn,float x[☐,f1 oat fvec[☐,float**df void(*vecfunc)(int,f1oat[],f1oat□)); (outside f1 oat fmin(f1oatx☐); Software. void Insrch(int n,float xold[],float fold,float g[],float p[],float x[], North float *f,float stpmax,int *check,float (*func)(float [])) void grdcmp(float **a,int n,float *c,float *d,int *sing); Ame void qrupdt(f1oat**r,f1oat**qt,intn,f1oatu☐,f1oatv☐); void rsolv(float *a,int n,float d],float b[]); int i,its,j,k,restrt,sing,skip; float den,f,fold,stpmax,sum,temp,test,*c,*d,*fvcold; float *g,*p,**qt,**r,*s,*t,*w,*xold; c=vector(1,n); d=vector(1,n); fvcold=vector(1,n); g=vector(1,n); p=vector(1,n);
390 Chapter 9. Root Finding and Nonlinear Sets of Equations Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). Since B is not the exact Jacobian, we are not guaranteed that δx is a descent direction for f = 1 2 F·F (cf. equation 9.7.5). Thus the line search algorithm can fail to return a suitable step if B wanders far from the true Jacobian. In this case, we reinitialize B by another call to fdjac. Like the secant method in one dimension, Broyden’s method converges superlinearly once you get close enough to the root. Embedded in a global strategy, it is almost as robust as Newton’s method, and often needs far fewer function evaluations to determine a zero. Note that the final value of B is not always close to the true Jacobian at the root, even when the method converges. The routine broydn given below is very similar to newt in organization. The principal differences are the use of QR decomposition instead of LU, and the updating formula instead of directly determining the Jacobian. The remarks at the end of newt about scaling the variables apply equally to broydn. #include #include "nrutil.h" #define MAXITS 200 #define EPS 1.0e-7 #define TOLF 1.0e-4 #define TOLX EPS #define STPMX 100.0 #define TOLMIN 1.0e-6 Here MAXITS is the maximum number of iterations; EPS is a number close to the machine precision; TOLF is the convergence criterion on function values; TOLX is the convergence criterion on δx; STPMX is the scaled maximum step length allowed in line searches; TOLMIN is used to decide whether spurious convergence to a minimum of fmin has occurred. #define FREERETURN {free_vector(fvec,1,n);free_vector(xold,1,n);\ free_vector(w,1,n);free_vector(t,1,n);free_vector(s,1,n);\ free_matrix(r,1,n,1,n);free_matrix(qt,1,n,1,n);free_vector(p,1,n);\ free_vector(g,1,n);free_vector(fvcold,1,n);free_vector(d,1,n);\ free_vector(c,1,n);return;} int nn; Global variables to communicate with fmin. float *fvec; void (*nrfuncv)(int n, float v[], float f[]); void broydn(float x[], int n, int *check, void (*vecfunc)(int, float [], float [])) Given an initial guess x[1..n] for a root in n dimensions, find the root by Broyden’s method embedded in a globally convergent strategy. The vector of functions to be zeroed, called fvec[1..n] in the routine below, is returned by the user-supplied routine vecfunc(n,x,fvec). The routine fdjac and the function fmin from newt are used. The output quantity check is false (0) on a normal return and true (1) if the routine has converged to a local minimum of the function fmin or if Broyden’s method can make no further progress. In this case try restarting from a different initial guess. { void fdjac(int n, float x[], float fvec[], float **df, void (*vecfunc)(int, float [], float [])); float fmin(float x[]); void lnsrch(int n, float xold[], float fold, float g[], float p[], float x[], float *f, float stpmax, int *check, float (*func)(float [])); void qrdcmp(float **a, int n, float *c, float *d, int *sing); void qrupdt(float **r, float **qt, int n, float u[], float v[]); void rsolv(float **a, int n, float d[], float b[]); int i,its,j,k,restrt,sing,skip; float den,f,fold,stpmax,sum,temp,test,*c,*d,*fvcold; float *g,*p,**qt,**r,*s,*t,*w,*xold; c=vector(1,n); d=vector(1,n); fvcold=vector(1,n); g=vector(1,n); p=vector(1,n);
9.7 Globally Convergent Methods for Nonlinear Systems of Equations 391 qt=matrix(1,n,1,n); r=matrix(1,n,1.n); s=vector(1,n); t=vector(1,n); w=vector(1,n); xold=vector(1,n); fvec=vector(1,n); Define global variables. nn=n; nrfuncv-vecfunc; f=fmin(x); The vector fvec is also computed by this test=0.0; call. for(i=1:1test)test=fabs(fvec[i]); stringent test than sim- if (test 0.01*TOLF) ply TOLF】 *check=0; 8 includi FREERETURN 鱼 19881992 for(sum=0.0,1=1;1=EPS*(fabs(fvec[i])+fabs(fvcold[i])))skip=0; Don't update with noisy components of w. e1sew[1]=0.0; if (!skip){ for(i=1;i<n;1+){ t=QT.w for (sum=0.0,j=1;j<=n;j++)sum +qt[i][j]*w[j]; t[i]=sum;
9.7 Globally Convergent Methods for Nonlinear Systems of Equations 391 Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). qt=matrix(1,n,1,n); r=matrix(1,n,1,n); s=vector(1,n); t=vector(1,n); w=vector(1,n); xold=vector(1,n); fvec=vector(1,n); Define global variables. nn=n; nrfuncv=vecfunc; f=fmin(x); The vector fvec is also computed by this test=0.0; call. for (i=1;i test)test=fabs(fvec[i]); if (test = EPS*(fabs(fvec[i])+fabs(fvcold[i]))) skip=0; Don’t update with noisy components of w. else w[i]=0.0; } if (!skip) { for (i=1;i<=n;i++) { t = QT · w. for (sum=0.0,j=1;j<=n;j++) sum += qt[i][j]*w[j]; t[i]=sum; }
392 Chapter 9. Root Finding and Nonlinear Sets of Equations for (den=0.0,i=1;i=1;i--){ Computef(Q.R)T.F for the line search. for(sum=0.0,j=1:jtest) test=fabs(fvec[i]); one paper 是 ART if (test TOLF){ *check=0; FREERETURN Programs if (*check){ True if line search failed to find a new x. if (restrt)FREERETURN Failure:already tried reinitializing the Jaco- strictly prohibited else bian. test=0.0; Check for gradient of f zero,i.e. spurious to dir den=FMAX(f,0.5*n); convergence. for(i=1;1<=n;i+){ temp=fabs(g[i])*FMAX(fabs(x [i]),1.0)/den; OF SCIENTIFIC COMPUTING(ISBN if (temp test)test=temp; if (test TOLMIN)FREERETURN else restrt=1; Try reinitializing the Jacobian 10-621 else Successful step:will use Broyden update for restrt=0; next step. Numerical Recipes books or .Further reproduction, 1988-1992 by Numerical Recipes test=0.0; Test for convergence on x. 431085 for(i=1;i<=n;1++)C temp=(fabs(x [i]-xold[i]))/FMAX(fabs(x[i]),1.0); if (temp test)test=temp; (outside North Amer Software. if (test TOLX)FREERETURN nrerror("MAXITS exceeded in broydn"); FREERETURN
392 Chapter 9. Root Finding and Nonlinear Sets of Equations Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). for (den=0.0,i=1;i=1;i--) { Compute ∇f ≈ (Q · R)T · F for the line search. for (sum=0.0,j=1;j test) test=fabs(fvec[i]); if (test test) test=temp; } if (test test) test=temp; } if (test < TOLX) FREERETURN } } nrerror("MAXITS exceeded in broydn"); FREERETURN }