Chapter 10. Minimization or Maximization of Functions

form (10.6.1). Recall that, starting with an arbitrary initial vector $\mathbf{g}_0$ and letting $\mathbf{h}_0 = \mathbf{g}_0$, the conjugate gradient method constructs two sequences of vectors from the recurrence

$$\mathbf{g}_{i+1} = \mathbf{g}_i - \lambda_i\,\mathbf{A}\cdot\mathbf{h}_i \qquad \mathbf{h}_{i+1} = \mathbf{g}_{i+1} + \gamma_i\,\mathbf{h}_i \qquad i = 0, 1, 2, \ldots \tag{10.6.2}$$

The vectors satisfy the orthogonality and conjugacy conditions

$$\mathbf{g}_i\cdot\mathbf{g}_j = 0 \qquad \mathbf{h}_i\cdot\mathbf{A}\cdot\mathbf{h}_j = 0 \qquad \mathbf{g}_i\cdot\mathbf{h}_j = 0 \qquad j < i \tag{10.6.3}$$

The scalars $\lambda_i$ and $\gamma_i$ are given by

$$\lambda_i = \frac{\mathbf{g}_i\cdot\mathbf{g}_i}{\mathbf{h}_i\cdot\mathbf{A}\cdot\mathbf{h}_i} = \frac{\mathbf{g}_i\cdot\mathbf{h}_i}{\mathbf{h}_i\cdot\mathbf{A}\cdot\mathbf{h}_i} \tag{10.6.4}$$

$$\gamma_i = \frac{\mathbf{g}_{i+1}\cdot\mathbf{g}_{i+1}}{\mathbf{g}_i\cdot\mathbf{g}_i} \tag{10.6.5}$$

Equations (10.6.2)–(10.6.5) are simply equations (2.7.32)–(2.7.35) for a symmetric A in a new notation. (A self-contained derivation of these results in the context of function minimization is given by Polak [1].)

Now suppose that we knew the Hessian matrix A in equation (10.6.1). Then we could use the construction (10.6.2) to find successively conjugate directions $\mathbf{h}_i$ along which to line-minimize. After N such, we would efficiently have arrived at the minimum of the quadratic form. But we don't know A.

Here is a remarkable theorem to save the day: Suppose we happen to have $\mathbf{g}_i = -\nabla f(P_i)$, for some point $P_i$, where $f$ is of the form (10.6.1). Suppose that we proceed from $P_i$ along the direction $\mathbf{h}_i$ to the local minimum of $f$ located at some point $P_{i+1}$ and then set $\mathbf{g}_{i+1} = -\nabla f(P_{i+1})$. Then, this $\mathbf{g}_{i+1}$ is the same vector as would have been constructed by equation (10.6.2). (And we have constructed it without knowledge of A!)

Proof: By equation (10.5.3), $\mathbf{g}_i = -\mathbf{A}\cdot P_i + \mathbf{b}$, and

$$\mathbf{g}_{i+1} = -\mathbf{A}\cdot(P_i + \lambda\,\mathbf{h}_i) + \mathbf{b} = \mathbf{g}_i - \lambda\,\mathbf{A}\cdot\mathbf{h}_i \tag{10.6.6}$$

with $\lambda$ chosen to take us to the line minimum. But at the line minimum $\mathbf{h}_i\cdot\nabla f = -\mathbf{h}_i\cdot\mathbf{g}_{i+1} = 0$. This latter condition is easily combined with (10.6.6) to solve for $\lambda$. The result is exactly the expression (10.6.4). But with this value of $\lambda$, (10.6.6) is the same as (10.6.2), q.e.d.
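In detail, the elimination of $\lambda$ runs as follows: substituting (10.6.6) into the line-minimum condition $\mathbf{h}_i\cdot\mathbf{g}_{i+1} = 0$ gives

$$0 = \mathbf{h}_i\cdot\mathbf{g}_{i+1} = \mathbf{h}_i\cdot\left(\mathbf{g}_i - \lambda\,\mathbf{A}\cdot\mathbf{h}_i\right) \quad\Longrightarrow\quad \lambda = \frac{\mathbf{h}_i\cdot\mathbf{g}_i}{\mathbf{h}_i\cdot\mathbf{A}\cdot\mathbf{h}_i} = \frac{\mathbf{g}_i\cdot\mathbf{g}_i}{\mathbf{h}_i\cdot\mathbf{A}\cdot\mathbf{h}_i}$$

which is (10.6.4); the last equality uses $\mathbf{h}_i = \mathbf{g}_i + \gamma_{i-1}\mathbf{h}_{i-1}$ together with $\mathbf{g}_i\cdot\mathbf{h}_{i-1} = 0$ from (10.6.3).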
We have, then, the basis of an algorithm that requires neither knowledge of the Hessian matrix A, nor even the storage necessary to store such a matrix. A sequence of directions $\mathbf{h}_i$ is constructed, using only line minimizations, evaluations of the gradient vector, and an auxiliary vector to store the latest in the sequence of g's.

The algorithm described so far is the original Fletcher-Reeves version of the conjugate gradient algorithm. Later, Polak and Ribiere introduced one tiny, but sometimes significant, change. They proposed using the form

$$\gamma_i = \frac{(\mathbf{g}_{i+1} - \mathbf{g}_i)\cdot\mathbf{g}_{i+1}}{\mathbf{g}_i\cdot\mathbf{g}_i} \tag{10.6.7}$$
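In code this amounts to a one-line change: only the numerator of $\gamma$ is accumulated differently. As a small illustration (the function name and the convention that both arguments are the negative gradients $\mathbf{g}_i$ and $\mathbf{g}_{i+1}$ of the text are assumptions of this example, not anything defined in this section):

/* Polak-Ribiere gamma of equation (10.6.7).  Both arguments follow the
 * text's convention g = -grad f, i.e. gold = g_i and gnew = g_{i+1}.   */
double gamma_polak_ribiere(const double gnew[], const double gold[], int n)
{
    double num = 0.0, den = 0.0;
    for (int j = 0; j < n; j++) {
        num += (gnew[j] - gold[j]) * gnew[j];   /* (g_{i+1} - g_i) . g_{i+1} */
        den += gold[j] * gold[j];               /* g_i . g_i                 */
    }
    return num / den;
}

For an exactly quadratic form this reduces to the Fletcher-Reeves value (10.6.5), since $\mathbf{g}_i\cdot\mathbf{g}_{i+1} = 0$ by the orthogonality conditions (10.6.3); the two choices differ only once $f$ departs from the quadratic model.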
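The pieces above (a line minimization along $\mathbf{h}_i$, one gradient evaluation per step, and the $\gamma$ update) fit together as in the following minimal C sketch. It is only an illustration, not the routine given in this book: the quadratic test function, the crude slope-bisection line search, and the names func, dfunc, dirderiv, and linmin are all assumptions made so the example compiles on its own. The $\gamma$ update is written in the Fletcher-Reeves form (10.6.5), with the Polak-Ribiere alternative (10.6.7) noted in a comment.

#include <stdio.h>
#include <math.h>

#define N 2                                /* dimension of the test problem */

static double func(const double p[N]) {    /* assumed test function x^2 + 10 y^2 */
    return p[0]*p[0] + 10.0*p[1]*p[1];
}

static void dfunc(const double p[N], double grad[N]) {   /* its gradient */
    grad[0] = 2.0*p[0];
    grad[1] = 20.0*p[1];
}

/* Directional derivative h . grad f at the point p + t*h. */
static double dirderiv(const double p[N], const double h[N], double t) {
    double q[N], grad[N], s = 0.0;
    for (int j = 0; j < N; j++) q[j] = p[j] + t*h[j];
    dfunc(q, grad);
    for (int j = 0; j < N; j++) s += h[j]*grad[j];
    return s;
}

/* Crude line minimization: the minimum along h is where h . grad f = 0,
 * so bracket that zero and bisect on the sign of the slope.             */
static void linmin(double p[N], const double h[N]) {
    double a = 0.0, b = 1e-4;
    while (dirderiv(p, h, b) < 0.0) b *= 2.0;     /* expand until slope >= 0 */
    for (int k = 0; k < 60; k++) {
        double m = 0.5*(a + b);
        if (dirderiv(p, h, m) < 0.0) a = m; else b = m;
    }
    double t = 0.5*(a + b);
    for (int j = 0; j < N; j++) p[j] += t*h[j];   /* move to the line minimum */
}

int main(void) {
    double p[N] = {3.0, -1.0};                    /* arbitrary starting point */
    double g[N], h[N], xi[N];

    dfunc(p, xi);                                 /* g_0 = -grad f(P_0), h_0 = g_0 */
    for (int j = 0; j < N; j++) { g[j] = -xi[j]; h[j] = g[j]; }

    for (int its = 1; its <= 200; its++) {
        linmin(p, h);                             /* P_{i+1} = line minimum along h_i */
        dfunc(p, xi);                             /* xi = grad f(P_{i+1}) */
        double gg = 0.0, dgg = 0.0;
        for (int j = 0; j < N; j++) {
            gg  += g[j]*g[j];                     /* g_i . g_i */
            dgg += xi[j]*xi[j];                   /* Fletcher-Reeves (10.6.5); Polak-Ribiere
                                                     (10.6.7) would use (xi[j]+g[j])*xi[j] */
        }
        if (gg == 0.0 || sqrt(dgg) < 1e-10) {     /* gradient (numerically) zero: done */
            printf("converged in %d iterations: f(%g, %g) = %g\n",
                   its, p[0], p[1], func(p));
            return 0;
        }
        double gamma = dgg/gg;
        for (int j = 0; j < N; j++) {
            g[j] = -xi[j];                        /* g_{i+1} = -grad f(P_{i+1}) */
            h[j] = g[j] + gamma*h[j];             /* h_{i+1} = g_{i+1} + gamma_i h_i, (10.6.2) */
        }
    }
    printf("did not converge: p = (%g, %g)\n", p[0], p[1]);
    return 0;
}

Note that, exactly as promised in the text, the sketch stores no matrix at all: per iteration it keeps only the point, the latest negative gradient, and the current direction.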