11.7 Eigenvalues or Eigenvectors by Inverse Iteration     493

Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5). Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copying of machine-readable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America).

11.7 Improving Eigenvalues and/or Finding Eigenvectors by Inverse Iteration

The basic idea behind inverse iteration is quite simple. Let y be the solution of the linear system

    (A − τ1) · y = b                                        (11.7.1)

where b is a random vector and τ is close to some eigenvalue λ of A. Then the solution y will be close to the eigenvector corresponding to λ. The procedure can be iterated: Replace b by y and solve for a new y, which will be even closer to the true eigenvector.

We can see why this works by expanding both y and b as linear combinations of the eigenvectors x_j of A:

    y = Σ_j α_j x_j        b = Σ_j β_j x_j                  (11.7.2)

Then (11.7.1) gives

    Σ_j α_j (λ_j − τ) x_j = Σ_j β_j x_j                     (11.7.3)

so that

    α_j = β_j / (λ_j − τ)                                   (11.7.4)

and

    y = Σ_j [β_j / (λ_j − τ)] x_j                           (11.7.5)

If τ is close to λ_n, say, then provided β_n is not accidentally too small, y will be approximately x_n, up to a normalization.
Moreover, each iteration of this procedure gives another power of λ_j − τ in the denominator of (11.7.5). Thus the convergence is rapid for well-separated eigenvalues.

Suppose at the kth stage of iteration we are solving the equation

    (A − τ_k 1) · y = b_k                                   (11.7.6)
494     Chapter 11. Eigensystems

where b_k and τ_k are our current guesses for some eigenvector and eigenvalue of interest (let's say, x_n and λ_n). Normalize b_k so that b_k · b_k = 1. The exact eigenvector and eigenvalue satisfy

    A · x_n = λ_n x_n                                       (11.7.7)

so

    (A − τ_k 1) · x_n = (λ_n − τ_k) x_n                     (11.7.8)

Since y of (11.7.6) is an improved approximation to x_n, we normalize it and set

    b_{k+1} = y / |y|                                       (11.7.9)

We get an improved estimate of the eigenvalue by substituting our improved guess y for x_n in (11.7.8). By (11.7.6), the left-hand side is b_k, so calling λ_n our new value τ_{k+1}, we find

    τ_{k+1} = τ_k + 1 / (b_k · y)                           (11.7.10)

While the above formulas look simple enough, in practice the implementation can be quite tricky. The first question to be resolved is when to use inverse iteration. Most of the computational load occurs in solving the linear system (11.7.6). Thus a possible strategy is first to reduce the matrix A to a special form that allows easy solution of (11.7.6). Tridiagonal form for symmetric matrices or Hessenberg for nonsymmetric are the obvious choices. Then apply inverse iteration to generate all the eigenvectors. While this is an O(N³) method for symmetric matrices, it is many times less efficient than the QL method given earlier.
In fact, even the best inverse iteration packages are less efficient than the QL method as soon as more than about 25 percent of the eigenvectors are required. Accordingly, inverse iteration is generally used when one already has good eigenvalues and wants only a few selected eigenvectors.

You can write a simple inverse iteration routine yourself using LU decomposition to solve (11.7.6). You can decide whether to use the general LU algorithm we gave in Chapter 2 or whether to take advantage of tridiagonal or Hessenberg form. Note that, since the linear system (11.7.6) is nearly singular, you must be careful to use a version of LU decomposition like that in §2.3 which replaces a zero pivot with a very small number.

We have chosen not to give a general inverse iteration routine in this book, because it is quite cumbersome to take account of all the cases that can arise. Routines are given, for example, in [1,2]. If you use these, or write your own routine, you may appreciate the following pointers.

One starts by supplying an initial value τ_0 for the eigenvalue λ_n of interest. Choose a random normalized vector b_0 as the initial guess for the eigenvector x_n, and solve (11.7.6). The new vector y is bigger than b_0 by a "growth factor" |y|, which ideally should be large. Equivalently, the change in the eigenvalue, which by (11.7.10) is essentially 1/|y|, should be small. The following cases can arise:

• If the growth factor is too small initially, then we assume we have made a "bad" choice of random vector. This can happen not just because of a small β_n in (11.7.5), but also in the case of a defective matrix, when (11.7.5) does not even apply (see, e.g., [1] or [3] for details). We go back to the beginning and choose a new initial vector.
11.7 Eigenvalues or Eigenvectors by Inverse Iteration     495

• The change |b_1 − b_0| might be less than some tolerance ε. We can use this as a criterion for stopping, iterating until it is satisfied, with a maximum of 5-10 iterations, say.

• After a few iterations, if |b_{k+1} − b_k| is not decreasing rapidly enough, we can try updating the eigenvalue according to (11.7.10). If τ_{k+1} = τ_k to machine accuracy, we are not going to improve the eigenvector much more and can quit. Otherwise start another cycle of iterations with the new eigenvalue.

The reason we do not update the eigenvalue at every step is that when we solve the linear system (11.7.6) by LU decomposition, we can save the decomposition if τ_k is fixed. We only need do the backsubstitution step each time we update b_k. The number of iterations we decide to do with a fixed τ_k is a trade-off between the quadratic convergence but O(N³) workload for updating τ_k at each step and the linear convergence but O(N²) load for keeping τ_k fixed. If you have determined the eigenvalue by one of the routines given earlier in the chapter, it is probably correct to machine accuracy anyway, and you can omit updating it.

There are two different pathologies that can arise during inverse iteration. The first is multiple or closely spaced roots. This is more often a problem with symmetric matrices.
Inverse iteration will find only one eigenvector for a given initial guess τ_0. A good strategy is to perturb the last few significant digits in τ_0 and then repeat the iteration. Usually this provides an independent eigenvector. Special steps generally have to be taken to ensure orthogonality of the linearly independent eigenvectors, whereas the Jacobi and QL algorithms automatically yield orthogonal eigenvectors even in the case of multiple eigenvalues.

The second problem, peculiar to nonsymmetric matrices, is the defective case. Unless one makes a "good" initial guess, the growth factor is small. Moreover, iteration does not improve matters. In this case, the remedy is to choose random initial vectors, solve (11.7.6) once, and quit as soon as any vector gives an acceptably large growth factor. Typically only a few trials are necessary.

One further complication in the nonsymmetric case is that a real matrix can have complex-conjugate pairs of eigenvalues. You will then have to use complex arithmetic to solve (11.7.6) for the complex eigenvectors. For any moderate-sized (or larger) nonsymmetric matrix, our recommendation is to avoid inverse iteration in favor of a QR method that includes the eigenvector computation in complex arithmetic. You will find routines for this in [1,2] and other places.

CITED REFERENCES AND FURTHER READING:
Acton, F.S. 1970, Numerical Methods That Work; 1990, corrected edition (Washington: Mathematical Association of America).
Wilkinson, J.H., and Reinsch, C. 1971, Linear Algebra, vol. II of Handbook for Automatic Computation (New York: Springer-Verlag), p. 418. [1]
Smith, B.T., et al. 1976, Matrix Eigensystem Routines — EISPACK Guide, 2nd ed., vol. 6 of Lecture Notes in Computer Science (New York: Springer-Verlag). [2]
Stoer, J., and Bulirsch, R. 1980, Introduction to Numerical Analysis (New York: Springer-Verlag), p. 356. [3]