2.6 Singular Value Decomposition

There exists a very powerful set of techniques for dealing with sets of equations or matrices that are either singular or else numerically very close to singular. In many cases where Gaussian elimination and LU decomposition fail to give satisfactory results, this set of techniques, known as singular value decomposition, or SVD, will diagnose for you precisely what the problem is. In some cases, SVD will not only diagnose the problem, it will also solve it, in the sense of giving you a useful numerical answer, although, as we shall see, not necessarily “the” answer that you thought you should get.

SVD is also the method of choice for solving most linear least-squares problems. We will outline the relevant theory in this section, but defer detailed discussion of the use of SVD in this application to Chapter 15, whose subject is the parametric modeling of data.

SVD methods are based on the following theorem of linear algebra, whose proof is beyond our scope: Any M × N matrix A whose number of rows M is greater than or equal to its number of columns N, can be written as the product of an M × N column-orthogonal matrix U, an N × N diagonal matrix W with positive or zero elements (the singular values), and the transpose of an N × N orthogonal matrix V. The various shapes of these matrices will be made clearer by the following tableau:

       A      =      U      ·  diag(w1, w2, ..., wN)  ·    VT            (2.6.1)
    (M × N)       (M × N)           (N × N)             (N × N)

The matrices U and V are each orthogonal in the sense that their columns are orthonormal,

    Σ_{i=1..M} Uik Uin = δkn        1 ≤ k ≤ N,  1 ≤ n ≤ N        (2.6.2)

    Σ_{j=1..N} Vjk Vjn = δkn        1 ≤ k ≤ N,  1 ≤ n ≤ N        (2.6.3)
or as a tableau,

    UT · U = VT · V = 1        (2.6.4)

Since V is square, it is also row-orthonormal, V · VT = 1.

The SVD decomposition can also be carried out when M < N. In this case the singular values wj for j = M + 1, ..., N are all zero, and the corresponding columns of U are also zero. Equation (2.6.2) then holds only for k, n ≤ M.

The decomposition (2.6.1) can always be done, no matter how singular the matrix is, and it is “almost” unique. That is to say, it is unique up to (i) making the same permutation of the columns of U, elements of W, and columns of V (or rows of VT), or (ii) forming linear combinations of any columns of U and V whose corresponding elements of W happen to be exactly equal. An important consequence of the permutation freedom is that for the case M < N, a numerical algorithm for the decomposition need not return zero wj's for j = M + 1, ..., N; the N − M zero singular values can be scattered among all positions j = 1, 2, ..., N.

At the end of this section, we give a routine, svdcmp, that performs SVD on an arbitrary matrix A, replacing it by U (they are the same shape) and giving back W and V separately. The routine svdcmp is based on a routine by Forsythe et al. [1], which is in turn based on the original routine of Golub and Reinsch, found, in various forms, in [2-4] and elsewhere. These references include extensive discussion of the algorithm used. As much as we dislike the use of black-box routines, we are going to ask you to accept this one, since it would take us too far afield to cover its necessary background material here. Suffice it to say that the algorithm is very stable, and that it is very unusual for it ever to misbehave. Most of the concepts that enter the algorithm (Householder reduction to bidiagonal form, diagonalization by QR procedure with shifts) will be discussed further in Chapter 11.

If you are as suspicious of black boxes as we are, you will want to verify yourself that svdcmp does what we say it does. That is very easy to do: Generate an arbitrary matrix A, call the routine, and then verify by matrix multiplication that (2.6.1) and (2.6.4) are satisfied. Since these two equations are the only defining requirements for SVD, this procedure is (for the chosen A) a complete end-to-end check.

Now let us find out what SVD is good for.
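In code, that end-to-end check might look something like the sketch below. It is only an illustration: the test dimensions, the random test matrix, and the tolerance are arbitrary; the NR-style allocators matrix(), vector(), free_matrix(), and free_vector() from nrutil.h are assumed; and svdcmp is prototyped by hand since it is listed only later in this section.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include "nrutil.h"

void svdcmp(float **a, int m, int n, float w[], float **v);  /* listed later in this section */

#define MM 5          /* arbitrary test dimensions, MM >= NN */
#define NN 3
#define TOL 1.0e-5    /* illustrative single-precision tolerance */

int main(void)
{
    float **a,**u,**v,*w,sum;
    int i,j,k;

    a=matrix(1,MM,1,NN); u=matrix(1,MM,1,NN);
    v=matrix(1,NN,1,NN); w=vector(1,NN);
    for (i=1;i<=MM;i++)                      /* fill an arbitrary matrix, keeping a copy in a */
        for (j=1;j<=NN;j++)
            u[i][j]=a[i][j]=rand()/(float)RAND_MAX;
    svdcmp(u,MM,NN,w,v);                     /* u is replaced by U */
    for (i=1;i<=MM;i++)                      /* check (2.6.1): does U.W.VT reproduce a? */
        for (j=1;j<=NN;j++) {
            for (sum=0.0,k=1;k<=NN;k++) sum += u[i][k]*w[k]*v[j][k];
            if (fabs(sum-a[i][j]) > TOL)
                printf("reconstruction error at (%d,%d): %g\n",i,j,sum-a[i][j]);
        }
    free_matrix(v,1,NN,1,NN); free_matrix(u,1,MM,1,NN);
    free_matrix(a,1,MM,1,NN); free_vector(w,1,NN);
    return 0;
}

The same double loop, accumulating u[i][k]*u[i][j] over i (and likewise v[i][k]*v[i][j]) instead of the reconstruction sum, checks the orthonormality relations (2.6.4) in the same way.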
SVD of a Square Matrix

If the matrix A is square, N × N say, then U, V, and W are all square matrices of the same size. Their inverses are also trivial to compute: U and V are orthogonal, so their inverses are equal to their transposes; W is diagonal, so its inverse is the diagonal matrix whose elements are the reciprocals of the elements wj. From (2.6.1) it now follows immediately that the inverse of A is

    A−1 = V · [diag(1/wj)] · UT        (2.6.5)

The only thing that can go wrong with this construction is for one of the wj's to be zero, or (numerically) for it to be so small that its value is dominated by roundoff error and therefore unknowable. If more than one of the wj's have this problem, then the matrix is even more singular. So, first of all, SVD gives you a clear diagnosis of the situation.

Formally, the condition number of a matrix is defined as the ratio of the largest (in magnitude) of the wj's to the smallest of the wj's. A matrix is singular if its condition number is infinite, and it is ill-conditioned if its condition number is too large, that is, if its reciprocal approaches the machine's floating-point precision (for example, less than 10−6 for single precision or 10−12 for double).

For singular matrices, the concepts of nullspace and range are important. Consider the familiar set of simultaneous equations

    A · x = b        (2.6.6)

where A is a square matrix, b and x are vectors. Equation (2.6.6) defines A as a linear mapping from the vector space x to the vector space b. If A is singular, then there is some subspace of x, called the nullspace, that is mapped to zero, A · x = 0. The dimension of the nullspace (the number of linearly independent vectors x that can be found in it) is called the nullity of A.

Now, there is also some subspace of b that can be “reached” by A, in the sense that there exists some x which is mapped there. This subspace of b is called the range of A. The dimension of the range is called the rank of A. If A is nonsingular, then its range will be all of the vector space b, so its rank is N. If A is singular, then the rank will be less than N. In fact, the relevant theorem is “rank plus nullity equals N.”

What has this to do with SVD? SVD explicitly constructs orthonormal bases for the nullspace and range of a matrix. Specifically, the columns of U whose same-numbered elements wj are nonzero are an orthonormal set of basis vectors that span the range; the columns of V whose same-numbered elements wj are zero are an orthonormal basis for the nullspace.

Now let's have another look at solving the set of simultaneous linear equations (2.6.6) in the case that A is singular. First, the set of homogeneous equations, where b = 0, is solved immediately by SVD: Any column of V whose corresponding wj is zero yields a solution.
When the vector b on the right-hand side is not zero, the important question is whether it lies in the range of A or not. If it does, then the singular set of equations does have a solution x; in fact it has more than one solution, since any vector in the nullspace (any column of V with a corresponding zero wj ) can be added to x in any linear combination
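To make the nullspace and range statements above concrete, here is a sketch of a small helper that splits svdcmp's output for a square matrix into the two bases. The routine name svdbases, the caller-supplied arrays rbasis and nbasis, and the idea of passing an explicit threshold thresh are illustrative choices of this example, not part of the svdcmp listing given later.

/* Sketch: after svdcmp has decomposed a square n by n matrix (u, w, v in the NR
   1-based storage convention), copy the columns of U with w[j] > thresh into
   rbasis (an orthonormal basis for the range) and the columns of V with
   w[j] <= thresh into nbasis (an orthonormal basis for the nullspace).
   Returns the numerical rank; the nullity is n minus that. */
int svdbases(float **u, float w[], float **v, int n, float thresh,
    float **rbasis, float **nbasis)
{
    int i,j,nrank=0,nnull=0;

    for (j=1;j<=n;j++) {
        if (w[j] > thresh) {                 /* column j of U spans part of the range */
            ++nrank;
            for (i=1;i<=n;i++) rbasis[i][nrank]=u[i][j];
        } else {                             /* column j of V lies in the nullspace */
            ++nnull;
            for (i=1;i<=n;i++) nbasis[i][nnull]=v[i][j];
        }
    }
    return nrank;
}

A reasonable call, echoing the threshold used in the calling fragment shown later in this section, would pass thresh = 1.0e-6*wmax, where wmax is the largest of the w[j]; rank plus nullity then comes out to n, as the theorem quoted above requires.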
If we want to single out one particular member of this solution-set of vectors as a representative, we might want to pick the one with the smallest length |x|². Here is how to find that vector using SVD: Simply replace 1/wj by zero if wj = 0. (It is not very often that one gets to set ∞ = 0!) Then compute (working from right to left)

    x = V · [diag(1/wj)] · (UT · b)        (2.6.7)

This will be the solution vector of smallest length; the columns of V that are in the nullspace complete the specification of the solution set.

Proof: Consider |x + x′|, where x′ lies in the nullspace. Then, if W−1 denotes the modified inverse of W with some elements zeroed,

    |x + x′| = |V · W−1 · UT · b + x′|
             = |V · (W−1 · UT · b + VT · x′)|
             = |W−1 · UT · b + VT · x′|        (2.6.8)

Here the first equality follows from (2.6.7), the second and third from the orthonormality of V. If you now examine the two terms that make up the sum on the right-hand side, you will see that the first one has nonzero j components only where wj ≠ 0, while the second one, since x′ is in the nullspace, has nonzero j components only where wj = 0. Therefore the minimum length obtains for x′ = 0, q.e.d.

If b is not in the range of the singular matrix A, then the set of equations (2.6.6) has no solution. But here is some good news: If b is not in the range of A, then equation (2.6.7) can still be used to construct a “solution” vector x. This vector x will not exactly solve A · x = b. But, among all possible vectors x, it will do the closest possible job in the least squares sense. In other words, (2.6.7) finds the x which minimizes

    r ≡ |A · x − b|        (2.6.9)

The number r is called the residual of the solution.

The proof is similar to (2.6.8): Suppose we modify x by adding some arbitrary x′. Then A · x − b is modified by adding some b′ ≡ A · x′. Obviously b′ is in the range of A. We then have

    |A · x − b + b′| = |(U · W · VT) · (V · W−1 · UT · b) − b + b′|
                     = |(U · W · W−1 · UT − 1) · b + b′|
                     = |U · ((W · W−1 − 1) · UT · b + UT · b′)|
                     = |(W · W−1 − 1) · UT · b + UT · b′|        (2.6.10)

Now, (W · W−1 − 1) is a diagonal matrix which has nonzero j components only for wj = 0, while UT · b′ has nonzero j components only for wj ≠ 0, since b′ lies in the range of A. Therefore the minimum obtains for b′ = 0, q.e.d.

Figure 2.6.1 summarizes our discussion of SVD thus far.
Figure 2.6.1. (a) A nonsingular matrix A maps a vector space into one of the same dimension. The vector x is mapped into b, so that x satisfies the equation A · x = b. (b) A singular matrix A maps a vector space into one of lower dimensionality, here a plane into a line, called the “range” of A. The “nullspace” of A is mapped to zero. The solutions of A · x = d consist of any one particular solution plus any vector in the nullspace, here forming a line parallel to the nullspace. Singular value decomposition (SVD) selects the particular solution closest to zero, as shown. The point c lies outside of the range of A, so A · x = c has no solution. SVD finds the least-squares best compromise solution, namely a solution of A · x = c′, as shown.

In the discussion since equation (2.6.6), we have been pretending that a matrix either is singular or else isn't. That is of course true analytically. Numerically, however, the far more common situation is that some of the wj's are very small but nonzero, so that the matrix is ill-conditioned. In that case, the direct solution methods of LU decomposition or Gaussian elimination may actually give a formal solution to the set of equations (that is, a zero pivot may not be encountered); but the solution vector may have wildly large components whose algebraic cancellation, when multiplying by the matrix A, may give a very poor approximation to the right-hand vector b. In such cases, the solution vector x obtained by zeroing the
small wj's and then using equation (2.6.7) is very often better (in the sense of the residual |A · x − b| being smaller) than both the direct-method solution and the SVD solution where the small wj's are left nonzero.

It may seem paradoxical that this can be so, since zeroing a singular value corresponds to throwing away one linear combination of the set of equations that we are trying to solve. The resolution of the paradox is that we are throwing away precisely a combination of equations that is so corrupted by roundoff error as to be at best useless; usually it is worse than useless since it “pulls” the solution vector way off towards infinity along some direction that is almost a nullspace vector. In doing this, it compounds the roundoff problem and makes the residual |A · x − b| larger.

SVD cannot be applied blindly, then. You have to exercise some discretion in deciding at what threshold to zero the small wj's, and/or you have to have some idea what size of computed residual |A · x − b| is acceptable.

As an example, here is a “backsubstitution” routine svbksb for evaluating equation (2.6.7) and obtaining a solution vector x from a right-hand side b, given that the SVD of a matrix A has already been calculated by a call to svdcmp. Note that this routine presumes that you have already zeroed the small wj's. It does not do this for you. If you haven't zeroed the small wj's, then this routine is just as ill-conditioned as any direct method, and you are misusing SVD.

#include "nrutil.h"

void svbksb(float **u, float w[], float **v, int m, int n, float b[], float x[])
Solves A·X = B for a vector X, where A is specified by the arrays u[1..m][1..n], w[1..n],
v[1..n][1..n] as returned by svdcmp. m and n are the dimensions of a, and will be equal for
square matrices. b[1..m] is the input right-hand side. x[1..n] is the output solution vector.
No input quantities are destroyed, so the routine may be called sequentially with different b's.
{
    int jj,j,i;
    float s,*tmp;

    tmp=vector(1,n);
    for (j=1;j<=n;j++) {                    Calculate UTB.
        s=0.0;
        if (w[j]) {                         Nonzero result only if wj is nonzero.
            for (i=1;i<=m;i++) s += u[i][j]*b[i];
            s /= w[j];                      This is the divide by wj.
        }
        tmp[j]=s;
    }
    for (j=1;j<=n;j++) {                    Matrix multiply by V to get answer.
        s=0.0;
        for (jj=1;jj<=n;jj++) s += v[j][jj]*tmp[jj];
        x[j]=s;
    }
    free_vector(tmp,1,n);
}

Note that a typical use of svdcmp and svbksb superficially resembles the typical use of ludcmp and lubksb: In both cases, you decompose the left-hand matrix A just once, and then can use the decomposition either once or many times with different right-hand sides. The crucial difference is the “editing” of the singular values before svbksb is called:
#define N ...
float wmax,wmin,**a,**u,*w,**v,*b,*x;
int i,j;
...
for (i=1;i<=N;i++)                          Copy a into u if you don't want it to be
    for (j=1;j<=N;j++)                      destroyed.
        u[i][j]=a[i][j];
svdcmp(u,N,N,w,v);                          SVD the square matrix a.
wmax=0.0;                                   Will be the maximum singular value obtained.
for (j=1;j<=N;j++) if (w[j] > wmax) wmax=w[j];
This is where we set the threshold for singular values allowed to be nonzero. The constant
is typical, but not universal. You have to experiment with your own application.
wmin=wmax*1.0e-6;
for (j=1;j<=N;j++) if (w[j] < wmin) w[j]=0.0;
svbksb(u,w,v,N,N,b,x);                      Now we can backsubstitute.

SVD for Fewer Equations than Unknowns

If you have fewer linear equations M than unknowns N, then you are not expecting a unique solution. Usually there will be an N − M dimensional family of solutions. If you want to find this whole solution space, then SVD can readily do the job.

The SVD decomposition will yield N − M zero or negligible wj's, since M < N. There may be additional zero wj's from any degeneracies in your M equations. Be sure that you find this many small wj's, and zero them before calling svbksb, which will give you the particular solution vector x. As before, the columns of V corresponding to zeroed wj's are the basis vectors whose linear combinations, added to the particular solution, span the solution space.

SVD for More Equations than Unknowns

This situation will occur in Chapter 15, when we wish to find the least-squares solution to an overdetermined set of linear equations. In tableau, the equations to be solved are

    A · x = b        (2.6.11)

with A now having more rows than columns, M > N. The proofs that we gave above for the square case apply without modification to the case of more equations than unknowns. The least-squares solution vector x is
given by (2.6.7), which, with nonsquare matrices, looks like this,

    x = V · [diag(1/wj)] · (UT · b)        (2.6.12)

In general, the matrix W will not be singular, and no wj's will need to be set to zero. Occasionally, however, there might be column degeneracies in A. In this case you will need to zero some small wj values after all. The corresponding column in V gives the linear combination of x's that is then ill-determined even by the supposedly overdetermined set.

Sometimes, although you do not need to zero any wj's for computational reasons, you may nevertheless want to take note of any that are unusually small: Their corresponding columns in V are linear combinations of x's which are insensitive to your data. In fact, you may then wish to zero these wj's, to reduce the number of free parameters in the fit. These matters are discussed more fully in Chapter 15.

Constructing an Orthonormal Basis

Suppose that you have N vectors in an M-dimensional vector space, with N ≤ M. Then the N vectors span some subspace of the full vector space. Often you want to construct an orthonormal set of N vectors that span the same subspace. The textbook way to do this is by Gram-Schmidt orthogonalization, starting with one vector and then expanding the subspace one dimension at a time. Numerically, however, because of the build-up of roundoff errors, naive Gram-Schmidt orthogonalization is terrible.

The right way to construct an orthonormal basis for a subspace is by SVD: Form an M × N matrix A whose N columns are your vectors. Run the matrix through svdcmp. The columns of the matrix U (which in fact replaces A on output from svdcmp) are your desired orthonormal basis vectors.

You might also want to check the output wj's for zero values. If any occur, then the spanned subspace was not, in fact, N dimensional; the columns of U corresponding to zero wj's should be discarded from the orthonormal basis set.

(QR factorization, discussed in §2.10, also constructs an orthonormal basis, see [5].)

Approximation of Matrices

Note that equation (2.6.1) can be rewritten to express any matrix Aij as a sum of outer products of columns of U and rows of VT, with the “weighting factors” being the singular values wj,

    Aij = Σ_{k=1..N} wk Uik Vjk        (2.6.13)
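When many of the wj are negligible, (2.6.13) says that A is well approximated by just a few of its outer-product terms; the next paragraphs take this up. As an illustrative sketch (the routine name, and the assumption that the significant singular values have been arranged into the first kmax positions of w, with the matching columns of u and v, are ours, not svdcmp's), here is how such a truncated expansion can be applied to a vector:

/* Sketch: form y = A.x using only the first kmax terms of the sum (2.6.13).
   u[1..m][1..n], w[1..n], v[1..n][1..n] are in the NR storage convention. */
void svapprx(float **u, float w[], float **v, int m, int n, int kmax,
    float x[], float y[])
{
    int i,j,k;
    float s;

    for (i=1;i<=m;i++) y[i]=0.0;
    for (k=1;k<=kmax;k++) {
        for (s=0.0,j=1;j<=n;j++) s += v[j][k]*x[j];     /* dot x with column k of V */
        s *= w[k];                                      /* weight by the singular value */
        for (i=1;i<=m;i++) y[i] += s*u[i][k];           /* accumulate column k of U */
    }
}

The operation count, about kmax(M + N) multiplications, is the point of the discussion that follows.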
If you ever encounter a situation where most of the singular values wj of a matrix A are very small, then A will be well-approximated by only a few terms in the sum (2.6.13). This means that you have to store only a few columns of U and V (the same k ones) and you will be able to recover, with good accuracy, the whole matrix.

Note also that it is very efficient to multiply such an approximated matrix by a vector x: You just dot x with each of the stored columns of V, multiply the resulting scalar by the corresponding wk, and accumulate that multiple of the corresponding column of U. If your matrix is approximated by a small number K of singular values, then this computation of A · x takes only about K(M + N) multiplications, instead of MN for the full matrix.

SVD Algorithm

Here is the algorithm for constructing the singular value decomposition of any matrix. See §11.2–§11.3, and also [4-5], for discussion relating to the underlying method.

#include <math.h>
#include "nrutil.h"

void svdcmp(float **a, int m, int n, float w[], float **v)
Given a matrix a[1..m][1..n], this routine computes its singular value decomposition, A =
U·W·VT. The matrix U replaces a on output. The diagonal matrix of singular values W is out-
put as a vector w[1..n]. The matrix V (not the transpose VT) is output as v[1..n][1..n].
{
    float pythag(float a, float b);
    int flag,i,its,j,jj,k,l,nm;
    float anorm,c,f,g,h,s,scale,x,y,z,*rv1;

    rv1=vector(1,n);
    g=scale=anorm=0.0;                      Householder reduction to bidiagonal form.
    for (i=1;i<=n;i++) {
        l=i+1;
        rv1[i]=scale*g;
        g=s=scale=0.0;
        if (i <= m) {
            for (k=i;k<=m;k++) scale += fabs(a[k][i]);
            if (scale) {
                for (k=i;k<=m;k++) {
                    a[k][i] /= scale;
                    s += a[k][i]*a[k][i];
                }
                f=a[i][i];
                g = -SIGN(sqrt(s),f);
                h=f*g-s;
                a[i][i]=f-g;
                for (j=l;j<=n;j++) {
                    for (s=0.0,k=i;k<=m;k++) s += a[k][i]*a[k][j];
                    f=s/h;
                    for (k=i;k<=m;k++) a[k][j] += f*a[k][i];
                }
                for (k=i;k<=m;k++) a[k][i] *= scale;
            }
        }
        w[i]=scale*g;
        g=s=scale=0.0;
        if (i <= m && i != n) {
            for (k=l;k<=n;k++) scale += fabs(a[i][k]);
            if (scale) {
                for (k=l;k<=n;k++) {
                    a[i][k] /= scale;
                    s += a[i][k]*a[i][k];
                }
                f=a[i][l];
                g = -SIGN(sqrt(s),f);
                h=f*g-s;
                a[i][l]=f-g;
                for (k=l;k<=n;k++) rv1[k]=a[i][k]/h;
                for (j=l;j<=m;j++) {
                    for (s=0.0,k=l;k<=n;k++) s += a[j][k]*a[i][k];
                    for (k=l;k<=n;k++) a[j][k] += s*rv1[k];
                }
                for (k=l;k<=n;k++) a[i][k] *= scale;
            }
        }
        anorm=FMAX(anorm,(fabs(w[i])+fabs(rv1[i])));
    }
    for (i=n;i>=1;i--) {                    Accumulation of right-hand transformations.
        if (i < n) {
            if (g) {
                for (j=l;j<=n;j++)          Double division to avoid possible underflow.
                    v[j][i]=(a[i][j]/a[i][l])/g;
                for (j=l;j<=n;j++) {
                    for (s=0.0,k=l;k<=n;k++) s += a[i][k]*v[k][j];
                    for (k=l;k<=n;k++) v[k][j] += s*v[k][i];
                }
            }
            for (j=l;j<=n;j++) v[i][j]=v[j][i]=0.0;
        }
        v[i][i]=1.0;
        g=rv1[i];
        l=i;
    }
    for (i=IMIN(m,n);i>=1;i--) {            Accumulation of left-hand transformations.
        l=i+1;
        g=w[i];
        for (j=l;j<=n;j++) a[i][j]=0.0;
        if (g) {
            g=1.0/g;
            for (j=l;j<=n;j++) {
                for (s=0.0,k=l;k<=m;k++) s += a[k][i]*a[k][j];
                f=(s/a[i][i])*g;
                for (k=i;k<=m;k++) a[k][j] += f*a[k][i];
            }
            for (j=i;j<=m;j++) a[j][i] *= g;
        } else for (j=i;j<=m;j++) a[j][i]=0.0;
        ++a[i][i];
    }
    for (k=n;k>=1;k--) {                    Diagonalization of the bidiagonal form: Loop over
        for (its=1;its<=30;its++) {         singular values, and over allowed iterations.
            flag=1;
            for (l=k;l>=1;l--) {            Test for splitting.
                nm=l-1;                     Note that rv1[1] is always zero.
                if ((float)(fabs(rv1[l])+anorm) == anorm) {
                    flag=0;
                    break;
                }
                if ((float)(fabs(w[nm])+anorm) == anorm) break;
            }
            if (flag) {
                c=0.0;                      Cancellation of rv1[l], if l > 1.
                s=1.0;
                for (i=l;i<=k;i++) {