51 Matrix Factorizations and Direct Solution of Linear Systems

Christopher Beattie
Virginia Polytechnic Institute and State University

51.1 Perturbations of Linear Systems ................... 51-2
51.2 Triangular Linear Systems .......................... 51-5
51.3 Gauss Elimination and LU Decomposition ....... 51-7
51.4 Symmetric Factorizations ........................... 51-13
51.5 Orthogonalization and the QR Factorization .... 51-16
References ..................................................... 51-20

The need to solve systems of linear equations arises often within diverse disciplines of science, engineering, and finance. The expression “direct solution of linear systems” refers generally to computational strategies that are able to produce solutions to linear systems after a predetermined number of arithmetic operations that depends only on the structure and dimension of the coefficient matrix. The evolution of computers has influenced, and continues to influence, the development of these strategies and has also fostered particular styles of perturbation analysis suited to illuminating their behavior. Some general themes have become dominant as a result; others have been pushed aside. For example, Cramer’s Rule may properly be thought of as a direct solution strategy for solving linear systems; however, as normally manifested, it requires a much larger number of arithmetic operations than Gauss elimination and is generally much more susceptible to the deleterious effects of rounding.

Most current approaches for the direct solution of a linear system, Ax = b, are patterned after Gauss elimination and favor an initial phase that partially decouples the system of equations: zeros are introduced systematically into the coefficient matrix, transforming it into triangular form; the resulting triangular system is easily solved. The entire process can be viewed in this way:

1. Find invertible matrices $\{S_i\}_{i=1}^{\rho}$ such that $S_\rho \cdots S_2 S_1 A = U$ is triangular; then
2. Calculate a modified right-hand side $y = S_\rho \cdots S_2 S_1 b$; and then
3. Determine the solution set to the triangular system $Ux = y$.

The matrices $S_1, S_2, \ldots, S_\rho$ are typically either row permutations of lower triangular matrices (Gauss transformations) or unitary matrices. In either case, inverses are readily available. Evidently, A can be written as A = NU, where $N = (S_\rho \cdots S_2 S_1)^{-1}$. A solution framework may be built around the availability of decompositions such as this:

1. Find a decomposition A = NU such that U is triangular and Ny = b is easily solved;
2. Solve Ny = b; then
3. Determine the solution set to the triangular system Ux = y.
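To make this framework concrete, the following minimal Python sketch (with randomly generated data, purely for illustration) realizes steps 1–3 using SciPy's LU routines; in that setting, N is a row permutation of a unit lower triangular matrix, as in the GEPP/PLU factorization developed in Section 51.3.

    import numpy as np
    from scipy.linalg import lu_factor, lu_solve

    # Illustrative data: any nonsingular A and right-hand side b will do.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 5))
    b = rng.standard_normal(5)

    # Step 1: decompose A = NU via LU with partial pivoting, so that
    # N is a row permutation of a unit lower triangular matrix.
    lu, piv = lu_factor(A)

    # Steps 2 and 3: solve Ny = b and then the triangular system Ux = y
    # (both performed inside lu_solve).
    x = lu_solve((lu, piv), b)

    assert np.allclose(A @ x, b)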
51.1 Perturbations of Linear Systems

In the computational environment afforded by current computers, the finite representation of real numbers creates a small but persistent source of errors that may on occasion severely degrade the overall accuracy of a calculation. This effect is of fundamental concern in assessing strategies for solving linear systems. Rounding errors can be introduced into the solution process for linear systems often before any calculations are performed: as soon as data are stored within the computer, they are represented within its internal floating point number system. Further errors introduced in the course of computation may often be viewed in aggregate as an additional contribution to this initial representation error. Inevitably, the linear system for which a solution is computed will deviate slightly from the “true” linear system, and it becomes of critical interest to determine whether such deviations will have a significant effect on the accuracy of the final computed result.

Definitions:

Let $A \in \mathbb{C}^{n\times n}$ be a nonsingular matrix and $b \in \mathbb{C}^n$, and denote by $\hat{x} = A^{-1}b$ the unique solution of the linear system Ax = b.

Given data perturbations $\delta A \in \mathbb{C}^{n\times n}$ and $\delta b \in \mathbb{C}^n$ to A and b, respectively, the solution perturbation $\delta x \in \mathbb{C}^n$ satisfies the associated perturbed linear system $(A + \delta A)(\hat{x} + \delta x) = b + \delta b$ (presuming that the perturbed system is consistent).

For any $\tilde{x} \in \mathbb{C}^n$, the residual vector associated with $\tilde{x}$ as an approximate solution to the linear system Ax = b is defined as $r(\tilde{x}) = b - A\tilde{x}$.

For any $\tilde{x} \in \mathbb{C}^n$, the associated (norm-wise) relative backward error of the linear system Ax = b (with respect to the p-norm, for $1 \le p \le \infty$) is
$$\eta_p(A, b; \tilde{x}) = \min\left\{ \varepsilon \;\middle|\; \text{there exist } \delta A,\, \delta b \text{ such that } (A+\delta A)\tilde{x} = b + \delta b \text{ with } \|\delta A\|_p \le \varepsilon \|A\|_p,\ \|\delta b\|_p \le \varepsilon \|b\|_p \right\}.$$

For any $\tilde{x} \in \mathbb{C}^n$, the associated component-wise relative backward error of the linear system Ax = b is
$$\omega(A, b; \tilde{x}) = \min\left\{ \varepsilon \;\middle|\; \text{there exist } \delta A,\, \delta b \text{ such that } (A+\delta A)\tilde{x} = b + \delta b \text{ with } |\delta A| \le \varepsilon |A|,\ |\delta b| \le \varepsilon |b| \right\},$$
where the absolute values and inequalities applied to vectors and matrices are interpreted component-wise: for example, $|B| \le |A|$ means $|b_{ij}| \le |a_{ij}|$ for all index pairs i, j.

The (norm-wise) condition number of the linear system Ax = b (with respect to the p-norm, for $1 \le p \le \infty$) is
$$\kappa_p(A, \hat{x}) = \frac{\|A^{-1}\|_p\, \|b\|_p}{\|\hat{x}\|_p}.$$

The matrix condition number of A (with respect to the p-norm, for $1 \le p \le \infty$) is $\kappa_p(A) = \|A\|_p\,\|A^{-1}\|_p$.

The Skeel condition number of the linear system Ax = b is
$$\mathrm{cond}(A, \hat{x}) = \frac{\big\|\, |A^{-1}|\,|A|\,|\hat{x}| \,\big\|_\infty}{\|\hat{x}\|_\infty}.$$

The Skeel matrix condition number is $\mathrm{cond}(A) = \big\|\, |A^{-1}|\,|A| \,\big\|_\infty$.
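Both backward errors can be evaluated directly from a trial solution $\tilde{x}$. The NumPy sketch below does so using the closed-form expressions established in Facts 2 and 3 of the next subsection; the function names are ours.

    import numpy as np

    def eta_p(A, b, x_tilde, p=2):
        # Norm-wise relative backward error; by the Rigal-Gaches Theorem
        # (Fact 2) it equals ||r||_p / (||A||_p ||x~||_p + ||b||_p).
        r = b - A @ x_tilde
        return np.linalg.norm(r, p) / (
            np.linalg.norm(A, p) * np.linalg.norm(x_tilde, p) + np.linalg.norm(b, p))

    def omega(A, b, x_tilde):
        # Component-wise relative backward error; by the Oettli-Prager Theorem
        # (Fact 3) it equals max_i |r_i| / (|A||x~| + |b|)_i.
        # Assumes (|A||x~| + |b|)_i > 0 for every i.
        r = b - A @ x_tilde
        denom = np.abs(A) @ np.abs(x_tilde) + np.abs(b)
        return float(np.max(np.abs(r) / denom))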
Facts: [Hig02], [SS90]

1. For any $\tilde{x} \in \mathbb{C}^n$, $\tilde{x}$ is the exact solution to any one of the following families of perturbed linear systems
$$(A + \delta A_\theta)\tilde{x} = b + \delta b_\theta, \quad\text{where } \theta \in \mathbb{C},\quad \delta b_\theta = (\theta - 1)\, r(\tilde{x}),\quad \delta A_\theta = \theta\, r(\tilde{x})\,\tilde{y}^*,$$
and $\tilde{y} \in \mathbb{C}^n$ is any vector such that $\tilde{y}^*\tilde{x} = 1$. In particular, for $\theta = 0$, $\delta A = 0$ and $\delta b = -r(\tilde{x})$; for $\theta = 1$, $\delta A = r(\tilde{x})\tilde{y}^*$ and $\delta b = 0$.

2. (Rigal–Gaches Theorem) For any $\tilde{x} \in \mathbb{C}^n$,
$$\eta_p(A, b; \tilde{x}) = \frac{\|r(\tilde{x})\|_p}{\|A\|_p \|\tilde{x}\|_p + \|b\|_p}.$$
If $\tilde{y}$ is the dual vector to $\tilde{x}$ with respect to the p-norm ($\tilde{y}^*\tilde{x} = \|\tilde{y}\|_q \|\tilde{x}\|_p = 1$ with $\frac{1}{p} + \frac{1}{q} = 1$), then $\tilde{x}$ is an exact solution to the perturbed linear system $(A + \delta A_{\tilde\theta})\tilde{x} = b + \delta b_{\tilde\theta}$ with data perturbations as in (1) and $\tilde{\theta} = \frac{\|A\|_p\|\tilde{x}\|_p}{\|A\|_p\|\tilde{x}\|_p + \|b\|_p}$, and as a result
$$\frac{\|\delta A_{\tilde\theta}\|_p}{\|A\|_p} = \frac{\|\delta b_{\tilde\theta}\|_p}{\|b\|_p} = \eta_p(A, b; \tilde{x}).$$

3. (Oettli–Prager Theorem) For any $\tilde{x} \in \mathbb{C}^n$,
$$\omega(A, b; \tilde{x}) = \max_i \frac{|r_i|}{(|A|\,|\tilde{x}| + |b|)_i}.$$
If $D_1 = \mathrm{diag}\!\left(\frac{r_i}{(|A|\,|\tilde{x}| + |b|)_i}\right)$ and $D_2 = \mathrm{diag}(\mathrm{sign}(\tilde{x}_i))$, then $\tilde{x}$ is an exact solution to the perturbed linear system $(A + \delta A)\tilde{x} = b + \delta b$ with $\delta A = D_1 |A| D_2$ and $\delta b = -D_1 |b|$, which satisfy
$$|\delta A| \le \omega(A, b; \tilde{x})\,|A| \quad\text{and}\quad |\delta b| \le \omega(A, b; \tilde{x})\,|b|,$$
and no smaller constant can be used in place of $\omega(A, b; \tilde{x})$.

4. The reciprocal of $\kappa_p(A)$ is the smallest norm-wise relative distance of A to a singular matrix, i.e.,
$$\frac{1}{\kappa_p(A)} = \min\left\{ \frac{\|\delta A\|_p}{\|A\|_p} \;\middle|\; A + \delta A \text{ is singular} \right\}.$$
In particular, the perturbed coefficient matrix $A + \delta A$ is nonsingular if $\frac{\|\delta A\|_p}{\|A\|_p} < \frac{1}{\kappa_p(A)}$.

5. $1 \le \kappa_p(A, \hat{x}) \le \kappa_p(A)$ and $1 \le \mathrm{cond}(A, \hat{x}) \le \mathrm{cond}(A) \le \kappa_\infty(A)$.

6. $\mathrm{cond}(A) = \min\{ \kappa_\infty(DA) \mid D \text{ diagonal} \}$.

7. If $\delta A = 0$, then $\dfrac{\|\delta x\|_p}{\|\hat{x}\|_p} \le \kappa_p(A, \hat{x})\, \dfrac{\|\delta b\|_p}{\|b\|_p}$.

8. If $\delta b = 0$ and $A + \delta A$ is nonsingular, then $\dfrac{\|\delta x\|_p}{\|\hat{x} + \delta x\|_p} \le \kappa_p(A)\, \dfrac{\|\delta A\|_p}{\|A\|_p}$.

9. If $\|\delta A\|_p \le \epsilon \|A\|_p$, $\|\delta b\|_p \le \epsilon \|b\|_p$, and $\epsilon < \frac{1}{\kappa_p(A)}$, then
$$\frac{\|\delta x\|_p}{\|\hat{x}\|_p} \le \frac{2\epsilon\, \kappa_p(A)}{1 - \epsilon\, \kappa_p(A)}.$$
10. If $|\delta A| \le \epsilon |A|$, $|\delta b| \le \epsilon |b|$, and $\epsilon < \frac{1}{\mathrm{cond}(A)}$, then
$$\frac{\|\delta x\|_\infty}{\|\hat{x}\|_\infty} \le \frac{2\epsilon\, \mathrm{cond}(A, \hat{x})}{1 - \epsilon\, \mathrm{cond}(A)}.$$

Examples:

1. Let
$$A = \begin{bmatrix} 1000 & 999 \\ 999 & 998 \end{bmatrix} \quad\text{so}\quad A^{-1} = \begin{bmatrix} -998 & 999 \\ 999 & -1000 \end{bmatrix}.$$
Then $\|A\|_1 = \|A^{-1}\|_1 = 1999$, so that $\kappa_1(A) \approx 3.996 \times 10^6$. Consider $b = \begin{bmatrix} 1999 \\ 1997 \end{bmatrix}$ associated with a solution $\hat{x} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$. A perturbation of the right-hand side $\delta b = \begin{bmatrix} -0.01 \\ 0.01 \end{bmatrix}$ constitutes a relative change in the right-hand side of $\frac{\|\delta b\|_1}{\|b\|_1} \approx 5.005 \times 10^{-6}$, yet it produces a perturbed solution $\hat{x} + \delta x = \begin{bmatrix} 20.97 \\ -18.99 \end{bmatrix}$, constituting a relative change
$$\frac{\|\delta x\|_1}{\|\hat{x}\|_1} = 19.98 \le 20 = \kappa_1(A)\, \frac{\|\delta b\|_1}{\|b\|_1}.$$
The bound determined by the condition number is very nearly achieved. Note that the same perturbed solution $\hat{x} + \delta x$ could be produced by a change in the coefficient matrix
$$\delta A = \tilde{r}\tilde{y}^* = -\begin{bmatrix} -0.01 \\ 0.01 \end{bmatrix} \begin{bmatrix} \tfrac{1}{39.96} & -\tfrac{1}{39.96} \end{bmatrix} = (1/3996) \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix},$$
constituting a relative change $\frac{\|\delta A\|_1}{\|A\|_1} \approx 2.5 \times 10^{-7}$. Then $(A + \delta A)(\hat{x} + \delta x) = b$.

2. Let n = 100 and A be tridiagonal with diagonal entries equal to −2 and all superdiagonal and subdiagonal entries equal to 1 (associated with a centered difference approximation to the second derivative). Let b be a vector with a quadratic variation in entries: $b_k = (k-1)(100-k)/10{,}000$. Then $\kappa_2(A, \hat{x}) \approx 1$, but $\kappa_2(A) \approx 4.1336 \times 10^3$. Since the elements of b do not have an exact binary representation, the linear system that is presented to any computational algorithm will be $Ax = b + \delta b$ with $\|\delta b\|_2 \le \epsilon \|b\|_2$, where $\epsilon$ is the unit roundoff error. For example, if the linear system data is stored in IEEE single precision format, $\epsilon \approx 6 \times 10^{-8}$. The matrix condition number, $\kappa_2(A)$, would yield a bound of $(6 \times 10^{-8})(4.1336 \times 10^3) \approx 2.5 \times 10^{-4}$, anticipating the loss of more than 4 significant digits in solution components even if all computations were done on the stored data with no further error. However, the condition number of the linear system, $\kappa_2(A, \hat{x})$, is substantially smaller and the predicted error for the system is roughly the same as the initial representation error $\approx 6 \times 10^{-8}$, indicating that the solution will be fairly insensitive to the consequences of rounding of the right-hand side data, assuming no further errors occur. But, in fact, this conclusion remains true even if further errors occur, provided whatever computational algorithm is used produces small backward error, as might be asserted if, say, a final residual satisfies $\|r\|_2 \le O(\epsilon)\,\|b\|_2$. This situation changes substantially if the right-hand side is changed to $b_k = (-1)^k (k-1)(100-k)/10{,}000$, which only introduces a sign variation in b. In this case, $\kappa_2(A, \hat{x}) \approx \kappa_2(A)$, and the components of the computed solution can be expected to lose about 4 significant digits purely on
the basis of errors that are made in the initial representation. Additional errors made in the course of the computation can hardly be expected to improve this situation.

51.2 Triangular Linear Systems

Systems of linear equations for which the unknowns may be solved for one at a time in sequence may be reordered to produce linear systems with triangular coefficient matrices. Such systems can be solved both with remarkable accuracy and remarkable efficiency. Triangular systems are the archetype for easily solvable systems of linear equations. As such, they often constitute an intermediate goal in strategies for solving linear systems.

Definitions:

A linear system of equations Tx = b with $T \in \mathbb{C}^{n\times n}$ (representing n equations in n unknowns) is a triangular system if $T = [t_{ij}]$ is either an upper triangular matrix ($t_{ij} = 0$ for $i > j$) or a lower triangular matrix ($t_{ij} = 0$ for $i < j$).

Facts: [Hig02], [GV96]

1. [GV96, pp. 88–90]

Algorithm 1: Row-wise forward substitution for solving a lower triangular system
Input: $L = [\ell_{ij}] \in \mathbb{R}^{n\times n}$ with $\ell_{k,k} \neq 0$ for $k = 1, \ldots, n$; $b \in \mathbb{R}^n$
Output: solution vector $x \in \mathbb{R}^n$ that satisfies $Lx = b$

    x_1 ← b_1 / ℓ_{1,1}
    for k = 2 to n,
        x_k ← (b_k − L_{k,1:k−1} · x_{1:k−1}) / ℓ_{k,k}

2. Algorithm 1 involves as a core calculation dot products of portions of coefficient matrix rows with portions of the emerging solution vector. This can incur a performance penalty for large n from accumulation of dot products using a scalar recurrence. A “column-wise” reformulation may have better performance for large n. Algorithm 2 is such a “column-wise” formulation for upper triangular systems.

Algorithm 2: Column-wise back substitution for solving an upper triangular system
Input: $U = [u_{ij}] \in \mathbb{R}^{n\times n}$ with $u_{k,k} \neq 0$ for $k = 1, \ldots, n$; $b \in \mathbb{R}^n$
Output: solution vector $x \in \mathbb{R}^n$ that satisfies $Ux = b$

    for k = n down to 2 in steps of −1,
        x_k ← b_k / u_{k,k}
        b_{1:k−1} ← b_{1:k−1} − x_k U_{1:k−1,k}
    x_1 ← b_1 / u_{1,1}

3. An efficient and reliable implementation for the solution of triangular systems is offered as part of the standard BLAS software library in xTRSz (see Chapter 92), where x = S, D, C, or Z according to whether data are single or double precision real, or single or double precision complex floating point numbers, respectively, and z = V or M according to whether a single system of equations is to be solved or multiple systems (sharing the same coefficient matrix) are to be solved, respectively.
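For illustration, Algorithms 1 and 2 transcribe directly into NumPy; this is a sketch only, and production codes should instead call the BLAS routines just cited. Applied to the system of Example 1 below, back_substitution returns x = (−1, 3/2, 1/3).

    import numpy as np

    def forward_substitution(L, b):
        # Algorithm 1: row-wise forward substitution for Lx = b.
        n = L.shape[0]
        x = np.empty(n)
        x[0] = b[0] / L[0, 0]
        for k in range(1, n):
            x[k] = (b[k] - L[k, :k] @ x[:k]) / L[k, k]
        return x

    def back_substitution(U, b):
        # Algorithm 2: column-wise back substitution for Ux = b.
        b = np.array(b, dtype=float)   # working copy, updated column by column
        n = U.shape[0]
        x = np.empty(n)
        for k in range(n - 1, 0, -1):
            x[k] = b[k] / U[k, k]
            b[:k] -= x[k] * U[:k, k]
        x[0] = b[0] / U[0, 0]
        return x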
4. The solution of triangular systems using either Algorithm 1 or 2 is component-wise backward stable. In particular, the computed result, $\tilde{x}$, produced either by Algorithm 1 or 2 in solving a triangular system, Tx = b, will be the exact result of a perturbed system $(T + \delta T)\tilde{x} = b$, where
$$|\delta T| \le \frac{n\epsilon}{1 - n\epsilon}\, |T|$$
and $\epsilon$ is the unit roundoff error.

5. The error in the solution of a triangular system, Tx = b, using either Algorithm 1 or 2 satisfies
$$\frac{\|\tilde{x} - \hat{x}\|_\infty}{\|\hat{x}\|_\infty} \le \frac{n\epsilon\, \mathrm{cond}(T, \hat{x})}{1 - n\epsilon\,(\mathrm{cond}(T) + 1)}.$$

6. If $T = [t_{ij}]$ is a lower triangular matrix satisfying $|t_{ii}| \ge |t_{ij}|$ for $j \le i$, the computed solution to the linear system Tx = b produced by either Algorithm 1 or the variant of Algorithm 2 for lower triangular systems satisfies
$$|\hat{x}_i - \tilde{x}_i| \le \frac{2^i\, n\epsilon}{1 - n\epsilon}\, \max_{j \le i} |\tilde{x}_j|,$$
where $\tilde{x}_i$ are the components of the computed solution, $\tilde{x}$, and $\hat{x}_i$ are the components of the exact solution, $\hat{x}$. Although this bound degrades exponentially with i, it shows that early solution components will be computed to high accuracy relative to those components already computed.

Examples:

1. Use Algorithm 2 to solve the triangular system
$$\begin{bmatrix} 1 & 2 & -3 \\ 0 & 2 & -6 \\ 0 & 0 & 3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}.$$
k = 3 step: Solve for $x_3 = 1/3$. Update right-hand side:
$$\begin{bmatrix} 1 & 2 \\ 0 & 2 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} - (1/3)\begin{bmatrix} -3 \\ -6 \\ 3 \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \\ 0 \end{bmatrix}.$$
k = 2 step: Solve for $x_2 = 3/2$. Update right-hand side:
$$\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} x_1 = \begin{bmatrix} 2 \\ 3 \\ 0 \end{bmatrix} - (3/2)\begin{bmatrix} 2 \\ 2 \\ 0 \end{bmatrix} = \begin{bmatrix} -1 \\ 0 \\ 0 \end{bmatrix}.$$
k = 1 step: Solve for $x_1 = -1$.

2. [Hig02, p. 156] For $\varepsilon > 0$, consider
$$T = \begin{bmatrix} 1 & 0 & 0 \\ \varepsilon & \varepsilon & 0 \\ 0 & 1 & 1 \end{bmatrix}. \quad\text{Then}\quad T^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ -1 & \frac{1}{\varepsilon} & 0 \\ 1 & -\frac{1}{\varepsilon} & 1 \end{bmatrix},$$
and so $\mathrm{cond}(T) = 5$, even though $\kappa_\infty(T) = 2\left(2 + \frac{1}{\varepsilon}\right) \approx \frac{2}{\varepsilon} + O(1)$. Thus, linear systems having T as a coefficient matrix will be solved to high relative accuracy, independent of both right-hand side and size of $\varepsilon$, despite the poor conditioning of T (as measured by $\kappa_\infty$) as $\varepsilon$ becomes small. However, note that $\mathrm{cond}(T^T) = 1 + \frac{2}{\varepsilon}$ and $\kappa_\infty(T^T) = (1+\varepsilon)\frac{2}{\varepsilon} \approx \frac{2}{\varepsilon} + O(1)$.
So, linear systems having $T^T$ as a coefficient matrix may have solutions that are sensitive to perturbations; indeed, $\mathrm{cond}(T^T, \hat{x}) \approx \mathrm{cond}(T^T)$ for any right-hand side b with $b_3 \neq 0$, yielding solutions that are sensitive to perturbations for small $\varepsilon$.

51.3 Gauss Elimination and LU Decomposition

Gauss elimination is an elementary approach to solving systems of linear equations, yet it still constitutes the core of the most sophisticated of solution strategies. In the k-th step, a transformation matrix, $M_k$ (a “Gauss transformation”), is designed so as to introduce zeros into A, typically into a portion of the k-th column, without harming zeros that have been introduced in earlier steps. Typically, successive applications of Gauss transformations are interleaved with row interchanges. Remarkably, this reduction process can be viewed as producing a decomposition of the coefficient matrix A = NU, where U is a triangular matrix and N is a row permutation of a lower triangular matrix.

Definitions:

For each index k, a Gauss vector is a vector in $\mathbb{C}^n$ with the leading k entries equal to zero: $\ell_k = [\underbrace{0, \ldots, 0}_{k}, \ell_{k+1}, \ldots, \ell_n]^T$. The entries $\ell_{k+1}, \ldots, \ell_n$ are Gauss multipliers. The related matrix $M_k = I - \ell_k e_k^T$ is called a Gauss transformation.

For the pair of indices (i, j), with $i \le j$, the associated permutation matrix, $\Pi_{i,j}$, is an $n \times n$ identity matrix with the i-th row and j-th row interchanged. Note that $\Pi_{i,i}$ is the identity matrix.

A matrix $U \in \mathbb{C}^{m\times n}$ is in row-echelon form if (1) the first nonzero entry of each row has a strictly smaller column index than all nonzero entries with strictly larger row index and (2) zero rows occur at the bottom. The first nonzero entry in each row is called a pivot. The determining feature of row echelon form is that pivots occur to the left of all nonzero entries in lower rows.

A matrix $A \in \mathbb{C}^{m\times n}$ has an LU decomposition if there exist a unit lower triangular matrix $L \in \mathbb{C}^{m\times m}$ ($L_{i,i} = 1$ and $L_{i,j} = 0$ for $i < j$) and an upper triangular matrix $U \in \mathbb{C}^{m\times n}$ ($U_{i,j} = 0$ for $i > j$) such that A = LU.

Facts: [GV96]

1. Let $a \in \mathbb{C}^n$ be a vector with a nonzero component in the r-th entry, $a_r \neq 0$. Define the Gauss vector $\ell_r = [\underbrace{0, \ldots, 0}_{r}, \frac{a_{r+1}}{a_r}, \ldots, \frac{a_n}{a_r}]^T$. The associated Gauss transformation $M_r = I - \ell_r e_r^T$ introduces zeros into the last $n - r$ entries of a: $M_r a = [a_1, \ldots, a_r, 0, \ldots, 0]^T$.

2. If $A \in \mathbb{C}^{m\times n}$ with rank $A = \rho \ge 1$ has $\rho$ nonsingular leading principal submatrices $A_{1:r,1:r}$, $r = 1, \ldots, \rho$, then there exist Gauss transformations $M_1, M_2, \ldots, M_\rho$ so that $M_\rho M_{\rho-1} \cdots M_1 A = U$ with U upper triangular. Each Gauss transformation $M_r$ introduces zeros into the r-th column.

3. Gauss transformations are unit lower triangular matrices. They are invertible, and for the Gauss transformation $M_r = I - \ell_r e_r^T$,
$$M_r^{-1} = I + \ell_r e_r^T.$$
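Facts 1 and 3 are easy to check numerically. A short NumPy sketch (the vector a is illustrative data; indices are converted from the 1-based statement above):

    import numpy as np

    a = np.array([4.0, -2.0, 6.0, 8.0])   # illustrative data with a_r != 0
    n, r = a.size, 2                      # introduce zeros below entry r (1-based)

    # Gauss vector: leading r entries zero, then the multipliers a_i / a_r.
    ell = np.zeros(n)
    ell[r:] = a[r:] / a[r - 1]

    # Gauss transformation M_r = I - ell e_r^T (Fact 1).
    e_r = np.zeros(n)
    e_r[r - 1] = 1.0
    M = np.eye(n) - np.outer(ell, e_r)

    print(M @ a)        # [ 4. -2.  0.  0.]: the last n - r entries are zeroed

    # Fact 3: inverting M_r simply flips the sign of the Gauss vector.
    assert np.allclose(np.linalg.inv(M), np.eye(n) + np.outer(ell, e_r))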
4. If Gauss vectors $\ell_1, \ell_2, \ldots, \ell_{n-1}$ are given with
$$\ell_1 = \begin{bmatrix} 0 \\ \ell_{21} \\ \ell_{31} \\ \vdots \\ \ell_{n1} \end{bmatrix},\quad \ell_2 = \begin{bmatrix} 0 \\ 0 \\ \ell_{32} \\ \vdots \\ \ell_{n2} \end{bmatrix},\quad \ldots,\quad \ell_{n-1} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ \ell_{n,n-1} \end{bmatrix},$$
then the product of Gauss transformations $M_{n-1} M_{n-2} \cdots M_2 M_1$ is invertible and has an explicit inverse
$$(M_{n-1} M_{n-2} \cdots M_2 M_1)^{-1} = I + \sum_{k=1}^{n-1} \ell_k e_k^T = \begin{bmatrix} 1 & 0 & \cdots & 0 & 0 \\ \ell_{21} & 1 & & & 0 \\ \ell_{31} & \ell_{32} & \ddots & & \vdots \\ \vdots & & \ddots & 1 & 0 \\ \ell_{n1} & \ell_{n2} & \cdots & \ell_{n,n-1} & 1 \end{bmatrix}.$$

5. If $A \in \mathbb{C}^{m\times n}$ with rank $A = \rho$ has $\rho$ nonsingular leading principal submatrices $A_{1:r,1:r}$, $r = 1, \ldots, \rho$, then A has an LU decomposition: A = LU, with L unit lower triangular and U upper triangular. The (i, j) entry of L, $L_{i,j}$ with $i > j$, is the Gauss multiplier that was used to introduce a zero into the corresponding (i, j) entry of A.

6. If $A \in \mathbb{C}^{m\times n}$ with rank $A = m$ (full), then the LU decomposition is unique.

7. Let a be an arbitrary vector in $\mathbb{C}^n$. For any index r, there is an index $\mu \ge r$, a permutation matrix $\Pi_{r,\mu}$, and a Gauss transformation $M_r$ so that
$$M_r \Pi_{r,\mu}\, a = [a_1, \ldots, a_{r-1}, a_\mu, \underbrace{0, \ldots, 0}_{n-r}]^T.$$
The index $\mu$ is chosen so that $a_\mu \neq 0$ out of the set $\{a_r, a_{r+1}, \ldots, a_n\}$. If $a_r \neq 0$, then $\mu = r$ and $\Pi_{r,\mu} = I$ is a possible choice; if each element is zero, $a_r = a_{r+1} = \cdots = a_n = 0$, then $\mu = r$, $\Pi_{r,\mu} = I$, and $M_r = I$ is a possible choice.

8. For every matrix $A \in \mathbb{C}^{m\times n}$ with rank $A = \rho$, there exists a sequence of $\rho$ indices $\mu_1, \mu_2, \ldots, \mu_\rho$ with $i \le \mu_i \le m$ for $i = 1, \ldots, \rho$ and Gauss transformations $M_1, \ldots, M_\rho$ so that
$$M_\rho \Pi_{\rho,\mu_\rho} M_{\rho-1} \Pi_{\rho-1,\mu_{\rho-1}} \cdots M_1 \Pi_{1,\mu_1} A = U$$
with U upper triangular and in row echelon form. Each pair of transformations $M_r \Pi_{r,\mu_r}$ introduces zeros below the r-th pivot.

9. For $r < i < j$, $\Pi_{i,j} M_r = \widetilde{M}_r \Pi_{i,j}$, where $\widetilde{M}_r = I - \tilde{\ell}_r e_r^T$ and $\tilde{\ell}_r = \Pi_{i,j} \ell_r$ (i.e., the i and j entries of $\ell_r$ are interchanged to form $\tilde{\ell}_r$).

10. For every matrix $A \in \mathbb{C}^{m\times n}$ with rank $A = \rho$, there is a row permutation of A that has an LU decomposition: PA = LU, with a permutation matrix P, unit lower triangular matrix L, and an upper triangular matrix U that is in row echelon form. P can be chosen as $P = \Pi_{\rho,\mu_\rho} \Pi_{\rho-1,\mu_{\rho-1}} \cdots \Pi_{1,\mu_1}$ from Fact 8, though in general there can be many other possibilities as well.

11. Reduction of A with Gauss transformations (or, equivalently, calculation of an LU factorization) must generally incorporate row interchanges. As a practical matter, these row interchanges commonly are chosen so as to bring the largest magnitude entry within the column being reduced up into the pivot location. This strategy is called “partial pivoting.” In particular, if zeros are to be introduced into the k-th column below the r-th row (with $r \le k$), then one seeks an index $\mu_r$ such that $r \le \mu_r \le m$ and $|A_{\mu_r,k}| = \max_{r \le i \le m} |A_{i,k}|$. When $\mu_1, \mu_2, \ldots, \mu_\rho$ in Fact 8 are chosen in this way, the reduction process is called “Gaussian Elimination with Partial Pivoting” (GEPP) or, within the context of factorization, the permuted LU factorization (PLU).
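Fact 10 is available in library form: scipy.linalg.lu computes just such a permuted factorization, with the permutation appearing on the other side of the equality, as noted in the comments. A brief sketch with illustrative data:

    import numpy as np
    from scipy.linalg import lu

    A = np.random.default_rng(1).standard_normal((4, 4))

    # scipy.linalg.lu returns A = P L U with P a permutation matrix;
    # equivalently P^T A = L U, matching Fact 10 (with P there equal to P^T here).
    P, L, U = lu(A)

    assert np.allclose(P.T @ A, L @ U)
    assert np.allclose(np.tril(L), L) and np.allclose(np.diag(L), 1.0)
    # Partial pivoting keeps every Gauss multiplier at most 1 in magnitude.
    assert np.all(np.abs(L) <= 1 + 1e-12)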
12. [GV96, p. 115]

Algorithm 3: GEPP/PLU decomposition of a rectangular matrix (outer product)
Input: $A \in \mathbb{R}^{m\times n}$
Output: $L \in \mathbb{R}^{m\times m}$ (unit lower triangular matrix), $U \in \mathbb{R}^{m\times n}$ (upper triangular matrix, row echelon form), and $P \in \mathbb{R}^{m\times m}$ (permutation matrix) so that PA = LU (P is represented with an index vector p such that $y = Pz \Leftrightarrow y_j = z_{p_j}$)

    L ← I_m; U ← 0 ∈ R^{m×n}; p ← [1, 2, 3, ..., m]; and r ← 1
    for k = 1 to n
        Find µ such that r ≤ µ ≤ m and |A_{µ,k}| = max_{r≤i≤m} |A_{i,k}|
        if A_{µ,k} ≠ 0, then
            Exchange A_{µ,k:n} ↔ A_{r,k:n}, L_{µ,1:r−1} ↔ L_{r,1:r−1}, and p_µ ↔ p_r
            L_{r+1:m,r} ← A_{r+1:m,k}/A_{r,k}
            U_{r,k:n} ← A_{r,k:n}
            for i = r + 1 to m
                for j = k + 1 to n
                    A_{i,j} ← A_{i,j} − L_{i,r} U_{r,j}
            r ← r + 1

Algorithm 4: GEPP/PLU decomposition of a rectangular matrix (gaxpy)
Input: $A \in \mathbb{R}^{m\times n}$
Output: $L \in \mathbb{R}^{m\times m}$ (unit lower triangular matrix), $U \in \mathbb{R}^{m\times n}$ (upper triangular matrix, row echelon form), and $P \in \mathbb{R}^{m\times m}$ (permutation matrix) so that PA = LU (P is represented with an index vector π that records row interchanges; π_r = µ means row r and row µ ≥ r were interchanged in step r)

    L ← I_m ∈ R^{m×m}; U ← 0 ∈ R^{m×n}; and r ← 1
    for j = 1 to n
        v ← A_{1:m,j}
        if r > 1, then
            for i = 1 to r − 1, Exchange v_i ↔ v_{π_i}
            Solve the triangular system L_{1:r−1,1:r−1} · z = v_{1:r−1}
            U_{1:r−1,j} ← z
            Update v_{r:m} ← v_{r:m} − L_{r:m,1:r−1} · z
        Find µ such that |v_µ| = max_{r≤i≤m} |v_i|
        if v_µ ≠ 0, then
            π_r ← µ
            Exchange v_µ ↔ v_r
            for i = 1 to r − 1, Exchange L_{µ,i} ↔ L_{r,i}
            L_{r+1:m,r} ← v_{r+1:m}/v_r
            U_{r,j} ← v_r
            r ← r + 1
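The following is a direct NumPy transcription of Algorithm 3, in 0-based indexing; it is a sketch for study, not a substitute for the LAPACK routine cited in Fact 16 below.

    import numpy as np

    def gepp_plu(A, tol=0.0):
        # GEPP/PLU by outer products (Algorithm 3). Returns p, L, U with A[p, :] = L @ U.
        A = np.array(A, dtype=float)          # working copy
        m, n = A.shape
        L, U = np.eye(m), np.zeros((m, n))
        p = np.arange(m)
        r = 0
        for k in range(n):
            if r == m:
                break
            mu = r + int(np.argmax(np.abs(A[r:, k])))      # partial pivoting
            if abs(A[mu, k]) > tol:
                A[[mu, r], k:] = A[[r, mu], k:]            # exchange rows of the working array,
                L[[mu, r], :r] = L[[r, mu], :r]            # of the stored multipliers,
                p[[mu, r]] = p[[r, mu]]                    # and of the permutation record
                L[r + 1:, r] = A[r + 1:, k] / A[r, k]      # Gauss multipliers
                U[r, k:] = A[r, k:]
                A[r + 1:, k + 1:] -= np.outer(L[r + 1:, r], U[r, k + 1:])  # outer-product update
                r += 1
        return p, L, U

    # Example use on a rectangular matrix (illustrative data):
    A = np.random.default_rng(2).standard_normal((5, 3))
    p, L, U = gepp_plu(A)
    assert np.allclose(A[p], L @ U)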
13. The condition for skipping reduction steps (that is, when $A_{\mu,k} = 0$ in Algorithm 3 or when $v_\mu = 0$ in Algorithm 4) indicates deficiency of column rank and the potential for an infinite number of solutions. These conditions are sensitive to rounding errors that may occur in the calculation of those columns and, as such, GEPP/PLU is applied for the most part in full column rank settings (rank A = n), guaranteeing that no zero pivots are encountered and that no reduction steps are skipped.

14. Both Algorithms 3 and 4 require approximately $\frac{2}{3}\rho^3 + \rho m(n - \rho) + \rho n(m - \rho)$ arithmetic operations (with rank $A = \rho$). Algorithm 3 involves as a core calculation the updating of a submatrix having ever diminishing size. For large matrix dimension, the contents of this submatrix, $A_{r+1:m,\,k+1:n}$, may be widely scattered through computer memory and a performance penalty can occur in gathering the data for computation (which can be costly relative to the number of arithmetic operations that are performed with that data). Algorithm 4 is a reorganization that avoids excess data motion by delaying updates to columns until the step within which they have zeros introduced. This forces modifications to the matrix entries to be made just one column at a time, and the necessary data motion can be more efficient.

15. The overhead associated with partial pivoting comes from the lines beginning “Find µ such that . . . ” in Algorithms 3 and 4, involving a net $mn - \frac{n^2}{2}$ floating point comparisons, each of which is comparable in computational effort to a floating point subtraction. Typically this adds negligible overhead relative to the core complexity of $O(\rho^3)$ arithmetic operations. Other strategies for avoiding the adverse effects of small pivots exist. Some are more aggressive than partial pivoting in producing the largest possible pivot (and consequently have higher overhead); others are more restrained (and so are cheaper).

(a) “Complete pivoting” uses both row and column permutations to bring in the largest possible pivot: If zeros are to be introduced into the k-th column in row entries r + 1 to m, then one seeks indices µ and ν with $r \le \mu \le m$ and $k < \nu \le n$ such that
$$|A_{\mu,\nu}| = \max_{\substack{r \le i \le m \\ k < j \le n}} |A_{i,j}|.$$
Gauss elimination with complete pivoting produces a unit lower triangular matrix $L \in \mathbb{R}^{m\times m}$, an upper triangular matrix $U \in \mathbb{R}^{m\times n}$, and two permutation matrices, P and Q, so that PAQ = LU. This strategy can require $O\!\left(\frac{mn^2}{2} - \frac{n^3}{6}\right)$ floating point comparisons, potentially adding now nonnegligible overhead to the core arithmetic requirements. Overhead associated with data motion can become significant as well. The added stability that complete pivoting provides is rarely perceived to be worth the additional overhead.

(b) “Threshold pivoting” identifies pivot candidates in each step that achieve a significant (predetermined) fraction of the magnitude of the pivot that would have been used in that step for partial pivoting: Consider all $\hat{\mu}$ such that $r \le \hat{\mu} \le m$ and $|A_{\hat{\mu},k}| \ge \tau \cdot \max_{r \le i \le m} |A_{i,k}|$, where $\tau \in (0,1)$ is a given threshold. This allows pivots to be chosen on the basis of other criteria, such as influence on sparsity, while still providing some protection from instability. τ can often be chosen quite small (τ = 0.1 or τ = 0.025 are typical values). See Section 53.5.
(c) “Rook pivoting” searches the unreduced portion of the matrix for a pivot by tracing a path alternately along columns and rows (“rook”-like), following largest magnitude entries, until an entry having largest magnitude in both its row and column is discovered (as sketched below). This approach is more aggressive than partial pivoting, yet typically has overhead that is a small multiple of that for partial pivoting. For some matrices the overhead can be comparable to complete pivoting.

16. An efficient and reliable implementation of the GEPP/PLU factorization is offered in the LAPACK software library as xGETRF; solving associated linear systems may be done with xGESV (see Section 93.2).
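The rook search of item (c) in Fact 15 is compact. Below is a NumPy sketch of the search alone, operating on the unreduced block (0-based indexing; the function name is ours). Each accepted move strictly increases the magnitude of the candidate entry, so the search terminates.

    import numpy as np

    def rook_pivot(B):
        # Alternate between the largest-magnitude entry of the current column
        # and of the current row; stop at an entry maximal in both. Returns (i, j).
        i = int(np.argmax(np.abs(B[:, 0])))            # largest entry of the first column
        j = 0
        while True:
            j_new = int(np.argmax(np.abs(B[i, :])))    # largest entry of row i
            if abs(B[i, j_new]) <= abs(B[i, j]):
                return i, j                            # already maximal in its row
            j = j_new
            i_new = int(np.argmax(np.abs(B[:, j])))    # largest entry of column j
            if abs(B[i_new, j]) <= abs(B[i, j]):
                return i, j                            # already maximal in its column
            i = i_new

    B = np.array([[1., 9., 3.],
                  [4., 5., 6.],
                  [7., 8., 2.]])
    print(rook_pivot(B))    # (0, 1): the entry 9 is largest in both its row and its column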