The scalar form of SVD is expressed in Equation 3,

$$X\hat{v}_i = \sigma_i \hat{u}_i$$

The mathematical intuition behind the construction of the matrix form is that we want to express all $n$ scalar equations in just one equation. It is easiest to understand this process graphically. Drawing the matrices of Equation 3 looks like the following.

[Figure: each scalar equation drawn as the $n \times m$ matrix $X$ times the vector $\hat{v}_i$, equaling a positive scalar $\sigma_i$ times the vector $\hat{u}_i$.]

We can construct three new matrices $V$, $U$ and $\Sigma$. All singular values are first rank-ordered $\sigma_{\tilde{1}} \ge \sigma_{\tilde{2}} \ge \dots \ge \sigma_{\tilde{r}}$, and the corresponding vectors are indexed in the same rank order. Each pair of associated vectors $\hat{v}_i$ and $\hat{u}_i$ is stacked in the $i^{th}$ column of its respective matrix. The corresponding singular value $\sigma_i$ is placed along the diagonal (the $ii^{th}$ position) of $\Sigma$. This generates the equation $XV = U\Sigma$, which looks like the following.

[Figure: the stacked matrix equation $XV = U\Sigma$, with the non-zero diagonal of $\Sigma$ drawn as a checkerboard.]

The matrices $V$ and $U$ are $m \times m$ and $n \times n$ matrices respectively, and $\Sigma$ is an $n \times m$ diagonal matrix with a few non-zero values (represented by the checkerboard) along its diagonal. Solving this single matrix equation solves all $n$ "value" form equations.

FIG. 4 Construction of the matrix form of SVD (Equation 4) from the scalar form (Equation 3).

where $a$ and $b$ are column vectors and $k$ is a scalar constant. The set $\{\hat{v}_1, \hat{v}_2, \dots, \hat{v}_m\}$ is analogous to $a$ and the set $\{\hat{u}_1, \hat{u}_2, \dots, \hat{u}_n\}$ is analogous to $b$. What is unique, though, is that $\{\hat{v}_1, \hat{v}_2, \dots, \hat{v}_m\}$ and $\{\hat{u}_1, \hat{u}_2, \dots, \hat{u}_n\}$ are orthonormal sets of vectors which span an $m$- or $n$-dimensional space, respectively. In particular, loosely speaking, these sets appear to span all possible "inputs" (i.e. $a$) and "outputs" (i.e. $b$). Can we formalize the view that $\{\hat{v}_1, \hat{v}_2, \dots, \hat{v}_m\}$ and $\{\hat{u}_1, \hat{u}_2, \dots, \hat{u}_n\}$ span all possible "inputs" and "outputs"?

We can manipulate Equation 4 to make this fuzzy hypothesis more precise.

$$X = U \Sigma V^T$$
$$U^T X = \Sigma V^T$$
$$U^T X = Z$$

where we have defined $Z \equiv \Sigma V^T$. Note that the previous columns $\{\hat{u}_1, \hat{u}_2, \dots, \hat{u}_n\}$ are now rows in $U^T$. Comparing this equation to Equation 1, $\{\hat{u}_1, \hat{u}_2, \dots, \hat{u}_n\}$ perform the same role as $\{\hat{p}_1, \hat{p}_2, \dots, \hat{p}_m\}$. Hence, $U^T$ is a change of basis from $X$ to $Z$. Just as before, we are transforming column vectors. The fact that the orthonormal basis $U^T$ (or $P$) transforms column vectors means that $U^T$ is a basis that spans the columns of $X$. Bases that span the columns are termed the column space of $X$. The column space formalizes the notion of what are the possible "outputs" of any matrix.
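To make this construction concrete, here is a minimal numerical sketch (an illustration, not part of the original tutorial) using NumPy. It builds $\Sigma$ from the rank-ordered singular values returned by np.linalg.svd and checks that the single matrix equation $XV = U\Sigma$ reproduces every scalar equation $X\hat{v}_i = \sigma_i\hat{u}_i$, and that $Z = U^T X = \Sigma V^T$ as derived above. Shapes follow the text's convention: $X$ is $n \times m$.

```python
import numpy as np

# A random n x m matrix stands in for X (shapes follow the text).
rng = np.random.default_rng(0)
n, m = 5, 3
X = rng.standard_normal((n, m))

# Full SVD: U is n x n, Vt holds the rows of V^T (m x m), and s holds
# the rank-ordered singular values sigma_1 >= sigma_2 >= ...
U, s, Vt = np.linalg.svd(X, full_matrices=True)
V = Vt.T

# Build the n x m "diagonal" matrix Sigma (the checkerboard in FIG. 4).
Sigma = np.zeros((n, m))
np.fill_diagonal(Sigma, s)

# Scalar form (Equation 3): X v_i = sigma_i u_i for each associated pair.
for i in range(len(s)):
    assert np.allclose(X @ V[:, i], s[i] * U[:, i])

# Matrix form (Equation 4): one equation captures all the scalar ones.
assert np.allclose(X @ V, U @ Sigma)

# Change of basis: Z = U^T X = Sigma V^T, so U^T re-expresses the
# columns of X in the rank-ordered basis spanning its column space.
assert np.allclose(U.T @ X, Sigma @ Vt)

print("Equations 3 and 4 and the change of basis Z = U^T X all check out.")
```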
There is a funny symmetry to SVD such that we can define a similar quantity - the row space. Transposing the matrix form gives

$$XV = U \Sigma$$
$$(XV)^T = (U\Sigma)^T$$
$$V^T X^T = \Sigma^T U^T$$
$$V^T X^T = Z$$

where we have defined $Z \equiv \Sigma^T U^T$. Again the rows of $V^T$ (or the columns of $V$) are an orthonormal basis for transforming $X^T$ into $Z$. Because of the transpose on $X$, it follows that $V$ is an orthonormal basis spanning the row space of $X$. The row space likewise formalizes the notion of what are possible "inputs" into an arbitrary matrix.
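A companion sketch (again an illustration under the same assumptions, not from the tutorial) checks the row-space identity $V^T X^T = \Sigma^T U^T$, and verifies that projecting the rows of $X$ onto the span of the first $r$ columns of $V$ leaves $X$ unchanged, confirming that $V$ spans the row space.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 6
X = rng.standard_normal((n, m))   # X is n x m, as in the text

U, s, Vt = np.linalg.svd(X, full_matrices=True)
Sigma = np.zeros((n, m))
np.fill_diagonal(Sigma, s)

# Row-space identity derived above: V^T X^T = Sigma^T U^T = Z.
assert np.allclose(Vt @ X.T, Sigma.T @ U.T)

# The first r columns of V (r = number of non-zero singular values)
# form an orthonormal basis for the row space: projecting the rows
# of X onto their span changes nothing.
r = int(np.sum(s > 1e-12))
Vr = Vt[:r].T                     # m x r
assert np.allclose(X @ Vr @ Vr.T, X)

print("Row space verified: V^T X^T = Sigma^T U^T and rows of X lie in span(V_r).")
```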
We are only scratching the surface for understanding the full implications of SVD. For the purposes of this tutorial, though, we have enough information to understand how PCA will fall within this framework.

C. SVD and PCA

It is evident that PCA and SVD are intimately related. Let us return to the original $m \times n$ data matrix $X$. We can define a