• The principal components of X are the eigenvectors of $C_X = \frac{1}{n}XX^T$.

• The $i$th diagonal value of $C_Y$ is the variance of X along $p_i$.

In practice computing PCA of a data set X entails (1) subtracting off the mean of each measurement type and (2) computing the eigenvectors of $C_X$. This solution is demonstrated in Matlab code included in Appendix B.
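As a concrete illustration of this two-step recipe, the fragment below sketches it in Matlab. It is only a sketch on made-up data with our own variable names; it is not the listing from Appendix B.

% Sketch: PCA via eigendecomposition of the covariance matrix.
% X is an m x n data matrix: m measurement types, n trials (made-up here).
X = randn(3, 200);

[m, n] = size(X);
mn = mean(X, 2);                 % mean of each measurement type
X  = X - repmat(mn, 1, n);       % (1) subtract off the mean

Cx = (1 / n) * (X * X');         % covariance matrix  C_X = (1/n) X X^T
[PC, D] = eig(Cx);               % (2) eigenvectors of C_X

[variances, order] = sort(diag(D), 'descend');   % rank-order by variance
PC = PC(:, order);               % columns of PC are the principal components
signals = PC' * X;               % data expressed in the principal component basis

The rows of signals are the projections of the data onto each principal component, ordered from largest to smallest variance.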
VI. A MORE GENERAL SOLUTION USING SVD

This section is the most mathematically involved and can be skipped without much loss of continuity. It is presented solely for completeness. We derive another algebraic solution for PCA and in the process find that PCA is closely related to singular value decomposition (SVD). In fact, the two are so intimately related that the names are often used interchangeably. What we will see, though, is that SVD is a more general method of understanding change of basis.

We begin by quickly deriving the decomposition. In the following section we interpret the decomposition, and in the last section we relate these results to PCA.

A. Singular Value Decomposition

Let X be an arbitrary n × m matrix⁴ and $X^TX$ be a rank r, square, symmetric m × m matrix. In a seemingly unmotivated fashion, let us define all of the quantities of interest.

⁴ Notice that in this section only we are reversing convention from m × n to n × m. The reason for this derivation will become clear in section 6.3.

• $\{\hat{v}_1, \hat{v}_2, \ldots, \hat{v}_r\}$ is the set of orthonormal m × 1 eigenvectors with associated eigenvalues $\{\lambda_1, \lambda_2, \ldots, \lambda_r\}$ for the symmetric matrix $X^TX$:

    $(X^TX)\hat{v}_i = \lambda_i \hat{v}_i$

• $\sigma_i \equiv \sqrt{\lambda_i}$ are positive real and termed the singular values.

• $\{\hat{u}_1, \hat{u}_2, \ldots, \hat{u}_r\}$ is the set of n × 1 vectors defined by $\hat{u}_i \equiv \frac{1}{\sigma_i} X\hat{v}_i$.

The final definition includes two new and unexpected properties.

• $\hat{u}_i \cdot \hat{u}_j = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{otherwise} \end{cases}$

• $\|X\hat{v}_i\| = \sigma_i$

These properties are both proven in Theorem 5. We now have all of the pieces to construct the decomposition. The scalar version of singular value decomposition is just a restatement of the third definition.

    $X\hat{v}_i = \sigma_i \hat{u}_i$    (3)

This result says quite a bit. X multiplied by an eigenvector of $X^TX$ is equal to a scalar times another vector. The set of eigenvectors $\{\hat{v}_1, \hat{v}_2, \ldots, \hat{v}_r\}$ and the set of vectors $\{\hat{u}_1, \hat{u}_2, \ldots, \hat{u}_r\}$ are both orthonormal sets or bases in r-dimensional space.

We can summarize this result for all vectors in one matrix multiplication by following the prescribed construction in Figure 4. We start by constructing a new diagonal matrix Σ,

    $\Sigma \equiv \begin{bmatrix} \sigma_{\tilde{1}} & & & \\ & \ddots & & \mathbf{0} \\ & & \sigma_{\tilde{r}} & \\ & \mathbf{0} & & \mathbf{0} \end{bmatrix}$

where $\sigma_{\tilde{1}} \ge \sigma_{\tilde{2}} \ge \ldots \ge \sigma_{\tilde{r}}$ are the rank-ordered set of singular values. Likewise we construct accompanying orthogonal matrices,

    $V = \begin{bmatrix} \hat{v}_{\tilde{1}} & \hat{v}_{\tilde{2}} & \cdots & \hat{v}_{\tilde{m}} \end{bmatrix}$

    $U = \begin{bmatrix} \hat{u}_{\tilde{1}} & \hat{u}_{\tilde{2}} & \cdots & \hat{u}_{\tilde{n}} \end{bmatrix}$

where we have appended an additional (m − r) and (n − r) orthonormal vectors to "fill up" the matrices for V and U respectively (i.e. to deal with degeneracy issues). Figure 4 provides a graphical representation of how all of the pieces fit together to form the matrix version of SVD,

    $XV = U\Sigma$

where each column of V and U performs the scalar version of the decomposition (Equation 3). Because V is orthogonal, we can multiply both sides by $V^{-1} = V^T$ to arrive at the final form of the decomposition.

    $X = U\Sigma V^T$    (4)

Although derived without motivation, this decomposition is quite powerful. Equation 4 states that any arbitrary matrix X can be converted to an orthogonal matrix, a diagonal matrix and another orthogonal matrix (or a rotation, a stretch and a second rotation). Making sense of Equation 4 is the subject of the next section.

B. Interpreting SVD

The final form of SVD is a concise but thick statement. Instead let us reinterpret Equation 3 as

    $Xa = kb$
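Before unpacking that interpretation, it can help to verify Equations 3 and 4 numerically. The fragment below is a sketch on an arbitrary made-up matrix, using Matlab's built-in svd and eig; each quantity ending in _error should be near machine precision.

% Sketch: numerical check of the SVD relations on an arbitrary n x m matrix.
X = randn(5, 3);                    % n = 5, m = 3, made-up data

[U, S, V] = svd(X);                 % Matlab returns X = U * S * V'

% Equation 4: X is rebuilt exactly from the three factors.
matrix_form_error = norm(X - U * S * V');

% Equation 3: X * v_i = sigma_i * u_i for each singular pair.
sigma = diag(S);
scalar_form_error = norm(X * V(:, 1) - sigma(1) * U(:, 1));

% The singular values are the square roots of the eigenvalues of X' * X.
eigenvalue_error = norm(sigma - sqrt(sort(eig(X' * X), 'descend')));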