where Tr denotes the trace of a matrix.So the problem of minimizing RatioCut(A1,...,Ak)can be rewritten as min Tr(H'LH)subject to H'H I,H as defined in Eq.(5). A1:...Ak Similar to above we now relax the problem by allowing the entries of the matrix H to take arbitrary real values.Then the relaxed problem becomes: in Tr(H'LH)subject to H'H=1. This is the standard form of a trace minimization problem,and again a version of the Rayleigh-Ritz theorem (e.g.,see Section 5.2.2.(6)of Luitkepohl,1997)tells us that the solution is given by choosing H as the matrix which contains the first k eigenvectors of L as columns.We can see that the matrix H is in fact the matrix U used in the unnormalized spectral clustering algorithm as described in Section 4.Again we need to re-convert the real valued solution matrix to a discrete partition.As above,the standard way is to use the k-means algorithms on the rows of U.This leads to the general unnormalized spectral clustering algorithm as presented in Section 4. 5.3 Approximating Ncut Techniques very similar to the ones used for RatioCut can be used to derive normalized spectral clustering as relaxation of minimizing Ncut.In the case k=2 we define the cluster indicator vector f by vol(A) ifvi∈A vol A vol(A) (6) vol(A) ifvi∈A. Similar to above one can check that (Df)'1=0,f'Df vol(V),and f'Lf vol(V)Ncut(A,A).Thus we can rewrite the problem of minimizing Ncut by the equivalent problem min f'Lf subject to f as in (6),Df L1,fDf vol(V). (7) Again we relax the problem by allowing f to take arbitrary real values: re f'Lf subject to Df 11,f'Df vol(V). (8) Now we substitute g:=D1/2f.After substitution,the problem is g subjeet togvol(V). (9) Observe that D-1/2LD-1/2-Lsym,D/21 is the first eigenvector of Lsym,and vol(V)is a constant. Hence,Problem(9)is in the form of the standard Rayleigh-Ritz theorem,and its solution g is given by the second eigenvector of Lsym.Re-substituting f=D-1/2g and using Proposition 3 we see that f is the second eigenvector of Lrw,or equivalently the generalized eigenvector of Lu =ADu. For the case of finding k>2 clusters,we define the indicator vectors hj=(h1.j,...,hn)by hij 1/vvol(Aj)if v:E Aj (i=1,,n;j=1,,) (10) 10 otherwise 12where Tr denotes the trace of a matrix. So the problem of minimizing RatioCut(A1, . . . , Ak) can be rewritten as min A1,...,Ak Tr(H0LH) subject to H0H = I, H as defined in Eq. (5). Similar to above we now relax the problem by allowing the entries of the matrix H to take arbitrary real values. Then the relaxed problem becomes: min H∈❘n×k Tr(H0LH) subject to H0H = I. This is the standard form of a trace minimization problem, and again a version of the Rayleigh-Ritz theorem (e.g., see Section 5.2.2.(6) of L¨utkepohl, 1997) tells us that the solution is given by choosing H as the matrix which contains the first k eigenvectors of L as columns. We can see that the matrix H is in fact the matrix U used in the unnormalized spectral clustering algorithm as described in Section 4. Again we need to re-convert the real valued solution matrix to a discrete partition. As above, the standard way is to use the k-means algorithms on the rows of U. This leads to the general unnormalized spectral clustering algorithm as presented in Section 4. 5.3 Approximating Ncut Techniques very similar to the ones used for RatioCut can be used to derive normalized spectral clustering as relaxation of minimizing Ncut. In the case k = 2 we define the cluster indicator vector f by fi = q vol(A) vol A if vi ∈ A − qvol(A) vol(A) if vi ∈ A. (6) Similar to above one can check that (Df) 0✶ = 0, f 0Df = vol(V ), and f 0Lf = vol(V ) Ncut(A, A). Thus we can rewrite the problem of minimizing Ncut by the equivalent problem min A f 0Lf subject to f as in (6), Df ⊥ ✶, f0Df = vol(V ). (7) Again we relax the problem by allowing f to take arbitrary real values: min f∈❘n f 0Lf subject to Df ⊥ ✶, f0Df = vol(V ). (8) Now we substitute g := D1/2f. After substitution, the problem is min g∈❘n g 0D−1/2LD−1/2 g subject to g ⊥ D1/2✶, kgk 2 = vol(V ). (9) Observe that D−1/2LD−1/2 = Lsym, D1/2✶ is the first eigenvector of Lsym, and vol(V ) is a constant. Hence, Problem (9) is in the form of the standard Rayleigh-Ritz theorem, and its solution g is given by the second eigenvector of Lsym. Re-substituting f = D−1/2 g and using Proposition 3 we see that f is the second eigenvector of Lrw, or equivalently the generalized eigenvector of Lu = λDu. For the case of finding k > 2 clusters, we define the indicator vectors hj = (h1,j , . . . , hn,j ) 0 by hi,j = ( 1/ p vol(Aj ) if vi ∈ Aj 0 otherwise (i = 1, . . . , n; j = 1, . . . , k). (10) 12