Figure 2: The cockroach graph from Guattery and Miller (1998).

Then we set the matrix $H$ as the matrix containing those $k$ indicator vectors as columns. Observe that $H'DH = I$, $h_i'Dh_i = 1$, and $h_i'Lh_i = \operatorname{cut}(A_i, \overline{A_i})/\operatorname{vol}(A_i)$. So we can write the problem of minimizing Ncut as
\[
\min_{A_1,\dots,A_k} \operatorname{Tr}(H'LH) \quad \text{subject to } H'DH = I, \; H \text{ as in (10)}.
\]
Relaxing the discreteness condition and substituting $T = D^{1/2}H$ we obtain the relaxed problem
\[
\min_{T \in \mathbb{R}^{n \times k}} \operatorname{Tr}(T'D^{-1/2}LD^{-1/2}T) \quad \text{subject to } T'T = I. \tag{11}
\]
Again this is the standard trace minimization problem, which is solved by the matrix $T$ containing the first $k$ eigenvectors of $L_{\mathrm{sym}}$ as columns. Re-substituting $H = D^{-1/2}T$ and using Proposition 3, we see that the solution $H$ consists of the first $k$ eigenvectors of the matrix $L_{\mathrm{rw}}$, or equivalently of the first $k$ generalized eigenvectors of $Lu = \lambda Du$. This yields the normalized spectral clustering algorithm according to Shi and Malik (2000).
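As a concrete illustration of the resulting procedure (not part of the tutorial's text), the following minimal sketch builds the unnormalized Laplacian from a given similarity matrix, computes the first $k$ generalized eigenvectors of $Lu = \lambda Du$, and clusters the rows of the resulting matrix with k-means. The function name and the particular SciPy routines are illustrative assumptions, and a reasonably recent SciPy is assumed.

```python
# Minimal sketch of normalized spectral clustering (Shi and Malik, 2000).
# Assumes a symmetric, non-negative similarity matrix W with no isolated
# vertices (so that the degree matrix D is positive definite).
import numpy as np
from scipy.linalg import eigh
from scipy.cluster.vq import kmeans2

def normalized_spectral_clustering(W, k):
    d = W.sum(axis=1)                      # vertex degrees
    D = np.diag(d)                         # degree matrix
    L = D - W                              # unnormalized graph Laplacian
    # First k generalized eigenvectors of L u = lambda D u
    # (equivalently, the first k eigenvectors of L_rw = D^{-1} L).
    _, H = eigh(L, D, subset_by_index=[0, k - 1])
    # Cluster the rows of H (one row per vertex) with k-means.
    _, labels = kmeans2(H, k, minit='++', seed=0)
    return labels
```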
5.4 Comments on the relaxation approach

There are several comments we should make about this derivation of spectral clustering. Most importantly, there is no guarantee whatsoever on the quality of the solution of the relaxed problem compared to the exact solution. That is, if $A_1, \dots, A_k$ is the exact solution of minimizing RatioCut and $B_1, \dots, B_k$ is the solution constructed by unnormalized spectral clustering, then $\operatorname{RatioCut}(B_1, \dots, B_k) - \operatorname{RatioCut}(A_1, \dots, A_k)$ can be arbitrarily large. Several examples for this can be found in Guattery and Miller (1998). For instance, the authors consider a very simple class of graphs called "cockroach graphs". Those graphs essentially look like a ladder with a few rungs removed, see Figure 2. Obviously, the ideal RatioCut for $k = 2$ just cuts the ladder by a vertical cut such that $A = \{v_1, \dots, v_k, v_{2k+1}, \dots, v_{3k}\}$ and $\overline{A} = \{v_{k+1}, \dots, v_{2k}, v_{3k+1}, \dots, v_{4k}\}$. This cut is perfectly balanced, with $|A| = |\overline{A}| = 2k$ and $\operatorname{cut}(A, \overline{A}) = 2$. However, by studying the properties of the second eigenvector of the unnormalized graph Laplacian of cockroach graphs, the authors prove that unnormalized spectral clustering always cuts horizontally through the ladder, constructing the sets $B = \{v_1, \dots, v_{2k}\}$ and $\overline{B} = \{v_{2k+1}, \dots, v_{4k}\}$. This also results in a balanced cut, but now we cut $k$ edges instead of just 2. So $\operatorname{RatioCut}(A, \overline{A}) = 2/k$, while $\operatorname{RatioCut}(B, \overline{B}) = 1$. This means that, compared to the optimal cut, the RatioCut value obtained by spectral clustering is worse by a factor of $k/2$, that is, by a factor on the order of $n$. Several other papers investigate the quality of the clustering constructed by spectral clustering, for example Spielman and Teng (1996) (for unnormalized spectral clustering) and Kannan, Vempala, and Vetta (2004) (for normalized spectral clustering). In general it is known that efficient algorithms to approximate balanced graph cuts up to a constant factor do not exist. On the contrary, this approximation problem can itself be NP hard (Bui and Jones, 1992).
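To make the cockroach-graph example tangible, here is a small numerical sketch. The graph construction used below (two paths $v_1, \dots, v_{2k}$ and $v_{2k+1}, \dots, v_{4k}$, with rungs joining $v_i$ and $v_{2k+i}$ only for $i = k+1, \dots, 2k$) is an assumption inferred from the description above, not taken verbatim from Guattery and Miller (1998), and the helper names are hypothetical. It compares the RatioCut of the ideal vertical cut with that of the cut induced by the second eigenvector of the unnormalized Laplacian.

```python
# Sketch of the cockroach-graph comparison (assumed construction, see above).
import numpy as np

def cockroach_adjacency(k):
    n = 4 * k
    W = np.zeros((n, n))
    for i in range(2 * k - 1):            # top path v_1 .. v_2k
        W[i, i + 1] = W[i + 1, i] = 1
    for i in range(2 * k, 4 * k - 1):     # bottom path v_2k+1 .. v_4k
        W[i, i + 1] = W[i + 1, i] = 1
    for i in range(k, 2 * k):             # rungs on one half of the ladder only
        W[i, i + 2 * k] = W[i + 2 * k, i] = 1
    return W

def ratiocut(W, mask):
    cut = W[mask][:, ~mask].sum()         # total weight of edges crossing the cut
    return cut * (1.0 / mask.sum() + 1.0 / (~mask).sum())

k = 10
W = cockroach_adjacency(k)
L = np.diag(W.sum(axis=1)) - W            # unnormalized Laplacian
_, vecs = np.linalg.eigh(L)
fiedler = vecs[:, 1]                      # second eigenvector
B = fiedler < np.median(fiedler)          # balanced split along the eigenvector
A = np.zeros(4 * k, dtype=bool)
A[:k] = True
A[2 * k:3 * k] = True                     # ideal vertical cut from the text
print(ratiocut(W, A))                     # 2/k = 0.2
print(ratiocut(W, B))                     # expected: 1.0 (horizontal cut through the k rungs)
```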