IEEE TRANSACTIONS ON INFORMATION THEO_中国高校课件下载中心

正在加载图片...

IEEE TRANSACTIONS ON INFORMATION THEORY TABLE II:Comparison of methods for approximating the von B.Spectral Descriptor of Graphs and Its Structural Counter- Neumann graph entropy in terms of fulfilled()and missing part (x)properties. Researchers in spectral graph theory have always been inter- 9] 1 [10]Structural Information (Ours) ested in establishing a connection between the combinatorial Provable accuracy characteristics of a graph and the algebraic properties of its Scalability associated matrices.For example,the algebraic connectivity Interpretability (also known as Fiedler eigenvalue),defined as the second smallest eigenvalue of a graph Laplacian matrix,has been used to measure the robustness [16]and synchronizability [32]of graph entropy,whereas the last issue comes from spectral graphs.The magnitude of the algebraic connectivity has also clustering [28]and two-dimensional structural information been found to reflect how well connected the overall graph based clustering [3]. is [17].As another example,the Fiedler vector,defined as the eigenvector corresponding to the Fiedler eigenvalue of a graph Laplacian matrix,has been found to be a good sign of A.Approximate Computation of the von Neumann Graph the bi-partition structure of a graph [33].However,there are Entropy some other spectral descriptors that have found applications In an effort to overcome the computational inefficiency of in graph analytics,but require more structural interpretations, the von Neumann graph entropy,past works have resorted such as the heat kernel trace [34],[35]and von Neumann graph entropy. to various numerical approximations.Chen et al.[9]first Simmons et al.[36]suggest to interpret the von Neumann compute a quadratic approximation of the entropy via Taylor graph entropy as the centralization of graphs,which is very expansion,then derive two finer approximations with accuracy guarantee by spectrum-based and degree-based rescaling.re- similar to our interpretation using the structural information. spectively.Before Chen's work,the Taylor expansion is widely They derive both upper and lower bounds on the von Neumann adopted to give computationally efficient approximations [29], graph entropy in terms of graph centralization under some but there is no theoretical guarantee on the approximation hard assumptions on the range of the von Neumann graph accuracy.Following Chen's work,Choi et al.[10]propose entropy.Therefore,their results cannot be directly converted several more complicated quadratic approximations based on to accuracy guaranteed approximations of the von Neumann advanced polynomial approximation methods the superiority graph entropy for arbitrary simple graphs.By constrast,our work shows that the structural information is an accurate, of which is verified through experiments. scalable,and interpretable proxy of the von Neumann graph Besides,there is a trend to approximate spectral sums using stochastic trace estimation based approximations [30], entropy for arbitrary simple graphs.Besides,the techniques used in our proof are also quite different from [36]. the merit of which is the provable error-bounded estimation of the spectral sums.For example,Kontopoulou et al.[11 propose three randomized algorithms based on Taylor series, C.Spectrum,Detectability,and Significance of Community Chebyshev polynomials,and random projection matrices to Structure approximate the von Neumann entropy of density matrices.As Community structure is one of the most recognized char- another example,based on the stochastic Lanczos quadrature acteristics of large scale real-world graphs in which similar technique [31].Tsitsulin et al.[1]propose an efficient and nodes tend to cluster together.Thus it has found applications effective approximation technique called SLaQ to estimate in classification,recommendation.and link prediction.etc. the von Neumann entropy and other spectral descriptors for Started from the Fiedler vector,spectral algorithms have been web-scale graphs.However,the approximation error bound of widely studied for detecting the community structure in a SLaQ for the von Neumann graph entropy is not provided. graph [37].[38]because they are simple and theoretically The disadvantages of such stochastic approximations are also warranted.Cheeger's inequality hc v2u2 bounds obvious;their computational efficiency depends on the number the conductance hc of a graph G using the second smallest of random vectors used in stochastic trace estimation,and eigenvalue of the normalized Laplacian matrix.This is later they are not suitable for applications like anomaly detection generalized to multiway spectral partitioning[39]yielding the in graph streams and entropy-driven network design. higher-order Cheeger inequalities学≤pc(k)≤O(k2), The comparison of methods for approximating the von for each k,where uk is the k-th smallest eigenvalue of Neumann graph entropy is presented in Table II.One of the the normalized Lapalcian matrix and pc(k)is the k-way common drawbacks of the aforementioned methods is the expansion constant.Since both hc and Pc(k)measure the lack of interpretability,that is,none of these methods provide significance of community structure,the graph spectrum is enough evidence to interpret this spectrum based entropy closely related to the community structure. measure in terms of structural patterns.By contrast,as a The coupling between graph spectrum and community good proxy of the von Neumann graph entropy,the structural structure has been empirically validated in [37]where New- information offers us the intuition that the spectrum based man found that if the second smallest eigenvalue A2 of the entropy measure is closely related to the degree heterogeneity Laplacian matrix is well separated from the eigenvalues above of graphs. it,the spectral clustering based on Lapalcian matrix oftenIEEE TRANSACTIONS ON INFORMATION THEORY 4 TABLE II: Comparison of methods for approximating the von Neumann graph entropy in terms of fulfilled (✓) and missing (✗) properties. [9] [1] [10] Structural Information (Ours) Provable accuracy ✓ ✗ ✗ ✓ Scalability ✓ ✓ ✗ ✓ Interpretability ✗ ✗ ✗ ✓ graph entropy, whereas the last issue comes from spectral clustering [28] and two-dimensional structural information based clustering [3]. A. Approximate Computation of the von Neumann Graph Entropy In an effort to overcome the computational inefficiency of the von Neumann graph entropy, past works have resorted to various numerical approximations. Chen et al. [9] first compute a quadratic approximation of the entropy via Taylor expansion, then derive two finer approximations with accuracy guarantee by spectrum-based and degree-based rescaling, respectively. Before Chen’s work, the Taylor expansion is widely adopted to give computationally efficient approximations [29], but there is no theoretical guarantee on the approximation accuracy. Following Chen’s work, Choi et al. [10] propose several more complicated quadratic approximations based on advanced polynomial approximation methods the superiority of which is verified through experiments. Besides, there is a trend to approximate spectral sums using stochastic trace estimation based approximations [30], the merit of which is the provable error-bounded estimation of the spectral sums. For example, Kontopoulou et al. [11] propose three randomized algorithms based on Taylor series, Chebyshev polynomials, and random projection matrices to approximate the von Neumann entropy of density matrices. As another example, based on the stochastic Lanczos quadrature technique [31], Tsitsulin et al. [1] propose an efficient and effective approximation technique called SLaQ to estimate the von Neumann entropy and other spectral descriptors for web-scale graphs. However, the approximation error bound of SLaQ for the von Neumann graph entropy is not provided. The disadvantages of such stochastic approximations are also obvious; their computational efficiency depends on the number of random vectors used in stochastic trace estimation, and they are not suitable for applications like anomaly detection in graph streams and entropy-driven network design. The comparison of methods for approximating the von Neumann graph entropy is presented in Table II. One of the common drawbacks of the aforementioned methods is the lack of interpretability, that is, none of these methods provide enough evidence to interpret this spectrum based entropy measure in terms of structural patterns. By contrast, as a good proxy of the von Neumann graph entropy, the structural information offers us the intuition that the spectrum based entropy measure is closely related to the degree heterogeneity of graphs. B. Spectral Descriptor of Graphs and Its Structural Counterpart Researchers in spectral graph theory have always been interested in establishing a connection between the combinatorial characteristics of a graph and the algebraic properties of its associated matrices. For example, the algebraic connectivity (also known as Fiedler eigenvalue), defined as the second smallest eigenvalue of a graph Laplacian matrix, has been used to measure the robustness [16] and synchronizability [32] of graphs. The magnitude of the algebraic connectivity has also been found to reflect how well connected the overall graph is [17]. As another example, the Fiedler vector, defined as the eigenvector corresponding to the Fiedler eigenvalue of a graph Laplacian matrix, has been found to be a good sign of the bi-partition structure of a graph [33]. However, there are some other spectral descriptors that have found applications in graph analytics, but require more structural interpretations, such as the heat kernel trace [34], [35] and von Neumann graph entropy. Simmons et al. [36] suggest to interpret the von Neumann graph entropy as the centralization of graphs, which is very similar to our interpretation using the structural information. They derive both upper and lower bounds on the von Neumann graph entropy in terms of graph centralization under some hard assumptions on the range of the von Neumann graph entropy. Therefore, their results cannot be directly converted to accuracy guaranteed approximations of the von Neumann graph entropy for arbitrary simple graphs. By constrast, our work shows that the structural information is an accurate, scalable, and interpretable proxy of the von Neumann graph entropy for arbitrary simple graphs. Besides, the techniques used in our proof are also quite different from [36]. C. Spectrum, Detectability, and Significance of Community Structure Community structure is one of the most recognized characteristics of large scale real-world graphs in which similar nodes tend to cluster together. Thus it has found applications in classification, recommendation, and link prediction, etc. Started from the Fiedler vector, spectral algorithms have been widely studied for detecting the community structure in a graph [37], [38] because they are simple and theoretically warranted. Cheeger’s inequality µ2 2 ≤ hG ≤ √ 2µ2 bounds the conductance hG of a graph G using the second smallest eigenvalue of the normalized Laplacian matrix. This is later generalized to multiway spectral partitioning [39] yielding the higher-order Cheeger inequalities µk 2 ≤ ρG(k) ≤ O(k 2 ) √µk for each k, where µk is the k-th smallest eigenvalue of the normalized Lapalcian matrix and ρG(k) is the k-way expansion constant. Since both hG and ρG(k) measure the significance of community structure, the graph spectrum is closely related to the community structure. The coupling between graph spectrum and community structure has been empirically validated in [37] where Newman found that if the second smallest eigenvalue λ2 of the Laplacian matrix is well separated from the eigenvalues above it, the spectral clustering based on Lapalcian matrix often

<<向上翻页向下翻页>>