正在加载图片...
Modularity and community structure in networks M.E. Newman* Department of Physics and Center for the Study of Complex Systems, University of Michigan, Ann Arbor, MI 48109 Edited by Brian Skyrms, University of California, Irvine, CA, and approved April 19, 2006(received for review February 26, 2006) Many networks of interest in the sciences, including social net- rks, computer networks, and metabolic and regulatory net works, are found to divide naturally into communities or modules. The problem of detecting and characterizing this community struc ture is one of the outstanding issues in the study of networked systems. One highly effective approach is the optimization of the quality function known as"modularity"over the possible divisions of a network. Here l show that the modularity can be expressed in terms of the eigenvectors of a characteristic matrix for the net- work, which I call the modularity matrix, and that this expressic leads to a spectral algorithm for community detection that returns results of demonstrably higher quality than competing methods in shorter running times. I illustrate the method with appl ns to several published network data sets clustering I partitioning I modules metabolic network I social network M any systems of scientific interest can be represented as networks, sets of nodes or vertices joined in pairs by lines or edges. Examples include the internet and the worldwide web, Fig. 1. The vertices i etworks fall natural to groups or commu- metabolic networks, food webs, neural networks, communica- nities, sets of vertices(shaded)within which there are many edges, with only tion and distribution networks, and social networks. The study of a smaller number of edges between vertices of different groups networked systems has a history stretching back several centu last decade cially in the mathematical sciences, partly as a goals of the two camps that make quite different technical result of the increasing availability of accurate large-scale data approaches desirable. A typical problem in graph partitioning is analyses of these data have revealed some une d structural h of features, such as high network transitivity(1), power-law degree cessor communication. In such an application the number of distributions(2), and the existence of repeated local motifs(3); processors is usually known in advance and at least an approx- see refs. 4-6 for review imate figure for the number of tasks that each processor ca One issue that has received a considerable amount of attention handle. Thus we know the number and size of the groups into the detection and characterization of community structure in the best division of the network regardless of whether a good groups of vertices, with only sparser connections between groups division even exists; there is little point in an algorithm or (Fig. 1). The ability to detect such groups could be of significant method that fails to divide the network in some cases. practical importance. For instance, groups within the worldwide Community structure detection, by contrast, is perhaps best web might correspond to sets of web pages on related topics( 9); thought of as a data analysis technique used to shed light on the or communities(10). Merely the finding that a network contains works, internet and web data, or biochemical networks. Com- tightly knit groups at all can convey useful information: if a munity structure methods normally assume that the network of metabolic network were divided into such groups, for instance, interest divides naturally into subgroups and the experimenter's it could provide evidence for a modular view of the network's job is to find those groups. The number and size of the groups dynamics, with different groups of nodes performing different are thus determined by the network itself and not by the functions with some degree of independence (11, 12) experimenter. Moreover, community structure methods may Past work on methods for discovering groups in networks explicitly admit the possibility that no good division of the divides into two principal lines of research, both with long network exists, an outcome that is itself considered to be of histories. The first, which goes by the name of graph partitioning, interest for the light it sheds on the topology of the network has been pursued particularly in computer science and related This article focuses on community structure detection in fields, with applications in parallel computing and integrated network data sets representing real-world systems of interest circuit design, among other areas(13, 14 ). The second, identified However, both the similarities and differences between commu- names sue Ich as block modeling, hierarchical clustering, or nity ure methods and graph partitioning will motivate mmunity structure detection, has been pursued by sociologists many developments that follo and more recently by physicists, biologists, and applied mathe maticians, with applications especially to social and biological networks (7, 15, 16 Conflict of statement: No conflic It is tempting to suggest that these two lines of research are This paper was submitted directly (Track m to the PNAS office ally addressing the same question, albeit by somewhat different *E-mail: mejn@umich.edu means. There are, however, important differences between the 02006 by The National Academy of Sciences of the USA pnas. org/cgi/doi/10.1073/pnas.060160210 PNAs|June6,2006|vol.103|no.23|8577-8582Modularity and community structure in networks M. E. J. Newman* Department of Physics and Center for the Study of Complex Systems, University of Michigan, Ann Arbor, MI 48109 Edited by Brian Skyrms, University of California, Irvine, CA, and approved April 19, 2006 (received for review February 26, 2006) Many networks of interest in the sciences, including social net￾works, computer networks, and metabolic and regulatory net￾works, are found to divide naturally into communities or modules. The problem of detecting and characterizing this community struc￾ture is one of the outstanding issues in the study of networked systems. One highly effective approach is the optimization of the quality function known as ‘‘modularity’’ over the possible divisions of a network. Here I show that the modularity can be expressed in terms of the eigenvectors of a characteristic matrix for the net￾work, which I call the modularity matrix, and that this expression leads to a spectral algorithm for community detection that returns results of demonstrably higher quality than competing methods in shorter running times. I illustrate the method with applications to several published network data sets. clustering  partitioning  modules  metabolic network  social network Many systems of scientific interest can be represented as networks, sets of nodes or vertices joined in pairs by lines or edges. Examples include the internet and the worldwide web, metabolic networks, food webs, neural networks, communica￾tion and distribution networks, and social networks. The study of networked systems has a history stretching back several centu￾ries, but it has experienced a particular surge of interest in the last decade, especially in the mathematical sciences, partly as a result of the increasing availability of accurate large-scale data describing the topology of networks in the real world. Statistical analyses of these data have revealed some unexpected structural features, such as high network transitivity (1), power-law degree distributions (2), and the existence of repeated local motifs (3); see refs. 4–6 for reviews. One issue that has received a considerable amount of attention is the detection and characterization of community structure in networks (7, 8), meaning the appearance of densely connected groups of vertices, with only sparser connections between groups (Fig. 1). The ability to detect such groups could be of significant practical importance. For instance, groups within the worldwide web might correspond to sets of web pages on related topics (9); groups within social networks might correspond to social units or communities (10). Merely the finding that a network contains tightly knit groups at all can convey useful information: if a metabolic network were divided into such groups, for instance, it could provide evidence for a modular view of the network’s dynamics, with different groups of nodes performing different functions with some degree of independence (11, 12). Past work on methods for discovering groups in networks divides into two principal lines of research, both with long histories. The first, which goes by the name of graph partitioning, has been pursued particularly in computer science and related fields, with applications in parallel computing and integrated circuit design, among other areas (13, 14). The second, identified by names such as block modeling, hierarchical clustering, or community structure detection, has been pursued by sociologists and more recently by physicists, biologists, and applied mathe￾maticians, with applications especially to social and biological networks (7, 15, 16). It is tempting to suggest that these two lines of research are really addressing the same question, albeit by somewhat different means. There are, however, important differences between the goals of the two camps that make quite different technical approaches desirable. A typical problem in graph partitioning is the division of a set of tasks between the processors of a parallel computer so as to minimize the necessary amount of interpro￾cessor communication. In such an application the number of processors is usually known in advance and at least an approx￾imate figure for the number of tasks that each processor can handle. Thus we know the number and size of the groups into which the network is to be split. Also, the goal is usually to find the best division of the network regardless of whether a good division even exists; there is little point in an algorithm or method that fails to divide the network in some cases. Community structure detection, by contrast, is perhaps best thought of as a data analysis technique used to shed light on the structure of large-scale network data sets, such as social net￾works, internet and web data, or biochemical networks. Com￾munity structure methods normally assume that the network of interest divides naturally into subgroups and the experimenter’s job is to find those groups. The number and size of the groups are thus determined by the network itself and not by the experimenter. Moreover, community structure methods may explicitly admit the possibility that no good division of the network exists, an outcome that is itself considered to be of interest for the light it sheds on the topology of the network. This article focuses on community structure detection in network data sets representing real-world systems of interest. However, both the similarities and differences between commu￾nity structure methods and graph partitioning will motivate many of the developments that follow. Conflict of interest statement: No conflicts declared. This paper was submitted directly (Track II) to the PNAS office. *E-mail: mejn@umich.edu. © 2006 by The National Academy of Sciences of the USA Fig. 1. The vertices in many networks fall naturally into groups or commu￾nities, sets of vertices (shaded) within which there are many edges, with only a smaller number of edges between vertices of different groups. www.pnas.orgcgidoi10.1073pnas.0601602103 PNAS  June 6, 2006  vol. 103  no. 23  8577– 8582 APPLIED MATHEMATICS
向下翻页>>
©2008-现在 cucdc.com 高等教育资讯网 版权所有