Community Search over Big Graphs: Models, algorithms, and Opportunities Xin Huang*, Laks V.S. Lakshmanan, Jianliang Xu UNiversity of british Columbia, Vancouver, canada Hong Kong Baptist University, Hong Kong, China xinhuang@comp. hkbu. edu. hk, laks@ubc. cS. ca, xul@comp. hkbu. edu. hk UBC 浸會 历 1956 BAPTIS
Community Search over Big Graphs: Models, Algorithms, and Opportunities Xin Huang∗† , Laks V.S. Lakshmanan∗ , Jianliang Xu† ∗University of British Columbia, Vancouver, Canada †Hong Kong Baptist University, Hong Kong, China xinhuang@comp.hkbu.edu.hk, laks@ubc.cs.ca, xujl@comp.hkbu.edu.hk
Tutorial outline ntroduction, Motivations, and challenges Networks Community Detection Community Search(4 Parts Densely-connected community search Attributed community search Social circle discovery Querying geo-social groups Future Work Open problems
Tutorial Outline • Introduction, Motivations, and Challenges • Networks & Community Detection • Community Search (4 Parts) – Densely-connected community search – Attributed community search – Social circle discovery – Querying geo-social groups • Future Work & Open Problems 2
Networks Networks are everywhere(e.g. chemistry biology social networks the Web, etc
• Networks are everywhere (e.g. chemistry, biology, social networks, the Web, etc.) 3 Networks
Communities Communities naturally exist in networks. Blogosphere
Communities • Communities naturally exist in networks. Blogosphere 4
Community structure Community structure: Nodes with a shared latent property, densely inter-connected Many reasons for communities to be formed Social Networks Citation Networks World wide web biological Networks
• Community structure: Nodes with a shared latent property, densely inter-connected . • Many reasons for communities to be formed: Social Networks Citation Networks World Wide Web Biological Networks 5 Community Structure
Basis of community Formation The strength of weak ties [Mark Granovetter, 1973] and the models of small-world [Strogatz and Watts, Nature 981 both suggest Strong ties are well embedded in the network Weak ties span long ranges Given a network how do we find all communities?
• The strength of weak ties [Mark Granovetter,1973] and the models of small-world [Strogatz and Watts, Nature’98] both suggest – Strong ties are well embedded in the network – Weak ties span long ranges • Given a network, how do we find all communities? 6 Basis of Community Formation
Community detection Q: Given a network how do we find all communities? A: Find weak ties and identify communities Betweenness centrality [Girvan and Newman, PNAS 02]. Modularity [Newman, PNAS 061 Graph partitioning methods [Karypis and Kumar, SISC08 SFI collaboration network[Newman
• Q: Given a network, how do we find all communities? • A: Find weak ties and identify communities – Betweenness centrality [Girvan and Newman, PNAS’02], – Modularity [Newman, PNAS’06] – Graph partitioning methods [Karypis and Kumar, SISC’08] SFI collaboration network [Newman] 7 Community Detection
[Palla et al. Nature/) Over lapping Communities Communities defined by different nodes in a network may be quite different Scientists Physicists Department of Biological Physics Mathematicians ologists zoom Hobb Scientific Community Family Friends Schoolmates 8
Overlapping Communities • Communities defined by different nodes in a network may be quite different. 8 [Palla et al. Nature’05])
Community search Problem: Given a set of query nodes, find densely connected communities containing them query vertex State-of-the-art research focus Simple and static graphs Evolving, attributed, and oo 8a8 location-based big graphs
Community Search • Problem: Given a set of query nodes, find densely connected communities containing them. • State-of-the-art research focus: Simple and static graphs → Evolving, attributed, and location-based big graphs query vertex 9
Community Detection V.S. Community search Community detection: identify all communities. fundamental widely studied global computation(expensive) static graphs(hard to handle evolving graphs) Community search: find query-dependent communities useful less studied user-centered personalized search dynamIc graphs
Community Detection v.s. Community Search • Community detection: identify all communities. – fundamental & widely studied – global computation (expensive) – static graphs (hard to handle evolving graphs) • Community search: find query-dependent communities – useful & less studied – user-centered& personalized search – dynamic graphs 10