Graph Search Social Networks 日 Shuai ma 兆京航堂航天大学 BEIHANG UNIVERSITY
Shuai Ma Graph Search & Social Networks
The soil Food Web 6 Graphs are everywhere, and quite a few are huge graphs 山东省高遮公略图一 员普2 ▲圆 原照R 2
2 Graphs are everywhere, and quite a few are huge graphs!
Application Scenarios Software plagiarism detection [1 Traditional plagiarism detection tools may not be applicable for serious software plagiarism problems A new tool based on graph pattern matching Represent the source codes as program dependence graphs 2 se graph pattern matching to detect plagiarism int sum(int array [], int court) ant 1, sun; 2: declaration. int sum o for (i= O; i< count: 1++)[ sum= add(sum, array[i])i 6: assignment, i=0(0: declaration, int count retm int add(int ant 5: control i< count 1: declaration. retum a+ b: 8. increment i++ 7: assignment, sum=0 9: assignment, sum=addo 3: declaration. int array 4: retum retum sum 10: call-site. add( sum. armaylil
Application Scenarios 3 • Traditional plagiarism detection tools may not be applicable for serious software plagiarism problems. • A new tool based on graph pattern matching – Represent the source codes as program dependence graphs [2] . – Use graph pattern matching to detect plagiarism. Software plagiarism detection [1]
Application Scenarios Recommender systems 3 Recommendations have found its usage in many emerging specific applications, such as social matching systems Graph search is a useful tool for recommendations a headhunter wants to find a biologist (Bio)to help a group of software G engineers(sEs) analyze genetic data DM1 To do this, (s)he uses an expertise HRI Bio1 recommendation network g. as Bi depicted in g, where All Al2 v a node denotes a person labeled SE1 B102 Bio3 with expertise, and SE2 DM2 an edge indicates recommendation AlI DM Alk DMK e.g., HR, recommends Bio,, and Al, recommends DM1
Application Scenarios 4 • Recommendations have found its usage in many emerging specific applications, such as social matching systems. • Graph search is a useful tool for recommendations. Recommender systems [3] – A headhunter wants to find a biologist (Bio) to help a group of software engineers (SEs) analyze genetic data. – To do this, (s)he uses an expertise recommendation network G, as depicted in G, where ✓ a node denotes a person labeled with expertise, and ✓ an edge indicates recommendation, e.g., HR1 recommends Bio1 , and AI1 recommends DM1
Application Scenarios Transport routing 41 Graph search is a common practice in transportation networks, due to the wide application of Location-Based Services Exam ple: Mark, a driver in the U.S. who wants to go from Irvine to Riverside in california If Mark wants to reach Riverside by his car in the shortest time, the problem can be expressed as the shortest path problem. Then by using existing methods, we can get the shortest path from irvine, ca to Riverside, CA traveling along State Route 261 If Mark drives a truck delivering hazardous materials may not be allowed to cross over some bridges or railroad crossings this time we can use a pattern graph containing specific route constraints(such as regular expressions)to find the optimal transport routes
Application Scenarios 5 • Graph search is a common practice in transportation networks, due to the wide application of Location-Based Services. • Example: Mark, a driver in the U.S. who wants to go from Irvine to Riverside in California. – If Mark wants to reach Riverside by his car in the shortest time, the problem can be expressed as the shortest path problem. Then by using existing methods, we can get the shortest path from Irvine, CA to Riverside, CA traveling along State Route 261. Transport routing [4] – If Mark drives a truck delivering hazardous materials may not be allowed to cross over some bridges or railroad crossings. This time we can use a pattern graph containing specific route constraints (such as regular expressions) to find the optimal transport routes
Application Scenarios Biological data analysis 51 A large amount of biological data can be represented by graphs, and it is significant to analyze biological data with graph search techniques Protein-interaction network(PIN) analysis provides valuable insight into an organisms functional organization and evolutionary behavior.” For example, one can get the topological properties of a Pin formed by high- confidence human protein interactions obtained from various public interaction databases by Pin analysis 6
Application Scenarios 6 • A large amount of biological data can be represented by graphs, and it is significant to analyze biological data with graph search techniques. – “Protein-interaction network (PIN) analysis provides valuable insight into an organism’s functional organization and evolutionary behavior.” Biological data analysis [5] – For example, one can get the topological properties of a PIN formed by highconfidence human protein interactions obtained from various public interaction databases by PIN analysis
Outline What is graph search? Graph search why bother? Three Types of Graph Search Challenges Related techniques Summary
Outline 7 • What is graph search? • Graph search, why bother? • Three Types of Graph Search • Challenges & Related techniques • Summary
What is Graph Search? A unified definition [6](in the name of graph matching) Given a pattern graph Gp and a data graph G check whether G."matches"G, and identify all"matched subgraphs Remarks Two classes of queries Boolean queries(Yes or No) Functional queries, which may use Boolean queries as a subroutine Graphs contain a set of nodes and a set of edges, typically with labels Pattern graphs are typically small( e.g., 10), but data graphs are usually huge(e.g, 108)
What is Graph Search? 9 A unified definition [6] (in the name of graph matching): Remarks: • Given a pattern graph Gp and a data graph G: – check whether Gp ‘‘matches’’ G; and – identify all ‘‘matched’’ subgraphs. – Two classes of queries: – Boolean queries (Yes or No) – Functional queries, which may use Boolean queries as a subroutine – Graphs contain a set of nodes and a set of edges, typically with labels – Pattern graphs are typically small (e.g., 10), but data graphs are usually huge (e.g., 108 )
What is Graph Search? Different semantics of"match" implies different" types" of graph search, including, but not limited to, the following Shortest paths/distances!4 Subgraph isomorphism[) Graph homomorphism and its extensions!101 Graph simulation and its extensions 8, 9 Graph keyword search! Neighborhood queries!111 Graph search is a very general concept 10
What is Graph Search? 10 Different semantics of “match” implies different “types” of graph search, including, but not limited to, the following: • Shortest paths/distances[4] • Subgraph isomorphism[12] • Graph homomorphism and its extensions[10] • Graph simulation and its extensions[8,9] • Graph keyword search[7] • Neighborhood queries[11] • … Graph search is a very general concept!