Lecture 6 Graph Mining
Lecture 6 Graph Mining
Background Graph Mining Networks/Graphs--exist everywhere. Food Web of Smallmouth Bass Leech Little Rock Lake (Cannibal) 1st Tropic Level Mosty Phytoplankton 2nd Trophic Level Many Zooplankton Internet Food Web The Social Stcture of Coutryside"Schod District T和出Ody2 Friendship Network Co-author networks
Internet Food Web Friendship Network Co-author networks Networks/Graphs -- exist everywhere. Background Graph Mining
Background Graph Mining Origin of Graph Theory: Seven Bridges of Konigsberg(proposed by Leonhard Euler()) --wiki The city of Konigsberg in Prussia(now Kaliningrad,Russia)was set on both sides of the Pregel River,and included two large islands which were connected to each other,or to the two mainland portions of the city,by seven bridges.The problem was to devise a walk through the city that would cross each of those bridges once and only once
Origin of Graph Theory: Seven Bridges of Königsberg (proposed by Leonhard Euler(欧拉)) The city of Königsberg in Prussia (now Kaliningrad, Russia) was set on both sides of the Pregel River, and included two large islands which were connected to each other, or to the two mainland portions of the city, by seven bridges. The problem was to devise a walk through the city that would cross each of those bridges once and only once. --wiki Background Graph Mining
Background Graph Mining Network/Graph-construct Data Instance Graph Instance Element Vertex Element's Attributes >Vertex Label Relation Between Edge Two Elements Type Of Relation Edge Label Relation between a →Hyper Edge Set of Elements Provide enormous flexibility for modeling the underlying data as they allow the modeler to decide on what the elements should be and the type of relations to be modeled
Network/Graph – construct Element Vertex Element’s Attributes Relation Between Two Elements Type Of Relation Vertex Label Edge Label Edge Data Instance Graph Instance Relation between a Set of Elements Hyper Edge Provide enormous flexibility for modeling the underlying data as they allow the modeler to decide on what the elements should be and the type of relations to be modeled Background Graph Mining
Background Graph Mining Networks/Graphs-applications in real-world? Information Maximization ·viral'marketing. Web-log ('blog')news propagation. Computer network security Email/IP traffic and anomaly detection ·Prediction The prediction of flow within networks
Networks/Graphs – applications in real-world? • Information Maximization • ‘viral’ marketing. • Web-log (‘blog’) news propagation. • Computer network security • Email/IP traffic and anomaly detection • Prediction • The prediction of flow within networks Background Graph Mining
Background Graph Mining Networks types: REGULAR RANDOM SMALL WORLD SCALE-FREE 1950s,Erdos 1998,Watts 1999,Barabasi and Renyi and Strogatz and Albert T
Networks types: REGULAR SMALL WORLD RANDOM SCALE-FREE T 1950s,Erdos and Renyi 1998,Watts and Strogatz 1999,Barabasi and Albert Background Graph Mining
Background Graph Mining Small world network: Six degree of separation The average distance between two randomly individuals in the USA:6 The average distance between two 10234④5⑤6( randomly users in Facebook(721 million active users,69 billion links )4.74
Small world network: Six degree of separation ¾ The average distance between two randomly individuals in the USA : 6 ¾ The average distance between two randomly users in Facebook(721 million active users, 69 billion links ) : 4.74 ¾ … Background Graph Mining
Background Graph Mining Scale-Free networks: 100 10 10 10 10 10 10 degree Degree distribution 62 Social network A scale-free network is a network whose degree distribution follows a power law. That i is,the fraction P的 of nodes in the network DEBREE OF NODES having k connections to other nodes goes for large values of k as P(k)~kT,where r is a parameter Power law whose value is typically in the range(2,3)
Scale-Free networks: A scale-free network is a network whose degree distribution follows a power law. That is, the fraction P(k) of nodes in the network having k connections to other nodes goes for large values of k as ି, where is a parameter whose value is typically in the range (2,3). Social network Degree distribution Power law Background Graph Mining
Key Node ldentification
Key Node Identification
Key Node ldentification Graph Mining >How should we select some special customers to viral market efficiently if we have a limited budget? >Which countries have a huge impact on the global trade? >Who hold great power on the terrorist organization? >Which node is the weakest position in the power supply network? 丝绸之路 微哈拉沙流 4m千 来品 八千方勿之
¾ How should we select some special customers to viral market efficiently if we have a limited budget? ¾ Which countries have a huge impact on the global trade? ¾ Who hold great power on the terrorist organization? ¾ Which node is the weakest position in the power supply network? Background Key Node Identification Graph Mining