Recent Development of Heterogeneous Information Networks: From Meta-paths to Meta-graphs
Yangqiu Song
Department of CSE, The Hong Kong University of Science and Technology (HKUST), Hong Kong
Homogeneous Graph/Networks
[Figure panels: Social Network, Transportation Network, Gene Network, Food Network. Source: http://snap.stanford.edu/higher-order/higher-order-sm-science16.pdf]
Heterogeneous Information Networks
• Yizhou Sun, Jiawei Han, et al., 2009-2012 (UIUC)
– Entity type mapping: V -> A
– Link type mapping: E -> R
[Figure: example heterogeneous network with typed links. Source: http://web.cs.ucla.edu/~yzsun/tutorials.htm]
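The two mappings above can be made concrete with a small data structure. The following is a minimal sketch, not code from the talk: the class name `HIN`, its methods, and the toy nodes and relation names are all hypothetical. It assumes the usual convention that a network counts as heterogeneous when it has more than one entity type or more than one link type.

```python
from dataclasses import dataclass, field


@dataclass
class HIN:
    """Toy heterogeneous information network (hypothetical helper class)."""
    node_type: dict = field(default_factory=dict)  # entity type mapping V -> A
    edges: list = field(default_factory=list)      # (source, target, relation): link type mapping E -> R

    def add_node(self, node, entity_type):
        self.node_type[node] = entity_type

    def add_edge(self, source, target, relation):
        # Both endpoints must already carry an entity type.
        if source not in self.node_type or target not in self.node_type:
            raise KeyError("add both endpoints with add_node first")
        self.edges.append((source, target, relation))

    def entity_types(self):
        return set(self.node_type.values())

    def relation_types(self):
        return {relation for _, _, relation in self.edges}

    def is_heterogeneous(self):
        # Assumed convention: more than one entity type or more than one link type.
        return len(self.entity_types()) > 1 or len(self.relation_types()) > 1


# Made-up bibliographic example.
hin = HIN()
hin.add_node("a1", "Author")
hin.add_node("p1", "Paper")
hin.add_node("v1", "Venue")
hin.add_edge("a1", "p1", "write")
hin.add_edge("p1", "v1", "published_in")
print(hin.entity_types(), hin.relation_types(), hin.is_heterogeneous())
```

A homogeneous network is the special case where every node maps to the same entity type and every edge to the same link type.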
Modern Social Media
• Entities: Person, Check-in location, Articles, etc.
• Relations: Friends, Like, Check-in, etc.
Scholar Networks
• Entities: Paper, Venue, Author, Keyword, etc.
• Relations: Write, Attend, Contain, etc.
[Figure: DBLP Bibliographic Network linking Venue, Paper, and Author nodes. Source: http://web.cs.ucla.edu/~yzsun/tutorials.htm]
Knowledge Graphs
• Example of entities and their relations:
[Figure: knowledge graph fragment with nodes such as Organization, Location, CEO, Founder, and Industry, and relations such as Headquarter, RunBy, FoundedBy, RunBusiness, and Mailing Address]
Bio-medical Network
• Entities: Gene, Patient, Drug, Disease, etc.
• Relations: Drug repurposing, Genotyping, etc.
[Figure: interlinked Drug, Patient, Disease, and Gene networks. Source: http://web.cs.ucla.edu/~yzsun/tutorials.htm]
Problems in HIN
• Link Prediction
– Homogeneous
– Heterogeneous: recommendation
• Entity Typing/Profiling
• Similarity Search
– Meta-Path: Author-Paper-Author
– Example: for the query author Christos Faloutsos, the top-ranked authors (Spiros Papadimitriou, Jimeng Sun, Jia-Yu Pan, Agma M. Traina, Jure Leskovec, Caetano Traina Jr., Hanghang Tong, Deepayan Chakrabarti) are Christos’ students or close collaborators
[Figure: examples of gene-phenotype link prediction, entity typing on tweets (person, location, organization, URL), and the author similarity ranking. Source: http://web.cs.ucla.edu/~yzsun/tutorials.htm]
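To illustrate similarity search along the Author-Paper-Author meta-path, here is a minimal sketch using PathSim (Sun et al., VLDB'11), one commonly used meta-path similarity measure. It scores a pair by twice the number of path instances between the two authors, divided by the sum of each author's path instances back to itself. The authorship data and the function names are made up for illustration, and the text does not state how the ranking on the slide was computed.

```python
# Hypothetical toy authorship data: author -> set of papers written.
writes = {
    "alice": {"p1", "p2", "p3"},
    "bob": {"p1", "p2"},
    "carol": {"p3"},
    "dave": {"p4"},
}


def apa_path_count(x, y):
    """Number of Author-Paper-Author path instances between x and y."""
    # Each shared paper p yields exactly one path x - p - y.
    return len(writes[x] & writes[y])


def pathsim(x, y):
    """PathSim score for the symmetric meta-path Author-Paper-Author."""
    denom = apa_path_count(x, x) + apa_path_count(y, y)
    return 2 * apa_path_count(x, y) / denom if denom else 0.0


def rank_similar(query):
    """Rank all other authors by PathSim to the query author, best first."""
    others = [a for a in writes if a != query]
    scored = [(a, pathsim(query, a)) for a in others]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


print(rank_similar("alice"))
# [('bob', 0.8), ('carol', 0.5), ('dave', 0.0)]
```

The normalization by self-path counts favors authors with comparable publication volume, so a prolific author with one shared paper does not outrank a close collaborator.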
Explicit vs. Implicit “Flat” Semantics
• Explicit Semantic Analysis [Gabrilovich and Markovitch ’06, ’07, ’09]
– Represent text as bag of Wikipedia titles
– Example: “Barack Obama” maps to titles such as “Barack Obama”, “Timeline of the presidency of Barack Obama (2009)”, “Family of Barack Obama”, “Barack Obama citizenship conspiracy theories”, and “Barack Obama presidential primary campaign 2008”
• Probabilistic Conceptualization [Song et al., ’11, ’15]
– Given “China, India, Russia, Brazil”, retrieve concepts from Probase [Wu et al., SIGMOD’12]
[Figure: bar chart of concept probabilities for the query, with concepts such as emerging market, emerging economy, emerging country, emerging power, and BRIC country]
Explicit vs. Implicit “Flat” Semantics
• Implicit Semantic Analysis
– SVD [Deerwester et al., JASIS’90]
– PLSA [Hofmann, NIPS’99]
– LDA [Blei et al., JMLR’03]
– Word2vec [Mikolov et al., NIPS’13]
– …
[Figure: neural language model diagram (context “the cat sits on the mat”, projection layer, hidden layer, softmax classifier) and word embedding plots showing Male-Female (king, queen), Verb tense (walked, walking, swam, swimming), and Country-Capital (e.g., Russia-Moscow, Japan-Tokyo, China-Beijing) relations]
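The Male-Female, Verb tense, and Country-Capital plots associated with Word2vec are usually read as vector offsets: the offset between two related words is roughly the same across word pairs. The sketch below shows that arithmetic on hand-made 2-D vectors. These are not trained embeddings, and the vocabulary, coordinates, and function names are all assumptions chosen for illustration.

```python
import math

# Hand-made toy vectors (dimension 0 ~ "royalty", dimension 1 ~ "gender").
# These are illustrative values, not learned embeddings.
vectors = {
    "man": (0.0, 1.0),
    "woman": (0.0, -1.0),
    "king": (1.0, 1.0),
    "queen": (1.0, -1.0),
    "boy": (0.2, 1.0),
    "girl": (0.2, -1.0),
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def analogy(a, b, c):
    """Return the word d such that a : b is approximately c : d."""
    # Offset arithmetic: d is the word closest to vec(b) - vec(a) + vec(c).
    target = tuple(vb - va + vc
                   for va, vb, vc in zip(vectors[a], vectors[b], vectors[c]))
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


print(analogy("man", "king", "woman"))  # queen
```

With trained embeddings the same nearest-neighbor search runs over the full vocabulary, and the match is approximate rather than exact.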