• During the online training of OEM, we can also uncover the evolution of topic features for each paper and the propagation of topic features between pairs of papers.

• Extensive experiments on two real-world citation networks are performed to demonstrate the effectiveness of our novel model.

2 Dynamic Egocentric Model

In this section, we briefly review the DEM [Vu et al., 2011b] on which we base our work. For ease of understanding, we use the same notations as those in [Vu et al., 2011b].

Let $n$ denote the total number of nodes (papers) in the network. DEM tries to model a dynamic (citation) network by placing a counting process $N_i(t)$ on each node $i$ ($i = 1, 2, \ldots, n$), where $N_i(t)$ denotes the cumulative number of events associated with node $i$ until time $t$. The definition of events depends on context. For example, an event corresponds to a citation in citation networks.

Although a continuous-time model can be estimated by maximizing the full likelihood of the counting process, for citation networks it is more practical to estimate the parameters associated with the time-dependent statistics at event times by maximizing the partial likelihood.
Then the DEM tries to maximize the following likelihood of the whole network:

$$L(\beta) = \prod_{e=1}^{m} \frac{\exp\big(\beta^{\top} s_{i_e}(t_e)\big)}{\sum_{i=1}^{n} Y_i(t_e)\exp\big(\beta^{\top} s_i(t_e)\big)}, \qquad (1)$$

where $m$ is the total number of citation events, $e$ is the index of each citation event, $i_e$ denotes the paper cited in event $e$, $t_e$ denotes the time of event $e$, $Y_i(t)$ is 1 if node $i$ already exists at time $t$ and 0 otherwise, $s_i(t_e)$ denotes the feature vector of node $i$ at the time $t_e$, and $\beta$ is a vector of parameters to learn.

The features in $s_i(t_e)$ can be divided into two types. One type is called link features (statistics), and the other type is called topic features. In [Vu et al., 2011b], eight link features, including three preferential attachment statistics, three triangle statistics and two out-path statistics, are extracted for each node. Fifty topic features are extracted by performing Latent Dirichlet Allocation (LDA) [Blei et al., 2003] on the abstracts of the papers. More specifically, assuming the arriving paper is $i$ at the time $t_e$, we can compute the topic features of any existing paper $j$ as follows:

$$s_j^{\mathrm{LDA}}(t_e) = \theta_i \circ \theta_j,$$

where $\theta_i$ denotes the topic proportion of paper $i$, and $\circ$ denotes element-wise multiplication.

Hence, $s_i(t_e)$ is a vector of 58 features, with the first 8 features being link features and the last 50 features being topic features. Correspondingly, $\beta$ is a vector of length 58. More details about the features can be found in [Vu et al., 2011b].

3 Online Egocentric Model

During the prediction process for evolving networks, the link features of nodes will be dynamically updated in DEM. However, both the learned parameters ($\beta$) and topic features ($\theta_i$) of DEM will not change with the evolution of networks, which will cause the accuracy of DEM to decrease over time. In this section, we present our online egocentric model (OEM) to solve the problems faced by DEM. The basic idea is to adaptively update the parameters and topic features after some new events are observed.
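As a concrete illustration of (1), the negative log partial likelihood can be computed directly from the event-time feature arrays. The sketch below is a minimal NumPy version under assumed array shapes (the feature tensor `s`, at-risk matrix `Y`, and index array `cited` are hypothetical names, not from the paper):

```python
import numpy as np

def neg_log_partial_likelihood(beta, s, Y, cited):
    """Negative log of the partial likelihood in Eq. (1).

    beta  : (d,)      parameter vector
    s     : (m, n, d) feature vectors s_i(t_e) of every node i at each event time t_e
    Y     : (m, n)    at-risk indicators, Y[e, i] = 1 if node i exists at t_e
    cited : (m,)      index i_e of the paper cited in event e
    """
    scores = s @ beta                                   # (m, n) linear predictors beta^T s_i(t_e)
    log_den = np.log((Y * np.exp(scores)).sum(axis=1))  # log of the risk-set sum per event
    log_num = scores[np.arange(len(cited)), cited]      # beta^T s_{i_e}(t_e) for the cited paper
    return -(log_num - log_den).sum()
```

Maximizing the partial likelihood is equivalent to minimizing this quantity, which is the form used in the objective (2) below.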
Although we can learn the whole LDA model from the collection of papers, it will be very time-consuming in general even if we adopt the online LDA model [Hoffman et al., 2010]. Hence, in this paper we just learn the topic proportions $\theta$ with the topics fixed. This is reasonable because the main topics in the citation networks are relatively stable, although the topic proportions for some papers will change over time. We only need to update the whole topics after a long time period. From our experiments, we find that good performance can still be achieved by only updating the topic proportions.

Hence, our OEM tries to minimize the following objective function after observing some new events:

$$\begin{aligned} \text{minimize} \quad & -\log L(\beta, \omega) + \lambda \sum_{k=1}^{n} \|\omega_k - \theta_k\|_2^2 \qquad (2)\\ \text{subject to} \quad & \omega_k \succeq 0, \; \mathbf{1}^{\top}\omega_k = 1, \end{aligned}$$

where $\omega_k$ is the new topic proportion vector of node $k$ that needs to be learned, $\theta_k$ is the current topic proportion vector of node $k$, and $\omega = \{\omega_k\}_{k=1}^{n}$. $L(\beta, \omega)$ has the same definition as $L(\beta)$ in (1), except that both $\beta$ and the topic proportions are treated as variables². $\omega_k \succeq 0$ denotes that each element in $\omega_k$ is non-negative, and $\mathbf{1}$ is a vector of all 1s; the constraints guarantee that all elements in $\omega_k$ are non-negative and sum to 1. $\lambda$ is a hyperparameter to control the tradeoff between the two terms.

When a new event or a set of new events is observed, the second term in (2) will guarantee that the updated topic proportion $\omega_k$ will not be too far away from the current topic proportion $\theta_k$. Furthermore, we use the current $\beta$ as initialization to get the updated $\beta$. Hence, by effectively using the information of the existing events, we successfully get an online learning algorithm.

It is easy to see that the optimization problem in (2) is not jointly convex in $(\beta, \omega)$. But we can prove that the objective function is convex in either $\beta$ or $\omega$ with the other variable fixed. In this paper, we design an alternating projection algorithm to find the solutions.
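The constraints in (2) restrict each $\omega_k$ to the probability simplex. A standard way to enforce them inside a projection-based scheme is Euclidean projection onto the simplex via the sort-based routine below; this is a common general-purpose sketch, not necessarily the exact subroutine used in the paper:

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                       # sort entries in descending order
    css = np.cumsum(u) - 1.0                   # cumulative sums minus the target mass
    # largest index rho with u[rho] - css[rho]/(rho+1) > 0
    rho = np.nonzero(u - css / np.arange(1, len(v) + 1) > 0)[0][-1]
    theta = css[rho] / (rho + 1.0)             # optimal shift
    return np.maximum(v - theta, 0.0)          # clip shifted vector at zero
```

For a vector already on the simplex the projection is the identity, so a projected update leaves feasible iterates untouched.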
More specifically, each time we fix one variable and then update the other one. The procedure is briefly outlined as follows:

• online $\beta$ step: Fix $\omega$, and apply Newton's method to update the parameter $\beta$ by using the current $\beta$ for initialization;

• online topic step: Fix $\beta$, and minimize the problem in (2) to get the updated topic proportions $\omega_k$ based on the current topic proportions $\theta_k$.

The above procedure will be repeated for several iterations until some termination condition is satisfied. We can prove that the learning algorithm is convergent.

² Please note that $L(\beta, \omega)$ is different from $L(\beta)$. In $L(\beta)$, only $\beta$ is a variable and $\omega$ is a constant (fixed value).
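The two alternating steps above can be sketched as a short driver loop. The callables `newton_step` and `topic_step` are hypothetical placeholders standing in for the paper's actual Newton update on $\beta$ and the projected minimization of (2) over $\omega$:

```python
import numpy as np

def alternating_update(beta, omega, newton_step, topic_step,
                       max_iters=20, tol=1e-6):
    """Alternate between the online beta step and the online topic step.

    newton_step(beta, omega) -> new beta   (Newton's method, omega fixed)
    topic_step(omega, beta)  -> new omega  (minimize (2) over omega, beta fixed)
    omega is a dict mapping node index k to its topic proportion vector.
    """
    for _ in range(max_iters):
        beta_new = newton_step(beta, omega)    # online beta step
        omega_new = topic_step(omega, beta_new)  # online topic step
        # terminate once neither variable moves more than tol
        if (np.linalg.norm(beta_new - beta) < tol
                and max(np.linalg.norm(omega_new[k] - omega[k])
                        for k in omega) < tol):
            return beta_new, omega_new
        beta, omega = beta_new, omega_new
    return beta, omega
```

The termination test here (change below a tolerance in both variables) is one common choice; the paper only states that some termination condition is used and that the algorithm converges.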