WANG AND LI: RELATIONAL COLLABORATIVE TOPIC REGRESSION FOR RECOMMENDER SYSTEMS 1345

Matrix factorization (MF) [25], [35] and its extensions, such as probabilistic matrix factorization (PMF) [35], are the most representative model-based methods, and in practice they have achieved promising performance. The basic idea of MF is to use latent vectors in a low-dimensional space to represent the users and items. More specifically, user $i$ is represented by a latent vector $u_i \in \mathbb{R}^K$ of dimensionality $K$, and item $j$ is represented by a latent vector $v_j \in \mathbb{R}^K$. The predicted feedback on item $j$ given by user $i$ is computed as follows:

$$\hat{r}_{ij} = u_i^T v_j.$$

If we use two latent matrices $U = (u_i)_{i=1}^{I}$ and $V = (v_j)_{j=1}^{J}$ to denote the latent vectors of all the users and items, respectively, MF finds the optimal $U$ and $V$ by optimizing the following objective function:

$$\min_{U,V} \sum_{i=1}^{I} \sum_{j=1}^{J} \left( r_{ij} - u_i^T v_j \right)^2 + \lambda_u \sum_{i=1}^{I} \| u_i \|^2 + \lambda_v \sum_{j=1}^{J} \| v_j \|^2, \qquad (1)$$

where $\| \cdot \|$ denotes the Frobenius ($\ell_2$) norm of a vector, and $\lambda_u$ and $\lambda_v$ are regularization parameters for controlling model complexity.
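As a concrete illustration, the objective in (1) can be minimized by alternating least squares: fixing $V$, each $u_i$ has a closed-form ridge-regression solution, and symmetrically for each $v_j$. The following is a minimal sketch; the function name, hyperparameter values, and dense-matrix treatment of $R$ are illustrative choices for this example, not details from the paper.

```python
import numpy as np

def mf_als(R, K=10, lam_u=0.1, lam_v=0.1, n_iters=20, seed=0):
    """Minimize objective (1) by alternating least squares.

    Fixing V, each row u_i solves (V^T V + lam_u I) u_i = V^T r_i;
    fixing U, each row v_j solves (U^T U + lam_v I) v_j = U^T r_j.
    """
    rng = np.random.default_rng(seed)
    I, J = R.shape
    U = 0.1 * rng.standard_normal((I, K))
    V = 0.1 * rng.standard_normal((J, K))
    eye = np.eye(K)
    for _ in range(n_iters):
        # Solve for all u_i at once: columns of the solve are the u_i.
        U = np.linalg.solve(V.T @ V + lam_u * eye, V.T @ R.T).T
        # Solve for all v_j at once.
        V = np.linalg.solve(U.T @ U + lam_v * eye, U.T @ R).T
    return U, V

# Predictions are then r_hat[i, j] = u_i^T v_j, i.e., U @ V.T.
```

Because (1) sums over every $(i, j)$ pair, the per-row subproblems share the same Gram matrix, which is what makes the batched `np.linalg.solve` formulation possible.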
The objective function in (1) corresponds to the maximum a posteriori (MAP) estimate of the PMF model in [35]. In [45], a generalization of the PMF model is proposed:

$$u_i \sim \mathcal{N}(0, \lambda_u^{-1} I_K), \qquad v_j \sim \mathcal{N}(0, \lambda_v^{-1} I_K), \qquad r_{ij} \sim \mathcal{N}(u_i^T v_j, c_{ij}^{-1}), \qquad (2)$$

where $\mathcal{N}(\cdot)$ denotes the normal distribution, $I_K$ is an identity matrix with $K$ rows and columns, and $c_{ij}$ is defined as follows:

$$c_{ij} = \begin{cases} a, & \text{if } r_{ij} = 1, \\ b, & \text{if } r_{ij} = 0, \end{cases}$$

with $a$ and $b$ being tuning parameters and $a > b > 0$. Please note that all the $(i, j)$ pairs with feedback 0 in the training set are used for training in this paper. We can also sample part of them for training in cases where the number of zeros in the feedback matrix is too large.

MF methods have achieved promising performance in practice. However, they also suffer from the sparsity problem. Furthermore, as stated in [45], it is not easy for MF methods to perform out-of-matrix prediction.

2.3 Collaborative Topic Regression

Collaborative topic regression (CTR) [45] is proposed to recommend documents (papers) to users by seamlessly integrating both the feedback matrix and the item (document) content information into the same model, which can address the problems faced by MF-based CF. By combining MF and latent Dirichlet allocation (LDA) [9], CTR achieves better prediction performance than MF-based CF, with more interpretable results. Moreover, with the item content information, CTR can predict feedback for out-of-matrix items. The graphical model of CTR is shown in Fig. 1.

Fig. 1. The graphical model of collaborative topic regression.

CTR introduces an item latent offset $\epsilon_j$ between the topic proportions $\theta_j$ in LDA and the item latent vector $v_j$ in CF. The offset can be explained by the gap between what the article is about (represented by $\theta_j$) and what the users think of it (represented by $v_j$), as discussed in [45]. If we use $\beta = \beta_{1:K}$ to denote the $K$ topics, the generative process of CTR can be listed as follows:

1) Draw a user latent vector for each user $i$: $u_i \sim \mathcal{N}(0, \lambda_u^{-1} I_K)$.
2) For each item $j$,
   a) Draw topic proportions $\theta_j \sim \text{Dirichlet}(\alpha)$.
   b) Draw item latent offset $\epsilon_j \sim \mathcal{N}(0, \lambda_v^{-1} I_K)$, then set the item latent vector to be $v_j = \epsilon_j + \theta_j$.
   c) For each word $w_{jn}$ in the document (item) $w_j$,
      i) Draw topic assignment $z_{jn} \sim \text{Mult}(\theta_j)$.
      ii) Draw word $w_{jn} \sim \text{Mult}(\beta_{z_{jn}})$.
3) Draw the feedback $r_{ij}$ for each user-item pair $(i, j)$: $r_{ij} \sim \mathcal{N}(u_i^T v_j, c_{ij}^{-1})$.

As mentioned in [45], the key to CTR lies in the item latent offset $\epsilon_j$, which keeps the item latent vector $v_j$ close to the topic proportions $\theta_j$ while allowing it to diverge when necessary. The parameter $\lambda_v$ controls how close $v_j$ is to $\theta_j$. Experiments on scientific article recommendation from CiteULike show that CTR can outperform MF-based CF methods.

3 RELATIONAL COLLABORATIVE TOPIC REGRESSION

In this section, we describe the details of our proposed model, called relational collaborative topic regression (RCTR). Besides the feedback and item content information modeled by CTR, RCTR can also model the relations among the items, which are informative for recommendation.

3.1 Model Formulation

To better illustrate the graphical model of RCTR, we adopt a presentation different from that in Fig. 1, which is adopted by the authors of CTR [45]. The graphical model of RCTR is shown in Fig. 2, in which the component in the dashed rectangle is what differentiates RCTR from CTR.

The generative process of RCTR is as follows:

1) Draw a user latent vector for each user $i$: $u_i \sim \mathcal{N}(0, \lambda_u^{-1} I_K)$.
2) For each item $j$,
   a) Draw topic proportions $\theta_j \sim \text{Dirichlet}(\alpha)$.
   b) Draw item latent offset $\epsilon_j \sim \mathcal{N}(0, \lambda_v^{-1} I_K)$, then set the item latent vector to be $v_j = \epsilon_j + \theta_j$.
   c) Draw item relational offset $\tau_j \sim \mathcal{N}(0, \lambda_r^{-1} I_K)$, then set the item relational vector to be $s_j = \tau_j + v_j$.
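The CTR generative process above can be sketched as ancestral sampling. This is a toy illustration: the corpus size, vocabulary size, and hyperparameter values are made up for the example, and, since the confidence $c_{ij}$ in the paper depends on whether $r_{ij}$ is observed, the feedback step here simply samples with precision $a$ for all pairs.

```python
import numpy as np

def sample_ctr(n_users=5, n_items=4, n_words=50, vocab=30, K=3,
               lam_u=1.0, lam_v=1.0, a=1.0, alpha=0.1, seed=0):
    """Ancestrally sample users, items, documents, and feedback from CTR."""
    rng = np.random.default_rng(seed)
    beta = rng.dirichlet(np.full(vocab, 0.1), size=K)   # K topics over the vocabulary
    U = rng.normal(0.0, lam_u ** -0.5, (n_users, K))    # step 1: user latent vectors
    docs, V = [], np.empty((n_items, K))
    for j in range(n_items):
        theta = rng.dirichlet(np.full(K, alpha))        # step 2a: topic proportions
        eps = rng.normal(0.0, lam_v ** -0.5, K)         # step 2b: item latent offset
        V[j] = eps + theta                              #          v_j = eps_j + theta_j
        z = rng.choice(K, size=n_words, p=theta)        # step 2c-i: topic assignments
        docs.append([rng.choice(vocab, p=beta[zn]) for zn in z])  # step 2c-ii: words
    # step 3: feedback around u_i^T v_j (precision a used for every pair here)
    R = rng.normal(U @ V.T, a ** -0.5)
    return U, V, docs, R
```

The same sketch extends to RCTR's step 2c by drawing one more Gaussian offset per item and setting $s_j = \tau_j + v_j$.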