we have ∂g ∂Ud∗ = WU T d∗ − x, (16) w

点击下载：《人工智能、机器学习与大数据》课程教学资源（参考文献）agiCoFi - Tag informed collaborative filtering

正在加载图片...

we have 1.How does TagiCoFi perform in real applications when ag=WUN.-x. compared with state-of-the-art methods? (16) aUa. 2.How effective are the different user similarity mea- where W is an Nx N diagonal matrix with W=YV, sures? and x is an N x 1 vector with i=YijVa(Rij- 3.How does tagging information improve collaborative UnV.;+UaiVa). filtering? Then,we can get 4.How does the number of latent features used affect the ag+00: performance of TagiCoFi? aUa∂Uu* =(W+aI+BC)U2.-x. 5.Does TagiCoFi work for users without any training (17) ratings? The learning process of V is different from that of U These questions are answered separately:question 1 in because the columns (not rows)of V are decoupled.Hence Section 4.3,questions 2-4 in Section 4.4 as three different we apply gradient descent to optimize one column of V at subsubsections,and question 5 in Section 4.5. a time with the other columns fixed.The gradient can be computed as follows: 4.1 Data Set af We evaluate our algorithm on the MovieLens datasets V对 =(al+YUUa)V-】 Yig RijU.i.(18) which,as far as we know,is the only publicly available dataset containing both tagging and rating information. The overall learning procedure of TagiCoFi is summarized We first prune the dataset for our analysis.For the tag- in Algorithm 1 below. ging information,we only keep those tags which are added on at least three distinct movies.As for the users,we only keep those users who used at least 3 distinct tags in their Algorithm 1 Learning procedure of TagiCoFi tagging history.For movies,we only keep those movies that 1:INPUT: are annotated by at least 3 distinct tags.It should be em- R-rating matrix phasized that our model still works under situations where Z-tagging matrix there are users or movies with rating information only but D-number of latent features no tagging information.For those users without any tag- W-number of iterations ging information,the tag-based similarities between them 6-step size for gradient descent and the other users are 0,which means that the last term 2:Compute user similarity matrix S based on Z in (11)will have no effect on those users.Subsequently,the 3:Compute Laplacian matrix C based on S recommendation result for those users without tagging in- 4:Initialize U,Vo formation only depends on the MF procedure of the rating 5:for w=1 to W do matrix.which is similar to the result of PMF.As the focus 6: for d=1 to D do of this paper is on evaluating the effectiveness of tagging 7: U2-U -6. information in addition to rating information,we only keep g end for the users who have both rating history and tagging history 9: for j=1 to M do in the original rating records. 10: V号-V写-i We obtain two kinds of records after pruning,the tag- 11: end for ging records and the rating records.The tagging records in- 12:end for clude 13,431 tagging applications contributed by 757 users 13:return UW,Vw with 2,271 distinct tags.Based on the tagging records,we construct the tagging matrix Z.whose elements are defined by Equation (1)in Section 2.1.The rating records include 3.4 Complexity Analysis 167,474 ratings rated by 757 users (the same as those in the tagging records)on 9,485 movies,and based on these rating The main computation of TagiCoFi is to evaluate the gra records we construct the rating matrix R.More statistics dients of the objective function with respect to the latent about the rating matrix R are shown in Table 1,where the variables and to compute the user similarities.The time numbers behind+denote the standard deviations. complexity of computing the gradient is(ND))and that ofis(NMD).The time complexity of comput Table 1:Description of rating data ing the user similarities and C is O(N2K).Hence,the time Statistics Users Movies complexity of the entire alternating gradient descent proce- Min.of ratings 20 dure is O(W(N2D+NMD)+N2K). Max.of ratings 2.634 625 Mean#of ratings441.95士420.8835.27士67.30 4.EXPERIMENTAL EVALUATION We have conducted several experiments to compare the performance of our method with that of other methods 5http://www.grouplens.org/node/73 Through the experiments,we have tried to answer the fol- 6If user i adds tag k on item j,we say this is a tagging lowing questions: application.we have ∂g ∂Ud∗ = WU T d∗ − x, (16) where W is an N×N diagonal matrix with Wii = PM j=1 YijV 2 dj , and x is an N × 1 vector with xi = PM j=1 YijVdj (Rij − UT ∗iV∗j + UdiVdj ). Then, we can get ∂f ∂Ud∗ = ∂g ∂Ud∗ + ∂h ∂Ud∗ = (W + αI + βL)U T d∗ − x. (17) The learning process of V is different from that of U, because the columns (not rows) of V are decoupled. Hence, we apply gradient descent to optimize one column of V at a time with the other columns fixed. The gradient can be computed as follows: ∂f ∂V∗j = “ αI + XN i=1 YijU∗iU T ∗i ” V∗j − XN i=1 YijRijU∗i. (18) The overall learning procedure of TagiCoFi is summarized in Algorithm 1 below. Algorithm 1 Learning procedure of TagiCoFi 1: INPUT: R – rating matrix Z – tagging matrix D – number of latent features W – number of iterations δ – step size for gradient descent 2: Compute user similarity matrix S based on Z 3: Compute Laplacian matrix L based on S 4: Initialize U0 , V0 5: for w = 1 to W do 6: for d = 1 to D do 7: Uw d∗ ← Uw−1 d∗ − δ ∂f ∂Ud∗ 8: end for 9: for j = 1 to M do 10: Vw ∗j ← Vw−1 ∗j − δ ∂f ∂V∗j 11: end for 12: end for 13: return UW , VW 3.4 Complexity Analysis The main computation of TagiCoFi is to evaluate the gradients of the objective function with respect to the latent variables and to compute the user similarities. The time complexity of computing the gradient ∂f ∂Ud∗ is O(N 2D)) and that of ∂f ∂V∗j is O(NMD). The time complexity of computing the user similarities and L is O(N 2K). Hence, the time complexity of the entire alternating gradient descent procedure is O(W(N 2D + NMD) + N 2K). 4. EXPERIMENTAL EVALUATION We have conducted several experiments to compare the performance of our method with that of other methods. Through the experiments, we have tried to answer the following questions: 1. How does TagiCoFi perform in real applications when compared with state-of-the-art methods? 2. How effective are the different user similarity measures? 3. How does tagging information improve collaborative filtering? 4. How does the number of latent features used affect the performance of TagiCoFi? 5. Does TagiCoFi work for users without any training ratings? These questions are answered separately: question 1 in Section 4.3, questions 2–4 in Section 4.4 as three different subsubsections, and question 5 in Section 4.5. 4.1 Data Set We evaluate our algorithm on the MovieLens dataset5 , which, as far as we know, is the only publicly available dataset containing both tagging and rating information. We first prune the dataset for our analysis. For the tagging information, we only keep those tags which are added on at least three distinct movies. As for the users, we only keep those users who used at least 3 distinct tags in their tagging history. For movies, we only keep those movies that are annotated by at least 3 distinct tags. It should be emphasized that our model still works under situations where there are users or movies with rating information only but no tagging information. For those users without any tagging information, the tag-based similarities between them and the other users are 0, which means that the last term in (11) will have no effect on those users. Subsequently, the recommendation result for those users without tagging information only depends on the MF procedure of the rating matrix, which is similar to the result of PMF. As the focus of this paper is on evaluating the effectiveness of tagging information in addition to rating information, we only keep the users who have both rating history and tagging history in the original rating records. We obtain two kinds of records after pruning, the tagging records and the rating records. The tagging records include 13,431 tagging applications6 contributed by 757 users with 2,271 distinct tags. Based on the tagging records, we construct the tagging matrix Z, whose elements are defined by Equation (1) in Section 2.1. The rating records include 167,474 ratings rated by 757 users (the same as those in the tagging records) on 9,485 movies, and based on these rating records we construct the rating matrix R. More statistics about the rating matrix R are shown in Table 1, where the numbers behind ± denote the standard deviations. Table 1: Description of rating data Statistics Users Movies Min. # of ratings 20 1 Max. # of ratings 2,634 625 Mean # of ratings 441.95 ± 420.88 35.27 ± 67.30 5 http://www.grouplens.org/node/73 6 If user i adds tag k on item j, we say this is a tagging application

<<向上翻页向下翻页>>

点击下载：《人工智能、机器学习与大数据》课程教学资源（参考文献）agiCoFi - Tag informed collaborative filtering