tract sufficient feature information, which reflects the problem of data sparsity: they fit the original matrix by feature extraction based only on the rating data, and the rating data are extremely sparse. If we could obtain more ratings, we would surely improve the quality of the fitting process. From this standpoint, we propose a better collaborative filtering approach that exploits additional knowledge from tags as a supplement to ratings.

Tags are simple, ad hoc labels assigned by users to describe or annotate any kind of resource for future retrieval. Their flexibility means they can capture a user's perspective and preferences with ease. Most recent work focuses on tag recommendation, in which the objects to recommend are the tags themselves [18, 20, 22, 27]. In the case of item recommendation, users expect specific suggestions about which items might be interesting. There are only a limited number of solutions for this situation, and most of them do not generalize to different data resources because they ignore the abundant rating data [11, 25]. In this paper, we offer a novel personalized recommendation method that handles the case where both ratings and tags are available.

Our approach still shares the main idea of the classic neighborhood method, but it differs in where the neighbors are found. In traditional CF approaches, neighbors are usually found from the ratings [1]. We do not find neighbors directly in this way. First we exploit the latent topic grouping information hidden in the tags; then we find groups of users interested in similar topics and collections of items under similar topics. To predict a user's rating for an item, we consult the ratings of both the user's neighbors and the item's neighbors by employing a neighborhood method. By taking both tag neighbors and rating neighbors into account, our method outperforms the most popular CF approaches.

The structure of the rest of the paper is as follows. In Section 2 we introduce the background and related work. In Section 3 we explain our improved collaborative filtering method in detail. In Section 4 we give two toy examples and compare our method with NMF, PMF and SVD on a popular movie dataset.
Finally, we conclude the paper and outline future work.

PRELIMINARIES
Rating prediction is one of the most popular means of evaluating the performance of collaborative filtering algorithms. From the rating data of most collaborative filtering datasets, we can obtain an N × M rating matrix R containing N users and M items. Matrix R is defined as

    r_{ij} = \begin{cases} \text{user}_i\text{'s rating for item}_j, & \text{if user}_i \text{ has rated item}_j \\ 0, & \text{otherwise,} \end{cases}

where i ∈ N+, j ∈ M+ and r_ij ∈ [1, R_max]. The usual evaluation process is hold-out cross validation [5]: a certain proportion of the ratings are hidden for testing and the rest are used for training. The measures of evaluation include complexity and accuracy. Nevertheless, accuracy is much more important because most collaborative filtering approaches run offline; it is therefore the focus of this paper.

Naive Estimates
One of the most instinctive prediction methods is to compute mean values. Taking the user's and the item's average biases into account, we get the naive estimate [8]:

    b_{ij} = \mu + b_i + b_j,    (1)

where b_ij denotes the predicted rating of user_i on item_j; µ is the global average rating; and b_i and b_j denote user_i's and item_j's average biases, respectively.

This naive method is effective and scalable, but it does not take the interaction between users into account. Every user's rating for an item influences other users' opinions of that item. This interdependence between users forms a social network [24] that connects all users together. Personalized recommendations are not delivered in isolation, but in the context of this social network [14]. The neighborhood method is one of the most effective ways to analyze this context.

Neighborhood Method
The aim of the neighborhood method [2] is to find the users who give similar ratings and the items which receive similar ratings. Ratings that have been similar in the past suggest that future ratings will be similar as well; this is the basic assumption of collaborative filtering. Because the neighborhood method extracts from the neighbors the clues that indicate the potential ratings, it produces better predictions than the naive estimate.

The model of the neighborhood method unifying the item-based and user-based collaborative filtering approaches is

    \hat{r}_{ij} = b_{ij} + \sum_{h \in S^k(j;i)} \theta^i_{hj} (r_{ih} - b_{ih}) + \sum_{h \in S^k(i;j)} \theta^j_{ih} (r_{hj} - b_{hj}),    (2)

where \hat{r}_{ij} is the predicted rating; b_ij refers to the naive estimate's prediction; S^k(j;i) denotes the set of the k nearest neighboring items of item_j that user_i has rated; S^k(i;j) denotes the set of the k nearest neighboring users of user_i who have rated item_j; and θ reflects the different weights of the corresponding ratings. There are several representations for the weights. The cosine similarity is one of the most effective measures of these weights:

    \text{Cosine Similarity} = \frac{a \cdot b}{\|a\|_2 \|b\|_2},

where a and b are vectors of the same dimension.
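To make the naive estimate of Equation (1) concrete, the following Python/NumPy sketch (our own illustration, not the authors' implementation) computes µ, b_i and b_j from a dense rating matrix in which 0 marks a missing rating. The function name naive_estimates and the simple definition of each bias as the average deviation from the global mean are assumptions made for illustration.

```python
import numpy as np

def naive_estimates(R):
    """Naive baseline of Eq. (1): b_ij = mu + b_i + b_j.

    Assumes R is a dense user-by-item matrix where 0 marks a missing rating,
    and defines each bias as the average deviation from the global mean.
    """
    mask = R > 0
    mu = R[mask].mean()                      # global average over observed ratings
    user_counts = mask.sum(axis=1)           # number of ratings per user
    item_counts = mask.sum(axis=0)           # number of ratings per item
    # b_i: average deviation of user_i's observed ratings from mu (0 if no ratings)
    b_user = (R.sum(axis=1) - mu * user_counts) / np.maximum(user_counts, 1)
    # b_j: average deviation of item_j's observed ratings from mu (0 if no ratings)
    b_item = (R.sum(axis=0) - mu * item_counts) / np.maximum(item_counts, 1)
    return mu + b_user[:, None] + b_item[None, :]   # full matrix of b_ij

# toy 4-user x 3-item matrix (0 = unrated)
R = np.array([[5, 3, 0],
              [4, 0, 1],
              [0, 2, 2],
              [5, 4, 0]], dtype=float)
print(naive_estimates(R).round(2))
```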
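The unified neighborhood model of Equation (2) can be sketched in the same way. The text above does not spell out how the weights θ are normalized or how the neighbor sets are restricted, so this illustration assumes cosine similarity on raw rating rows and columns, restricts item neighbors to items the user has rated and user neighbors to users who have rated the item, and normalizes the similarities over the selected neighbors; predict and cosine_sim are hypothetical names.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity a.b / (||a||_2 * ||b||_2); 0 if either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b) / denom if denom > 0 else 0.0

def predict(R, B, i, j, k=2):
    """Eq. (2): naive estimate plus item-based and user-based neighborhood corrections.

    R is a dense user-by-item matrix (0 = missing), B holds the naive estimates b_ij,
    and the weights theta are cosine similarities normalized over the chosen neighbors.
    """
    n_users, n_items = R.shape

    # S^k(j;i): the k items most similar to item_j among those user_i has rated
    rated = [h for h in range(n_items) if h != j and R[i, h] > 0]
    isim = {h: cosine_sim(R[:, j], R[:, h]) for h in rated}
    inbrs = sorted(isim, key=isim.get, reverse=True)[:k]
    inorm = sum(abs(isim[h]) for h in inbrs) or 1.0
    item_term = sum(isim[h] / inorm * (R[i, h] - B[i, h]) for h in inbrs)

    # S^k(i;j): the k users most similar to user_i among those who have rated item_j
    raters = [h for h in range(n_users) if h != i and R[h, j] > 0]
    usim = {h: cosine_sim(R[i, :], R[h, :]) for h in raters}
    unbrs = sorted(usim, key=usim.get, reverse=True)[:k]
    unorm = sum(abs(usim[h]) for h in unbrs) or 1.0
    user_term = sum(usim[h] / unorm * (R[h, j] - B[h, j]) for h in unbrs)

    return B[i, j] + item_term + user_term

# toy data: the same 4-user x 3-item matrix; B recomputes the naive estimate of Eq. (1)
R = np.array([[5, 3, 0],
              [4, 0, 1],
              [0, 2, 2],
              [5, 4, 0]], dtype=float)
mask = R > 0
mu = R[mask].mean()
b_u = (R.sum(1) - mu * mask.sum(1)) / np.maximum(mask.sum(1), 1)
b_i = (R.sum(0) - mu * mask.sum(0)) / np.maximum(mask.sum(0), 1)
B = mu + b_u[:, None] + b_i[None, :]

print(round(predict(R, B, i=0, j=2), 2))   # predicted rating of user 0 for item 2
```

Note that when a user or item has too few rated neighbors the correction sums shrink toward zero and the prediction falls back to the naive estimate, which is one reason the baseline of Equation (1) matters on sparse data.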