The neighborhood method finds $\mathrm{item}_j$'s $k$ nearest neighbors ($k$-NN). These neighbors infer the potential value of $r_{ij}$ to different degrees according to their similarity with $\mathrm{item}_j$. Although several different similarity measures are employed to compute the similarity between items, that similarity is represented by the distance between their rating vectors. The similarity of items that have fewer common raters is structurally lower. If high-level features are extracted to represent the user and the item, the similarity can be measured better this way. Matrix factorization methods learn this lesson.
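To make the neighborhood scheme concrete, the following is a minimal sketch of item-based $k$-NN prediction, assuming cosine similarity between rating columns and a zero-filled rating matrix; the function names and the toy matrix are illustrative assumptions, not part of the methods cited here.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two item rating vectors (0 = unrated)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return a @ b / denom if denom > 0 else 0.0

def predict_knn(R, u, j, k=5):
    """Predict r_uj from user u's ratings on the k items most similar to item j.

    R is an (n_users x m_items) matrix with 0 for missing ratings.
    """
    sims = np.array([cosine_sim(R[:, j], R[:, i]) for i in range(R.shape[1])])
    sims[j] = -1.0                      # exclude item j itself
    rated = np.where(R[u] > 0)[0]       # items user u has actually rated
    # keep the k nearest rated neighbors of item j
    neighbors = rated[np.argsort(sims[rated])[::-1][:k]]
    w = sims[neighbors]
    if w.sum() <= 0:                    # no informative neighbors: fall back
        return R[u, rated].mean() if rated.size else 0.0
    return float(w @ R[u, neighbors] / w.sum())

# toy example: 4 users x 5 items
R = np.array([[5, 3, 0, 1, 4],
              [4, 0, 0, 1, 5],
              [1, 1, 5, 4, 0],
              [0, 1, 4, 5, 2]], dtype=float)
print(predict_knn(R, u=0, j=2, k=2))
```

Note how the structural weakness mentioned above shows up here: when item $j$ shares few common raters with the other items, its similarity column is mostly near zero and the prediction degrades to a fallback mean.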
Matrix Factorization

To extract high-level features, matrix factorization methods try to find the rating matrix's low-rank approximations [15, 21]. They focus on fitting the user-item rating matrix by low-rank approximation and use the fitting result to make subsequent predictions [6, 7, 8, 16]. The premise behind this low-dimensional factor model is that only a small number of factors or features influence preferences, and that a user's preference for an item is determined only by that user's feature vector and that item's feature vector.
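As an illustration of this factor model, here is a minimal sketch, assuming a plain SGD fit of $PQ^T$ to the observed ratings with L2 regularization; it shows the generic low-rank idea only, not the specific algorithms of [6, 7, 8, 16].

```python
import numpy as np

def factorize(R, n_factors=3, lr=0.01, reg=0.05, epochs=200, seed=0):
    """Fit the observed entries of R with P @ Q.T by stochastic gradient descent.

    Each user i gets a feature vector P[i]; each item j gets Q[j].
    A predicted rating is simply their dot product.
    """
    rng = np.random.default_rng(seed)
    n, m = R.shape
    P = 0.1 * rng.standard_normal((n, n_factors))
    Q = 0.1 * rng.standard_normal((m, n_factors))
    obs = np.argwhere(R > 0)            # indices of known ratings
    for _ in range(epochs):
        rng.shuffle(obs)
        for i, j in obs:
            pi = P[i].copy()            # cache before updating
            err = R[i, j] - pi @ Q[j]
            P[i] += lr * (err * Q[j] - reg * pi)
            Q[j] += lr * (err * pi - reg * Q[j])
    return P, Q

R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 5, 4],
              [0, 1, 4, 5]], dtype=float)
P, Q = factorize(R)
print(P @ Q.T)   # dense reconstruction; the formerly unknown cells are the predictions
```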
What is related to our work is not the basic matrix factorization methods. Recently, some matrix factorization methods that involve auxiliary information analysis have drawn our attention. [13] proposes a trust-aware collaborative filtering algorithm, based on the general knowledge that people normally ask friends for recommendations. Because of its memory-based model, this algorithm suffers from huge online costs: trust values need to be computed like similarity measures. SoRec [10] fuses the existing trust-based approach with Probabilistic Matrix Factorization (PMF) [16]. This method is model-based, but it cannot be widely applied because trust information, which involves people's privacy, is a scarce resource. [9] proposes a relation-regularized matrix factorization method for relational data analysis, yet it is designed for making recommendations concerning objects that have both content and links. The idea of Collective Matrix Factorization [19] is innovative: factorizing multiple matrices simultaneously with shared parameters. The weakness of this method is that the parameter learning process is computationally costly.

TAG-BASED ITEM RECOMMENDATION

Since tags and ratings are two of the most common attributes attached to items, we propose a generalized neighborhood recommendation method that makes use of both at the same time. Our work is based on the assumption that the behaviors of tagging and rating share the same motivation: item classification. In this sense, the latent preference information found in tagging data has more power than that in rating data. Regarding tags, there are two types of recommendation: item recommendation and keyword recommendation. Our concern is item recommendation, which is the same as in most CF recommendation methods. In the context of electronic commerce and video on demand, proper item recommendations are especially valuable since the items are overwhelmingly numerous.

Topic Finding

As with the rating data, the tag data can be represented as an $n \times m$ sparse matrix $T$, given $n$ users and $m$ items:

$$t_{ij} = \begin{cases} \mathrm{user}_i\text{'s tags for } \mathrm{item}_j, & \text{if } \mathrm{user}_i \text{ has tagged } \mathrm{item}_j,\\ \text{null}, & \text{otherwise.} \end{cases}$$

Users are allowed to give more than one tag to each item, so if the tags are kept separate, $T$ becomes a three-dimensional tensor whose dimensions are user, item, and tag. This is a tough case to handle, and it is why there is little work on extracting preference information from this data resource. Innovatively, we divide $T$ into user-tag and item-tag matrices representing the tags given by the users and the tags received by the items, respectively. The user-tag and item-tag matrices are denoted as $T^U$ and $T^I$, defined as follows:

$$T^U = [t_1, t_2, \ldots, t_n]^T, \qquad T^I = [t_1, t_2, \ldots, t_m]^T,$$

where $t_u$ denotes the tags $\mathrm{user}_u$ has given, and $t_i$ denotes the tags $\mathrm{item}_i$ has been given.

When the original tensor $T$ is converted into bags of words in $T^U$ and $T^I$, we can apply LDA [3] to find the latent topic information therein. Processing $T^U$ and $T^I$ separately, we find their latent topics in the form of probabilities:

$$\theta^U_{ij} = p(\mathrm{topic} = j \mid \mathrm{user} = i), \qquad \theta^I_{ij} = p(\mathrm{topic} = j \mid \mathrm{item} = i),$$

where $\theta^U_{ij}$ denotes $\mathrm{user}_i$'s probability of preferring $\mathrm{topic}_j$ and $\theta^I_{ij}$ denotes $\mathrm{item}_i$'s probability of being related to $\mathrm{topic}_j$. This is a kind of "soft clustering": a user or item may fall under multiple topics with the same probability. The similarity between the row vectors of $\Theta^U$ and $\Theta^I$ reflects the users' and items' similarity more appropriately, because the clustering is based on the semantics of the tags.

The matrices $\Theta^U$ and $\Theta^I$ are not directly used for rating prediction. Because they are full matrices, which are not appropriate for computation, we set a threshold value to retain high-similarity relations and discard the others. Another important reason for this step is that in reality most users and items are only indirectly related to each other.

After finding the matrices $\Theta^U$ and $\Theta^I$, it is easy to employ $k$-NN clustering to find the groups whose members share the same interests or attributes.
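As a rough illustration of this pipeline, the sketch below builds bag-of-words documents for $T^U$, fits LDA with scikit-learn to obtain $\Theta^U$, and then thresholds the row-vector similarities before reading off each user's neighbors; the tag strings, the 0.8 threshold, and $k$ are illustrative assumptions, and the same steps apply to $T^I$ with items' received tags.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Each row of T^U: all tags one user has given, flattened into a bag of words.
user_tag_docs = [
    "scifi space robots scifi",
    "romance drama romance",
    "space robots aliens",
    "drama romance comedy",
]

counts = CountVectorizer().fit_transform(user_tag_docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
theta_u = lda.fit_transform(counts)     # theta_u[i, j] ~ p(topic=j | user=i)

# Threshold the user-user similarities so only strong relations survive,
# then read each user's k nearest neighbors off the sparsified matrix.
sim = cosine_similarity(theta_u)
np.fill_diagonal(sim, 0.0)
sim[sim < 0.8] = 0.0                    # illustrative threshold value

k = 1
for i, row in enumerate(sim):
    neighbors = np.argsort(row)[::-1][:k]
    neighbors = [n for n in neighbors if row[n] > 0]
    print(f"user {i} groups with users {neighbors}")
```

Zeroing out the sub-threshold entries keeps the neighbor lookup cheap and honors the observation above that most users and items are only indirectly related.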
Rating Prediction