2.3. Non-numerical information: Jaccard

As the numerical values of the votes appear to lose relevance in the recommendation process (not in the prediction process), we are obliged to search for similarity information between users by looking beyond the specific values of their votes.
In this sense, it is reasonable to focus our attention on two related aspects:

(1) Not to grant too much credibility to the similarity of two users when it is based only on a very limited set of common items: traditional metrics provide their measure of similarity without taking into account whether it has been obtained from few or many items voted by both users; in this way, it is not improbable to find the highest similarity measurements associated with pairs of users with very few items voted in common.

(2) Users who have voted for a large number of items should be compared, in so far as is possible, with other users with whom they have a large number of items voted in common. For example, for users with around one thousand votes cast, a similarity of 0.78 calculated with some 150 common items is more convincing than a similarity of 0.82 calculated with 60 common items. In short, the proportion between the common votes and the total votes should be taken very much into consideration.

The most direct way to quantify the above-mentioned aspects is by using the Jaccard metric [24], which calculates the proportion between the number of items that two users have voted in common and the number of different items that both users have voted for in total, i.e. the intersection divided by the union of the items voted.

In order to design our new metric correctly, we must discover the impact that Jaccard can have as a factor of similarity between users. As Jaccard does not operate with the numerical values of the votes, it seems improbable that, on its own, it will be able to have a positive impact on the MAE of the RS; however, it is more apparent that the coverage can improve by selecting users with more common votes and therefore, in general, with more votes cast.
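As an illustration of the Jaccard proportion described above, the following is a minimal sketch rather than the authors' implementation. The function names `jaccard` and `jaccard_from_counts`, the toy item ids, the choice to return 0 for two users with no votes, and the assumption that each user in the 0.78 versus 0.82 example has cast exactly 1000 votes are all hypothetical and introduced only for this example.

```python
def jaccard(items_x, items_y):
    """Items voted by both users divided by items voted by either user."""
    x, y = set(items_x), set(items_y)
    union = x | y
    if not union:
        return 0.0  # assumption: two users with no votes get similarity 0
    return len(x & y) / len(union)


def jaccard_from_counts(common, votes_x, votes_y):
    """Same ratio from counts: |X n Y| / (|X| + |Y| - |X n Y|)."""
    union = votes_x + votes_y - common
    return common / union if union else 0.0


# Toy item sets (made-up ids): 2 items in common, 4 distinct items overall.
print(jaccard({1, 2, 3}, {2, 3, 4}))  # 0.5

# The two cases from the text, assuming exactly 1000 votes per user.
many_common = jaccard_from_counts(150, 1000, 1000)  # 150 / 1850
few_common = jaccard_from_counts(60, 1000, 1000)    # 60 / 1940
print(many_common > few_common)  # True
```

Under these assumptions, the pair with 150 common items has a Jaccard value more than twice that of the pair with 60 common items, which is the sense in which the lower similarity of 0.78 rests on firmer ground than 0.82.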
On the other hand, it is reasonable to suspect that users who have voted for a sufficient proportion of items in common display common tastes: we know that negative ratings are not very widely used (Figs. 1 and 2) and therefore we can theorize that part of the common absences of votes between two users can mean common non-positive ratings that they preferred not to cast; likewise, we know that most of the votes cast have a positive rating (Figs. 1 and 2), and therefore many votes in common could denote a large number of positive ratings in common.

In order to try to test this theory, we have designed the following experiment: for every possible pair of users of MovieLens 1M we have calculated the Jaccard value, the MAE and the coverage obtained by establishing the first user of the pair as the active user and the second one as their only neighborhood. On the x axis we represent the possible Jaccard values, in the interval [0..1], where 0 indicates that there are no items in common and 1 indicates that all the items voted are in common.

In Fig. 4, graph 4A shows the number of pairs of users (y axis) which display the Jaccard values indicated on the x axis. As is to be expected, most of the cases present a very low overlap between the items voted by each pair of users. Graph 4B shows a direct relationship between the increase in the value of Jaccard and the accuracy obtained in the interval [0..0.4], in which the great majority of the cases are grouped together (Graph 4A). Graph 4C shows a direct relationship between the Jaccard value and an improvement in the coverage.

The reasoning given and the results obtained justify the incorporation of the Jaccard metric as an integral part of a new metric for the similarity between users, which aims to improve the results provided by traditional metrics.

2.4. Numerical information: mean squared differences

By unifying the concepts and results set out in Section 2, we find the following:

(1) The similarity between two users (core of the CF RS) is based on numerical metrics of statistical origin (such as Pearson correlation) which should be applied to continuous variables, but which in fact are applied to discrete variables which, for the purposes of recommendation, only contain two values of use (positive/non-positive).

(2) We are not making use of non-numerical information of the votes which could be valuable in order to complement the numerical information and provide a metric which satisfactorily groups these two sources of information.

The most commonly used metrics (constrained Pearson correlation, Spearman rank correlation, cosine, Pearson correlation, etc.) display, to a greater or lesser extent, the deficiencies set out in reference to Pearson correlation; however, mean squared differences (MSD), based on the geometrical principles of the Euclidean distance, provide different characteristics which could be suitably complemented with Jaccard. This metric has been tested in [35]; this study highlights the good accuracy results obtained using MSD, but at the cost of coverage values that make it unviable in general RS applications. The following paragraph of the conclusions of [35] sums up its possibilities: “the MSD metric offers very interesting results due to its different behavior compared to the other two studied (cosine and Pearson correlation) and its good results in all aspects except in the coverage, which is undoubtedly its weak point”.

2.5. Hypothesis

The hypothesis on which this paper is based is that a suitable combination of Jaccard and MSD could complement Jaccard with the numerical values of the votes, and could mitigate the deficiencies in the coverage entailed in the use of MSD, in such a way that their joint use would enable the improvement of the results of traditional metrics in general and of Pearson correlation in particular, which will be used as the metric of reference for which the results must be improved.

Although, a priori, the choice of the similarity measure MSD seems to be the most suitable, it is advisable to test Jaccard combined not only with MSD, but also with the most common metrics: Pearson correlation (PC) (6), cosine (COS) (7), constrained Pearson’s correlation (CPC) (8) and Spearman rank correlation (SRC) (9).

In Fig. 5, graph 5A shows the evolution of the MAE using the metrics PC, Jaccard * PC, Jaccard * COS, Jaccard * CPC, Jaccard * SRC and Jaccard * (1 − MSD) applied to the MovieLens 1M database. As we expected, the best results are obtained by MSD. Graph 5B shows the coverage obtained by applying these same metrics; in this case, COS and MSD give the best results up to K = 300 and SRC gives the best results from this value of K. In short, the similarity measure MSD is confirmed as the best option in order to test the hypothesis in this paper.

3. Formalization of the new metric

The metric presented in this paper takes as the reference to be improved the most widely used metric in user-to-user memory-based CF: Pearson correlation; however, the operating principles that rule this metric will not be taken as a base, but rather, we will use the mean squared difference (MSD) metric, which is much less

J. Bobadilla et al. / Knowledge-Based Systems 23 (2010) 520–528 523