F.P. Lousame and E. Sánchez

the user-based approach. The algorithm, in its original formulation, generates a list of recommendations for the active user by selecting new items that are similar to the collection of items already rated by the user. As with the user-based approach, the item-based approach consists of two different components: the similarity computation and the prediction computation.

There are different ways to compute the similarity between items. Here we present four of these methods: vector similarity, Pearson correlation, adjusted vector similarity and conditional probability-based similarity.

Vector similarity. One way to compute the similarity between items is to consider each item i as a vector in the m-dimensional user space. The similarity between any two items i and j is measured by computing the cosine of the angle between these two vectors:

$$ w_{i,j} = \frac{\sum_{k \in U_i \cap U_j} r_{k,i}\, r_{k,j}}{\sqrt{\sum_{k \in U_i \cap U_j} r_{k,i}^2}\,\sqrt{\sum_{k \in U_i \cap U_j} r_{k,j}^2}} \tag{10} $$

where the summation is extended to users who rated both of the items, $k \in U_i \cap U_j$.

Pearson correlation. Similarly to equation 3, the Pearson correlation between items i and j is given by:

$$ w_{i,j} = \frac{\sum_{k \in U_i \cap U_j} (r_{k,i} - \bar{r}_i)(r_{k,j} - \bar{r}_j)}{\sqrt{\sum_{k \in U_i \cap U_j} (r_{k,i} - \bar{r}_i)^2}\,\sqrt{\sum_{k \in U_i \cap U_j} (r_{k,j} - \bar{r}_j)^2}} \tag{11} $$

where $\bar{r}_i$ and $\bar{r}_j$ denote the average rating of items i and j, respectively.

Adjusted vector similarity. Computing similarity between items using the vector similarity has one important drawback: the difference in the rating scale between different users is not taken into account. Adjusted vector similarity addresses this problem by subtracting the corresponding user average from each rating:

$$ w_{i,j} = \frac{\sum_{k \in U_i \cap U_j} (r_{k,i} - \bar{r}_k)(r_{k,j} - \bar{r}_k)}{\sqrt{\sum_{k \in U_i \cap U_j} (r_{k,i} - \bar{r}_k)^2}\,\sqrt{\sum_{k \in U_i \cap U_j} (r_{k,j} - \bar{r}_k)^2}} \tag{12} $$

where $\bar{r}_k$ denotes the average rating of user k.

Conditional probability-based similarity. An alternative way to compute the similarity between each pair of items is to use a measure based on the conditional probability of selecting one of the items given that the other item was selected. This probability can be expressed as the number of users that selected both items i and j divided by the total number of users that selected item i:

$$ w_{i,j} = P(j \mid i) = \frac{|\{u_k : i, j \in R_k\}|}{|\{u_k : i \in R_k\}|} \tag{13} $$

Note that this similarity measure is not symmetric: in general, $P(j \mid i) \neq P(i \mid j)$.
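The four measures in equations (10) to (13) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `RATINGS` dictionary, the item and user labels and all function names are hypothetical; the item average is assumed to be taken over every user who rated the item, the user average over every item the user rated, and "selected" is treated as "rated". Pairs with no co-rating users, or with a vanishing denominator, are assigned a similarity of 0.0 by convention.

```python
from math import sqrt

# Hypothetical toy data (illustrative only): RATINGS[user][item] = rating.
RATINGS = {
    "u1": {"a": 5, "b": 3, "c": 4},
    "u2": {"a": 4, "b": 2},
    "u3": {"a": 1, "b": 5, "c": 2},
    "u4": {"a": 2, "b": 4},
}


def co_raters(ratings, i, j):
    """Users k in the intersection of U_i and U_j (they rated both items)."""
    return [k for k, r in ratings.items() if i in r and j in r]


def normalized_dot(xs, ys):
    """sum(x*y) / (sqrt(sum x^2) * sqrt(sum y^2)); 0.0 if the denominator is 0."""
    num = sum(x * y for x, y in zip(xs, ys))
    den = sqrt(sum(x * x for x in xs)) * sqrt(sum(y * y for y in ys))
    return num / den if den else 0.0


def item_mean(ratings, i):
    """Average rating of item i over all users who rated it (assumption)."""
    vals = [r[i] for r in ratings.values() if i in r]
    return sum(vals) / len(vals)


def user_mean(ratings, k):
    """Average rating given by user k over all items they rated (assumption)."""
    vals = ratings[k].values()
    return sum(vals) / len(vals)


def vector_similarity(ratings, i, j):
    """Equation (10): cosine of the angle between the two item vectors."""
    ks = co_raters(ratings, i, j)
    return normalized_dot([ratings[k][i] for k in ks],
                          [ratings[k][j] for k in ks])


def pearson_similarity(ratings, i, j):
    """Equation (11): ratings centred on the item averages."""
    ks = co_raters(ratings, i, j)
    if not ks:
        return 0.0
    mean_i, mean_j = item_mean(ratings, i), item_mean(ratings, j)
    return normalized_dot([ratings[k][i] - mean_i for k in ks],
                          [ratings[k][j] - mean_j for k in ks])


def adjusted_vector_similarity(ratings, i, j):
    """Equation (12): ratings centred on each user's own average."""
    ks = co_raters(ratings, i, j)
    means = {k: user_mean(ratings, k) for k in ks}
    return normalized_dot([ratings[k][i] - means[k] for k in ks],
                          [ratings[k][j] - means[k] for k in ks])


def conditional_probability_similarity(ratings, i, j):
    """Equation (13): P(j | i), the share of users who selected i that also selected j."""
    selected_i = [k for k, r in ratings.items() if i in r]
    if not selected_i:
        return 0.0
    both = [k for k in selected_i if j in ratings[k]]
    return len(both) / len(selected_i)


if __name__ == "__main__":
    for measure in (vector_similarity, pearson_similarity,
                    adjusted_vector_similarity,
                    conditional_probability_similarity):
        print(measure.__name__, measure(RATINGS, "a", "b"))
```

On this toy data, `conditional_probability_similarity(RATINGS, "a", "c")` is 0.5 (two of the four users who rated a also rated c), while the reverse direction is 1.0 (both users who rated c also rated a), which illustrates the asymmetry noted for equation (13). The first three measures are symmetric in i and j by construction.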