Here R_i is the score vector rated by the i-th user in the n-dimensional resource space. We define the similarity between users i and j as sim(i, j), which is computed as follows:

$$\mathrm{sim}(i,j) = \frac{w}{\cdots} + (1-w)\cdot\frac{R_i \cdot R_j}{\cdots} \qquad (10)$$

Here w (0 < w < 1) is a linear weight coefficient, which is specified by the application environment (in this paper, w = 0.4). The Simi-New similarity algorithm can offset the drawbacks of the three commonly used similarity algorithms described above.
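The drawbacks of the commonly used algorithms, and the two worked cases discussed next, can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the paper's implementation: a score of 0 is assumed to mean "not rated", the correlation is assumed to be taken over co-rated items only, and the function names `cosine`, `pearson_corated` and `tanimoto` are hypothetical. The two Simi-New values are evaluated directly from the worked expressions w/(3e^4) + (1 - w) * 5/21 and w/(3e) + (1 - w) * 20/21 with w = 0.4. The sketch also checks the observation that 5/21 and 20/21 coincide with x·y / (|x|^2 + |y|^2 - x·y); whether that is the general second term of equation (10) is an assumption.

```python
import math


def cosine(a, b):
    """Cosine of the angle between two score vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm


def pearson_corated(a, b):
    """Pearson correlation over co-rated items (assumption: 0 = not rated).

    Returns None when the correlation is undefined, i.e. when one user's
    co-rated scores are all equal and the denominator is zero.
    """
    pairs = [(p, q) for p, q in zip(a, b) if p != 0 and q != 0]
    if not pairs:
        return None
    mean_a = sum(p for p, _ in pairs) / len(pairs)
    mean_b = sum(q for _, q in pairs) / len(pairs)
    num = sum((p - mean_a) * (q - mean_b) for p, q in pairs)
    den = math.sqrt(sum((p - mean_a) ** 2 for p, _ in pairs)) * math.sqrt(
        sum((q - mean_b) ** 2 for _, q in pairs)
    )
    return num / den if den else None


def tanimoto(a, b):
    """a.b / (|a|^2 + |b|^2 - a.b); assumed form, matched only to the example."""
    dot = sum(p * q for p, q in zip(a, b))
    return dot / (sum(p * p for p in a) + sum(q * q for q in b) - dot)


# Case 1: parallel vectors. Cosine similarity is 1 for every pair.
x, y, z = [0, 0, 5, 0], [0, 0, 1, 0], [0, 0, 4, 0]
print(cosine(x, y), cosine(y, z), cosine(x, z))

# Worked Simi-New expressions from the text, with w = 0.4.
w = 0.4
sim_xy = w / (3 * math.exp(4)) + (1 - w) * 5 / 21  # about 0.145
sim_xz = w / (3 * math.e) + (1 - w) * 20 / 21      # about 0.620
print(sim_xy, sim_xz, sim_xz > sim_xy)

# Observation: the second fractions match the Tanimoto-style ratio.
print(tanimoto(x, y) == 5 / 21, tanimoto(x, z) == 20 / 21)

# Case 2: x2 is self-equal, so the correlation is undefined.
x2, y2, z2 = [0, 5, 5, 0], [0, 5, 1, 0], [0, 5, 4, 2]
print(pearson_corated(x2, y2), pearson_corated(x2, z2))
```

With these inputs the cosine values are identical for all three pairs and the correlation is undefined for both pairs involving the self-equal vector. The worked Simi-New expressions still give two distinct values, with sim(x, z) larger than sim(x, y).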
Suppose the score vectors rated by users x, y and z are x = {0,0,5,0}, y = {0,0,1,0} and z = {0,0,4,0}, respectively. Since x is parallel to y and y is parallel to z, the cosine-based similarity algorithm cannot distinguish the similarity between x and y from the similarity between y and z. However, the Simi-New similarity algorithm can tell them apart. The similarity between users x and y is

$$\mathrm{sim}(x,y) = \frac{w}{3e^{4}} + (1-w)\cdot\frac{5}{21}$$

and the similarity between users x and z is

$$\mathrm{sim}(x,z) = \frac{w}{3e} + (1-w)\cdot\frac{20}{21}$$

From these computations we find that sim(x, z) > sim(x, y), which accords with the real situation.

Now suppose the score vectors rated by users x, y and z are x = {0,5,5,0}, y = {0,5,1,0} and z = {0,5,4,2}, respectively. Because x is a self-equal vector, the similarity between x and y and the similarity between x and z cannot be computed by correlation-based similarity or adjusted-cosine similarity. However, these similarities can be computed and distinguished by the Simi-New similarity algorithm. The Simi-New similarity algorithm not only overcomes the problem of self-equal vectors, but can also adapt to different applications through the adjustable parameter w.

V. EXPERIMENTAL RESULTS AND ANALYSIS

In this article, all the experimental data are collected from the rock and mineral fossils resources site (http://www.nimrf.net.cn), and we select five months of Web logs, from July 1 to November 30, 2009. These logs contain 2695 users who viewed 75702 pages in 27673 visits.

A. Experiments Of The Similarity Algorithms

In experiment 1, we select 5418 records visited by 335 users as the training data, and select 2051 records as the test set to compute the Mean Absolute Error (MAE). We choose the User-Based Collaborative Filtering Algorithm as the predicting algorithm. The result of experiment 1 is shown in Figure 2.

[Bar chart: vertical axis from 0.00 to 0.30; bars for cosine-based, correlation-based, adjusted cosine and Simi-New]

Figure 2. MAE of four similarity algorithms.

From Figure 2, we can see that the Simi-New algorithm proposed in this paper outperforms the other three similarity algorithms in prediction precision. In experiment 2, we use the same data set as in experiment 1 and compare the coverage of the four similarity algorithms. The result of experiment 2 is shown in Figure 3.

[Bar chart: vertical axis from 0.00 to 1.00; bars for cosine-based, correlation-based, adjusted cosine and Simi-New]

Figure 3. Coverage of four similarity algorithms.

From Figure 3, we can see that the Simi-New similarity algorithm is slightly better than the cosine-based similarity algorithm in recommendation coverage, and much better than the adjusted-cosine and correlation-based similarity algorithms. In short, the Simi-New similarity algorithm is superior to the other three commonly used similarity algorithms in both the precision and the coverage of the recommendations.

B. Experiments Of The Ontology-Based User Model

Experiments 3 and 4 compare the ontology-based user model with the user-resource matrix-based user model to verify the accuracy and validity of the former. In experiment 3, we select 5418 records from the first four months of Web logs as the training data, and select 2051 records from the last month of Web logs as the test set. There
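The two metrics used in experiments 1 and 2 can be sketched in Python as follows. This is a minimal illustration with hypothetical data, not the paper's evaluation code. The function names `mean_absolute_error` and `coverage` are illustrative, `None` is assumed to mark a test record for which the predicting algorithm could not produce a score, and coverage is assumed to mean the fraction of test records that receive a prediction (the text does not give its exact definition).

```python
def mean_absolute_error(predicted, actual):
    """Average |prediction - true score| over records that have a prediction."""
    pairs = [(p, a) for p, a in zip(predicted, actual) if p is not None]
    if not pairs:
        raise ValueError("no predictions to evaluate")
    return sum(abs(p - a) for p, a in pairs) / len(pairs)


def coverage(predicted):
    """Fraction of test records for which a prediction was produced (assumed definition)."""
    return sum(p is not None for p in predicted) / len(predicted)


# Hypothetical test set: four records, one of which could not be predicted.
predicted = [4.0, None, 3.0, 5.0]
actual = [5.0, 2.0, 3.0, 4.0]

print(mean_absolute_error(predicted, actual))  # (1 + 0 + 1) / 3
print(coverage(predicted))                     # 3 of 4 records covered
```

A lower MAE indicates more precise predictions, and a higher coverage indicates that the algorithm can produce predictions for more of the test records.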