the user. Therefore, the performance of these algorithms depends on this manual selection of centroids. In contrast, the proposed clustering algorithm computes the initial centroids itself, which results in the proper creation of the clusters.

The proposed clustering algorithm consists of three steps: determining the centroid, bunching, and removing the bunched patterns. These three steps are described below in detail.
The number of clusters constructed depends on the user-defined parameters $\alpha$ and $\beta$, called the centering and bunching factors, respectively; the values of these parameters are problem dependent. Assume $R \in \{R_h \mid h = 1, 2, \ldots, P\}$, where $R_h = (r_{h1}, r_{h2}, \ldots, r_{hn})$ is the n-dimensional hth pattern belonging to the set $R$ containing $P$ patterns to be clustered.

(i) Determining the centroid: To determine the centroid of the cluster, each pattern is compared with all the patterns, and for each pattern the number of patterns lying within a Euclidean distance less than or equal to $\alpha$ of it is counted. If $R_h$ is the pattern with the maximum count, then it is selected as the centroid of the cluster.

(ii) Bunching: The patterns that fall around the centroid and have a Euclidean distance from it less than or equal to $\beta$ are bunched in a cluster. The centroid of the cluster is then recalculated as the average of all the patterns bunched in the cluster. Thus the cluster boundaries are governed by the value of the bunching factor.

(iii) Removal of the bunched patterns in a cluster: The patterns included in the cluster created in the previous step are eliminated. Thus, the next pass uses the unclustered pattern set, consisting of the remaining patterns, for clustering. These three steps are repeated until all the patterns are clustered.

Let $R_p$, $R_c$ and $R_n$ represent the set of patterns used in the current pass, the set of patterns clustered in the current pass, and the set of patterns that will be used in the next pass, respectively. Then $R_n$ can be described as

$$R_n = R_p - R_c = \{R_n \mid R_n \in R_p \text{ and } R_n \notin R_c\} \qquad (1)$$

The $R_n$ calculated in the current pass becomes $R_p$ for the next pass. The steps described above are repeated until all the patterns are clustered, and the process stops when $R_n$ becomes empty.

4.1.3. Computing centroid of each cluster

The proposed CBBC is used for clustering of the Jester data set. The clustering resulted in three clusters with $\alpha = \beta = 0.3$.
The details of the clusters created and the users in each cluster are shown in Table 3. After bunching as stated in the CBBC algorithm, knowing the members of each group, we have recomputed new

Fig. 1. System architecture of CBBCHPRS.

Table 2
Running example of rating matrix from Jester data set after normalization in the range of 0 to 1.

Users        J1    J2    J3    J4    J5    J6    J7    J8    J9    J10
U1           0.15  0.94  0.06  0.13  0.16  0.11  0.05  0.72  0.09  0.29
U2           0.71  0.51  0.82  0.73  0.41  0.06  0.48  0.26  0.94  0.96
U3           0.00  0.00  0.00  0.00  0.95  0.96  0.95  0.96  0.00  0.00
U4           0.00  0.92  0.00  0.00  0.60  0.91  0.38  0.81  0.00  0.61
U5           0.92  0.74  0.32  0.26  0.58  0.60  0.85  0.74  0.50  0.79
U6           0.23  0.35  0.54  0.11  0.18  0.31  0.11  0.48  0.20  0.43
U7           0.00  0.00  0.00  0.00  0.93  0.05  0.89  0.94  0.00  0.00
U8           0.84  0.67  0.96  0.22  0.13  0.44  0.96  0.59  0.27  0.31
U9           0.34  0.35  0.07  0.19  0.10  0.51  0.27  0.09  0.14  0.44
U10          0.66  0.76  0.76  0.66  0.82  0.76  0.94  0.64  0.66  0.91
Active User  0.38  0.71  0.00  0.00  0.25  0.00  0.64  0.27  0.00  0.59

0 indicates not rated.

S.K. Shinde, U. Kulkarni / Expert Systems with Applications xxx (2011) xxx–xxx

Please cite this article in press as: Shinde, S. K., & Kulkarni, U. Hybrid personalized recommender system using centering-bunching based clustering algorithm. Expert Systems with Applications (2011), doi:10.1016/j.eswa.2011.08.020
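
The three steps of the clustering algorithm (centering, bunching, removal) and the update rule of Eq. (1) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `cbbc`, the tie-breaking rule (the first pattern with the maximum count is chosen), and the six 2-D toy patterns are all hypothetical and introduced here only for illustration. Plain Euclidean distance over all coordinates is used, with no special handling of unrated (0) entries, so the sketch is not expected to reproduce the three clusters reported for the Jester data.

```python
import math


def cbbc(patterns, alpha, beta):
    """Sketch of centering-bunching clustering.

    patterns: list of equal-length numeric sequences (the set R).
    alpha:    centering factor (radius used to count neighbours).
    beta:     bunching factor (radius used to form the cluster).
    Returns a list of (centroid, member_indices) pairs, one per pass.
    """
    remaining = list(range(len(patterns)))  # R_p: patterns used in this pass
    clusters = []
    while remaining:  # stop when R_n is empty
        # (i) Centering: for each pattern, count the patterns within alpha.
        def neighbour_count(i):
            return sum(
                1 for j in remaining
                if math.dist(patterns[i], patterns[j]) <= alpha
            )
        # Assumption: ties are broken in favour of the earliest pattern.
        center = max(remaining, key=neighbour_count)

        # (ii) Bunching: gather patterns within beta of the chosen pattern,
        # then recompute the centroid as the average of the bunched patterns.
        members = [
            j for j in remaining
            if math.dist(patterns[center], patterns[j]) <= beta
        ]
        columns = zip(*(patterns[j] for j in members))
        centroid = [sum(col) / len(members) for col in columns]
        clusters.append((centroid, members))

        # (iii) Removal: R_n = R_p - R_c, and R_n becomes R_p for the next pass.
        bunched = set(members)
        remaining = [j for j in remaining if j not in bunched]
    return clusters


# Hypothetical 2-D toy patterns (not taken from the Jester data set).
toy = [
    (0.10, 0.10), (0.15, 0.12), (0.12, 0.18),  # a tight group
    (0.80, 0.80), (0.85, 0.82),                # a second group
    (0.50, 0.10),                              # an isolated pattern
]
result = cbbc(toy, alpha=0.2, beta=0.2)
print([members for _, members in result])  # → [[0, 1, 2], [3, 4], [5]]
```

Because the chosen pattern is always within distance 0 of itself, every pass removes at least one pattern, so the loop terminates after at most P passes. Larger values of `beta` bunch more patterns per pass and therefore produce fewer clusters, which is the sense in which the number of clusters depends on the two factors.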