E-business course reading: A recommender system using GA K-means clustering in an online shopping market

K.-j. Kim, H. Ahn / Expert Systems with Applications 34 (2008) 1200–1209
in a chromosome. Here, we set the precision of the continuous feature values to 1/10,000. When we apply min-max normalization to the continuous features, they take values ranging from 0 to 1.
In this case, 14 binary bits are required to express them with 1/10,000th precision because $8192 = 2^{13} < 10000 \le 2^{14} = 16384$. These 14-bit binary numbers are transformed into decimal floating-point numbers ranging from 0 to 1 by applying Eq. (4):

$$x' = \frac{x}{2^{14} - 1} = \frac{x}{16383} \qquad (4)$$

where $x$ is the decimal value of the binary code for each feature weight (Michalewicz, 1996).

For example, the binary code for the value of the 37th feature of the first sample chromosome in Fig. 2 (the chromosome that represents Cluster 1) is $(10110010010110)_2$. Its decimal value is $(11414)_{10}$, and it is interpreted as $11414 / 16383 = 0.696697796 \approx 0.6967$.

In this study, we use a real-world data set from an Internet shopping mall that consists of 36 selection variables and 6 continuous variables. For the design of the chromosome for this data set, $(36 \times 1 + 6 \times 14) \times K$ bits are required, where $K$ is the number of clusters.

4. Experimental design and results

4.1. Experimental design

We apply three clustering algorithms (simple K-means, SOM and GA K-means) to our data. We try to segment the Internet users into 5 clusters (that is, $K = 5$). In the case of SOM, we set the learning rate ($\alpha$) at 0.5.

For the controlling parameters of the GA search, the population size is set at 200 organisms. The crossover rate is set at 0.7 and the mutation rate at 0.1. This study performs crossover using a uniform crossover routine. The uniform crossover method is considered better at preserving the schema, and can generate any schema from the two parents, while single-point and two-point crossover methods may bias the search with the irrelevant position of the features. For mutation, this study generates a random number between 0 and 1 for each of the features in the organism. If a gene gets a number that is less than or equal to the mutation rate, then that gene is mutated. As the stopping condition, only 4000 trials (20 generations) are permitted.
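The decoding in Eq. (4), the chromosome length, and the crossover and mutation steps described above can be illustrated with a minimal Python sketch. This is not the authors' implementation, which used Excel 2002, Evolver and VBA. The function names are hypothetical, and two details are assumptions because the text does not specify them: the per-gene swap probability of 0.5 in the uniform crossover, and mutation taking the form of a bit flip.

```python
import random

BITS = 14                 # bits per continuous feature, as stated in the text
MAX_CODE = 2**BITS - 1    # 16383, the denominator in Eq. (4)


def decode_weight(bits: str) -> float:
    """Map a 14-bit binary string to a value in [0, 1], following Eq. (4)."""
    if len(bits) != BITS or set(bits) - {"0", "1"}:
        raise ValueError("expected a 14-character string of 0s and 1s")
    return int(bits, 2) / MAX_CODE


def chromosome_length(n_selection: int, n_continuous: int, k: int) -> int:
    """Bits needed: 1 per selection variable, 14 per continuous one, per cluster."""
    return (n_selection * 1 + n_continuous * BITS) * k


def uniform_crossover(parent_a, parent_b, rng):
    """Build two children gene by gene.

    Assumption: each position is swapped between the parents with
    probability 0.5; the paper does not state the value Evolver uses.
    """
    child_a, child_b = [], []
    for gene_a, gene_b in zip(parent_a, parent_b):
        if rng.random() < 0.5:
            gene_a, gene_b = gene_b, gene_a
        child_a.append(gene_a)
        child_b.append(gene_b)
    return child_a, child_b


def mutate(chromosome, rate, rng):
    """Draw a number in [0, 1) per gene; mutate when the draw is <= rate.

    Assumption: "mutated" means flipping a binary gene.
    """
    return [1 - gene if rng.random() <= rate else gene for gene in chromosome]
```

With the paper's worked example, `decode_weight("10110010010110")` rounds to 0.6967, and `chromosome_length(36, 6, 5)` gives 600 bits for K = 5. The stopping condition is consistent with the population size: 200 organisms over 20 generations is 4000 trials.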
Simple K-means was run in SPSS for Windows 11.0 and SOM in Neuroshell 2 R4.0. GA K-means was developed using Microsoft Excel 2002 and Palisade Software's Evolver Version 4.06, and the K-means algorithm for GA K-means was implemented in VBA (Visual Basic for Applications) in Microsoft Excel 2002.

4.2. Experimental results

In this study, we used real-world Internet shopping mall data to assess the performance of the proposed model. The research data was collected from an online diet portal site in Korea that offers a full range of services for online dieting, such as information, community services and a shopping mall. In the case of a diet shopping mall, customers generally have a clear objective, and they have a strong demand for personalized care and services. Thus, they are usually open to providing their personal information in order to get appropriate service. As a result, the target company of our research possesses detailed and accurate customer information, and it wants to use this information as a source of direct marketing for selling its products. Consequently, we tried to build a recommendation model for the users of this web site.

Fig. 3. The system architecture of the recommendation system. [Figure not reproduced. Legible labels: user database; GA K-means clustering into Clusters 1 to 5; determine the nearest cluster for the target customer; search the most similar neighbors in the corresponding cluster using case-based reasoning; generate recommendation results referencing the nearest neighbors' purchased items; web interface.]
