410 Kyoung-jae Kim and Hyunchul Ahn

performance criteria. Section 3 proposes the GA approach to optimize the K-means clustering, and Section 4 describes the data and the experiments. That section also summarizes and discusses the empirical results. In the final section, conclusions and the limitations of this study are presented.

2 Clustering Algorithms

Cluster analysis is an effective tool in scientific or managerial inquiry. It groups a set of data in a d-dimensional feature space so as to maximize the similarity within clusters and minimize the similarity between different clusters. Various clustering methods exist, and they are currently used in a wide range of areas. Among them, we apply two popular methods, K-means and SOM, and a novel hybrid method to market segmentation. First, we introduce K-means and SOM in this section. For a brief description of each method, we assume that the given population consists of n elements described by m attributes and that it is to be partitioned into k clusters. $X_i = (x_{i1}, x_{i2}, \ldots, x_{im})$ represents the vector of the m attributes of element i.

2.1 K-Means Clustering Algorithm

The K-means method is a widely used clustering procedure that searches for a nearly optimal partition with a fixed number of clusters. The process of K-means clustering is as follows:

1) The initial seeds are selected for the chosen number of clusters, k, and an initial partition is built by using the seeds as the centroids of the initial clusters.
2) Each record is assigned to the nearest centroid, thus forming clusters.
3) Keeping the same number of clusters, the new centroid of each cluster is calculated.
4) Steps 2) and 3) are iterated until the clusters stop changing or the stop conditions are satisfied.

The K-means algorithm has been popular because of its ease and simplicity of application. However, it also has some shortcomings. First, it does not deal well with overlapping clusters, and the clusters can be pulled off center by outliers.
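Steps 1) to 4) of the K-means process can be sketched in Python as follows. This is a minimal illustration, not the implementation used in this study: the function name `kmeans`, the `max_iter` cap, the use of Euclidean distance, and the rule that an empty cluster keeps its previous centroid are all assumptions made for the example. The initial seeds are passed in explicitly so that different seeds can be tried directly.

```python
import math


def kmeans(records, seeds, max_iter=100):
    """Hypothetical helper: cluster `records` starting from the given seeds.

    records: list of equal-length numeric tuples (one tuple per element).
    seeds:   list of k tuples used as the initial centroids (Step 1).
    Returns (labels, centroids), where labels[i] is the cluster of records[i].
    """
    centroids = [tuple(s) for s in seeds]
    k = len(centroids)
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each record to its nearest centroid
        # (Euclidean distance is an assumption of this sketch).
        new_labels = [
            min(range(k), key=lambda j: math.dist(r, centroids[j]))
            for r in records
        ]
        # Step 4: stop as soon as the clusters no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: recompute each centroid as the mean of its members.
        for j in range(k):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[j] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels, centroids


# Toy data with two well-separated groups; one seed is taken from each group.
data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centroids = kmeans(data, seeds=[data[0], data[3]])
# labels == [0, 0, 0, 1, 1, 1]
```

Calling `kmeans` again with a different `seeds` argument is a simple way to see how the starting centroids enter the procedure.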
Second, the clustering result may depend on the initial seeds, but there exists no mechanism to optimize the initial seeds.

2.2 Self-organizing Map

The SOM is a clustering algorithm based on an unsupervised neural network model. Since it was suggested by Kohonen [7], it has been applied in many studies because of its good performance. The basic SOM model consists of two layers, an input layer and an output layer. When the training set is presented to the network, the values flow forward through the network to the units in the output layer. The neurons in the output layer are arranged in a grid; the units in the output layer compete with each other, and the one with the highest value wins. The process of SOM is as follows: