Cluster Analysis
Ying Shen
SSE, Tongji University
Dec. 2016
Cluster analysis
Cluster analysis groups data objects based only on the attributes in the data. The main objective is that
◦ the objects within a group are similar to one another, and
◦ they are different from the objects in the other groups.
8/20/2016 PATTERN RECOGNITION
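One way to make this objective concrete is to compare distances inside groups with distances across groups. The sketch below is only an illustration: the Euclidean distance, the helper name `mean_within_and_between`, and the sample points are assumptions, not part of the slides.

```python
from itertools import combinations
from math import dist


def mean_within_and_between(clusters):
    """Return (mean within-cluster distance, mean between-cluster distance).

    Assumes at least one within-cluster pair and one between-cluster pair exist.
    """
    within, between = [], []
    # Tag every point with the index of the cluster it belongs to.
    labelled = [(p, i) for i, c in enumerate(clusters) for p in c]
    for (p, i), (q, j) in combinations(labelled, 2):
        (within if i == j else between).append(dist(p, q))
    return sum(within) / len(within), sum(between) / len(between)


# Hypothetical 2-D data: two tight, well-separated groups.
clusters = [
    [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)],
    [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)],
]
within, between = mean_within_and_between(clusters)
print(within < between)  # True
```

A grouping that meets the objective should show a small within-cluster value relative to the between-cluster value, as it does for this sample.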
Cluster analysis
Cluster analysis is important in the following areas:
◦ Biology
◦ Information retrieval
◦ Medicine
◦ Business
Cluster analysis
Cluster analysis provides an abstraction from individual data objects to the clusters in which those data objects reside. Some clustering techniques characterize each cluster in terms of a cluster prototype. The prototype is a data object that is representative of the other objects in the cluster.
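As an illustration of a prototype that is itself a data object, the sketch below picks the medoid of a cluster, i.e. the member with the smallest total distance to all other members. The choice of the medoid, the Euclidean distance, and the sample points are assumptions made for this example; the slides do not prescribe a particular prototype.

```python
from math import dist


def medoid(points):
    """Return the member of `points` with the smallest total distance to the others."""
    return min(points, key=lambda p: sum(dist(p, q) for q in points))


# Hypothetical cluster: four corners of a square plus its center.
cluster = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0), (1.5, 1.5)]
prototype = medoid(cluster)
print(prototype)  # (1.5, 1.5)
```

Unlike a mean, which may fall between the data points, the medoid is always one of the objects in the cluster.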
Different types of clusterings
We consider the following types of clusterings:
◦ Partitional versus hierarchical
◦ Exclusive versus fuzzy
◦ Complete versus partial
Partitional versus hierarchical
A partitional clustering is a division of the set of data objects into subsets (clusters). A hierarchical clustering is a set of nested clusters that are organized as a tree. Each node (cluster) in the tree (except for the leaf nodes) is the union of its children (sub-clusters). The root of the tree is the cluster containing all the objects. Often, but not always, the leaves of the tree are singleton clusters of individual data objects.
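The tree structure described above can be sketched as a small data type in which every internal node reports the union of its children. The class name `ClusterNode` and the four sample objects are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterNode:
    """A cluster in the tree: a leaf holding objects, or the union of its children."""
    objects: frozenset = frozenset()
    children: list = field(default_factory=list)

    def members(self):
        # A leaf returns its own objects; an internal node unions its sub-clusters.
        if not self.children:
            return self.objects
        return frozenset().union(*(c.members() for c in self.children))


# Singleton leaves, two intermediate clusters, and a root containing everything.
leaves = [ClusterNode(frozenset({name})) for name in "abcd"]
left = ClusterNode(children=leaves[:2])
right = ClusterNode(children=leaves[2:])
root = ClusterNode(children=[left, right])
print(sorted(root.members()))  # ['a', 'b', 'c', 'd']
```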
Partitional versus hierarchical
The following figures form a hierarchical (nested) clustering with 1, 2, 4 and 6 clusters on each level. A hierarchical clustering can be viewed as a sequence of partitional clusterings. A partitional clustering can be obtained by taking any member of that sequence, i.e. by cutting the hierarchical tree at a certain level.
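Cutting a tree at a given level can be sketched as follows. The nested-tuple encoding, the function names `leaves` and `cut`, and the six sample objects are assumptions chosen so that the levels yield 1, 2, 4 and 6 clusters; they are not taken from the figures.

```python
def leaves(tree):
    """Collect all data objects under a (sub)tree."""
    if isinstance(tree, str):
        return [tree]
    return [x for child in tree for x in leaves(child)]


def cut(tree, level):
    """Return the partitional clustering obtained by cutting `tree` at `level` (root = 0)."""
    if level == 0 or isinstance(tree, str):
        return [leaves(tree)]
    return [cluster for child in tree for cluster in cut(child, level - 1)]


# Hypothetical hierarchy: tuples are internal nodes, strings are data objects.
tree = ((("a", "b"), ("c",)), (("d", "e"), ("f",)))
for level in range(4):
    print(level, cut(tree, level))
# 0 [['a', 'b', 'c', 'd', 'e', 'f']]
# 1 [['a', 'b', 'c'], ['d', 'e', 'f']]
# 2 [['a', 'b'], ['c'], ['d', 'e'], ['f']]
# 3 [['a'], ['b'], ['c'], ['d'], ['e'], ['f']]
```

Each printed line is one member of the sequence of partitional clusterings, and every cluster at one level is contained in a cluster of the level above it.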
Partitional versus hierarchical
[Figure, four panels: (a) Original points; (b) Two clusters; (c) Four clusters; (d) Six clusters]
Exclusive versus fuzzy
In an exclusive clustering, each object is assigned to a single cluster. However, there are many situations in which a point could reasonably be placed in more than one cluster.
Exclusive versus fuzzy
In a fuzzy clustering, every object belongs to every cluster with a membership weight that is between
◦ 0 (absolutely does not belong) and
◦ 1 (absolutely belongs).
This approach is useful for avoiding the arbitrariness of assigning an object to only one cluster when it is close to several. A fuzzy clustering can be converted to an exclusive clustering by assigning each object to the cluster in which its membership value is the highest.
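The conversion from a fuzzy to an exclusive clustering can be sketched as taking, for each object, the index of its largest membership weight. The object names, the weights, and the function name `harden` are hypothetical. The sample weights happen to sum to 1 per object, which is a common convention but an assumption here, since the slides only require each weight to lie between 0 and 1.

```python
# Hypothetical membership weights: one list per object, one weight per cluster.
memberships = {
    "x1": [0.9, 0.1],
    "x2": [0.2, 0.8],
    "x3": [0.55, 0.45],  # close to both clusters
}


def harden(memberships):
    """Assign each object to the cluster with its highest membership weight.

    Ties are broken in favor of the lowest cluster index.
    """
    return {
        obj: max(range(len(weights)), key=weights.__getitem__)
        for obj, weights in memberships.items()
    }


exclusive = harden(memberships)
print(exclusive)  # {'x1': 0, 'x2': 1, 'x3': 0}
```

Object `x3` shows what the conversion discards: its weights are nearly equal, yet the exclusive result records only cluster 0.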