Wu Sen et al.: Nearest-neighbor-based clustering algorithm for imbalanced data · 1217 ·

Fig. 9 Clustering results of different algorithms on the Jain data set: (a) CABON; (b) K-means; (c) MC_IK; (d) CVCN. [Figure: four scatter plots; axes Attribute 1 and Attribute 2; legend Cluster 1, Cluster 2.]

Fig. 10 Clustering results of different algorithms on the DS1 data set: (a) CABON; (b) K-means; (c) MC_IK; (d) CVCN. [Figure: four scatter plots; axes Attribute 1 and Attribute 2; legend Cluster 1, Cluster 2.]

Fig. 11 Clustering results of different algorithms on the DS2 data set: (a) CABON; (b) K-means; (c) MC_IK; (d) CVCN. [Figure: four scatter plots; axes Attribute 1 and Attribute 2; legend Cluster 1, Cluster 2.]

Fig. 12 Clustering results of different algorithms on the DS3 data set: (a) CABON; (b) K-means; (c) MC_IK; (d) CVCN. [Figure: four scatter plots; axes Attribute 1 and Attribute 2; legend Cluster 1 to Cluster 5.]

The results show that the CABON algorithm effectively reduces the "uniform effect" in clustering imbalanced data.

(2) Experimental results on real data sets.

Tables 6 to 9 give the clustering results of the CABON, K-means, MC_IK, and CVCN algorithms on four real data sets. As Tables 6 to 9 show, on the four real data sets from UCI, CABON clearly outperforms the other three algorithms in most cases. On the Wine data set, the NMI of CABON is slightly lower than that of MC_IK, but its other three metrics are higher than those of the comparison algorithms. On the Newthyroid and Ionosphere data sets, CABON is better than the comparison algorithms on every metric. On the Heart data set, the clustering accuracy and F-measure of CABON are slightly lower than those of CVCN, but its NMI and RI are higher than those of the comparison algorithms. These results indicate that CABON also effectively addresses the "uniform effect" on real imbalanced data sets from UCI and improves clustering quality.
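The comparison above rests on four external metrics: clustering accuracy, F-measure, NMI, and RI. As a rough illustration of how such metrics can be computed from true class labels and predicted cluster labels, here is a minimal sketch. The exact definitions used in the paper are not given in this section, so several choices are assumptions: accuracy uses the best one-to-one mapping between clusters and classes, the F-measure is the class-size-weighted best-match variant, and NMI is normalized by the geometric mean of the two entropies. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import permutations
from math import log, sqrt


def contingency(truth, pred):
    """Count the points in each (true class, predicted cluster) pair."""
    return Counter(zip(truth, pred))


def rand_index(truth, pred):
    """RI: fraction of point pairs on which the two partitions agree."""
    n = len(truth)
    agree = 0
    for i in range(n):
        for j in range(i + 1, n):
            agree += (truth[i] == truth[j]) == (pred[i] == pred[j])
    return agree / (n * (n - 1) / 2)


def nmi(truth, pred):
    """NMI with geometric-mean normalization (an assumed convention)."""
    n = len(truth)
    joint = contingency(truth, pred)
    pt, pp = Counter(truth), Counter(pred)
    mi = sum(c / n * log(c * n / (pt[t] * pp[p])) for (t, p), c in joint.items())
    ht = -sum(c / n * log(c / n) for c in pt.values())
    hp = -sum(c / n * log(c / n) for c in pp.values())
    if ht == 0 or hp == 0:
        return 1.0 if ht == hp else 0.0
    return mi / sqrt(ht * hp)


def clustering_accuracy(truth, pred):
    """Accuracy under the best one-to-one cluster-to-class mapping.

    Brute force over permutations, so only suitable for a few clusters.
    """
    joint = contingency(truth, pred)
    clusters = sorted(set(pred))
    classes = sorted(set(truth))
    # Pad with None so surplus clusters can stay unmatched.
    classes += [None] * max(0, len(clusters) - len(classes))
    best = max(
        sum(joint[(t, p)] for t, p in zip(assign, clusters))
        for assign in permutations(classes, len(clusters))
    )
    return best / len(truth)


def f_measure(truth, pred):
    """Class-size-weighted best-match F-measure (an assumed variant)."""
    n = len(truth)
    joint = contingency(truth, pred)
    pt, pp = Counter(truth), Counter(pred)
    total = 0.0
    for t, n_t in pt.items():
        best = 0.0
        for p, n_p in pp.items():
            c = joint[(t, p)]
            if c:
                precision, recall = c / n_p, c / n_t
                best = max(best, 2 * precision * recall / (precision + recall))
        total += n_t / n * best
    return total


# Toy imbalanced data: 8 points in class 0, 2 points in class 1.
truth = [0] * 8 + [1] * 2
# A "uniform effect" style result: the large class is split to even out sizes.
uniform = [0] * 5 + [1] * 5

for name, metric in [("accuracy", clustering_accuracy), ("F-measure", f_measure),
                     ("NMI", nmi), ("RI", rand_index)]:
    print(name, round(metric(truth, uniform), 4))
```

On this toy input, the evenly sized partition scores below 1 on all four metrics even though the minority class is kept together, which is the kind of degradation the "uniform effect" refers to. The pairwise RI loop and the permutation search are kept simple for readability and would need replacing for large data sets or many clusters.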


Source: 《工程科学学报》 (Chinese Journal of Engineering), "Nearest-neighbor-based clustering algorithm for imbalanced data"