Journal of University of Science and Technology Beijing, Vol. 36, No. 11, Nov. 2014

A density-based fuzzy adaptive clustering algorithm

WANG Ling 1,2) (corresponding author), WU Lu-lu 1,2), FU Dong-mei 1,2)
1) School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China
2) Key Laboratory of Advanced Control of Iron and Steel Process (Ministry of Education), Beijing 100083, China
Corresponding author, E-mail: linda_gh@sina.com

ABSTRACT In order to solve the problem that density clustering algorithms are sensitive to neighborhood parameters, this article introduces a density-based fuzzy adaptive clustering algorithm. Without a predefined number of clusters or neighborhood parameters, the algorithm adaptively determines the neighborhood radius from the distances between samples to obtain the density of each sample, and gradually adds cluster centers based on that density. To safeguard the correctness of the clustering result, a new validity index for fuzzy clustering is proposed to choose the best number of clusters, which eliminates the sensitivity of density clustering to its parameters. Experiments on UCI benchmark data sets show that the proposed algorithm significantly improves on the original density clustering algorithm in both accuracy and adaptability.

KEY WORDS clustering algorithms; fuzzy clustering; adaptive; density
Classification number: TP 18

Received: 2014-05-28
Funding: Fundamental Research Funds for the Central Universities (FRF-SD-12-009B); Graduate Education Development Fund of University of Science and Technology Beijing
DOI: 10.13374/j.issn1001-053x.2014.11.020; http://journals.ustb.edu.cn

Among the many clustering algorithms, density-based clustering algorithms have important applications in many fields because they can discover clusters of arbitrary shape and scale well. However, most density clustering algorithms are sensitive to their input parameters and cannot guarantee the correctness of the final clustering result; these shortcomings limit their application to some extent.

The main idea of density-based clustering is to generate clusters according to how densely the data samples are distributed. DBSCAN (density-based spatial clustering of applications with noise) [1-2] is a classic density-based clustering algorithm. Although it can obtain clusters of arbitrary shape, it requires two parameters to be set in advance: the neighborhood radius Eps (Epsilon neighborhood) and the minimum number of objects within the neighborhood, min Pts (minimum number of points). Its clustering results are also sensitive to the values of these parameters. To improve DBSCAN, Refs. [3-4] first assume a value for min Pts and then estimate Eps using a genetic algorithm and distance sorting, respectively. Although these approaches reduce the sensitivity of density clustering to parameter settings to some extent, the min Pts parameter
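
To make the role of the two DBSCAN parameters concrete, the following minimal Python sketch applies the standard DBSCAN core-point test: a sample is a core point when its Eps-neighborhood (counting the sample itself) contains at least min Pts samples. The function names, the toy data and the parameter values are assumptions chosen for illustration and do not come from the paper. The sketch covers only this test in standard DBSCAN, not the adaptive algorithm proposed in the article.

```python
import math


def eps_neighbors(points, i, eps):
    """Indices of all points within distance eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def core_points(points, eps, min_pts):
    """Indices of core points: samples whose Eps-neighborhood holds at least min_pts samples."""
    return [
        i for i in range(len(points))
        if len(eps_neighbors(points, i, eps)) >= min_pts
    ]


# Hypothetical toy data: a tight group of four points plus one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]

# Same data and same min_pts, three different radii (all values are illustrative).
core_small = core_points(points, eps=0.5, min_pts=3)  # radius too small
core_mid = core_points(points, eps=1.1, min_pts=3)    # radius matches the group
core_large = core_points(points, eps=7.5, min_pts=3)  # radius swallows everything

print(core_small, core_mid, core_large)
```

With min Pts fixed at 3, the smallest radius finds no core points, the middle radius marks exactly the four grouped samples as core points, and the largest radius marks every sample, including the distant one. This is the kind of sensitivity to Eps that the introduction describes.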
