CAAI Transactions on Intelligent Systems (智能系统学报), Vol. 15, No. 2, Mar. 2020
DOI: 10.11992/tis.201904021
Online publication address: http://kns.cnki.net/kcms/detail/23.1538.TP.20190828.1756.008.html

An autoencoder spectral clustering algorithm for improving landmark representation by weighted PageRank

CHU Derun (储德润), ZHOU Zhiping (周治平)
(Engineering Research Center of Internet of Things Technology Applications, Ministry of Education, Jiangnan University, Wuxi 214122, China)

Abstract: Traditional spectral clustering algorithms face several problems on large-scale datasets: low clustering accuracy, large storage overhead for the similarity matrix, and high computational complexity of the eigendecomposition of the Laplacian matrix. To address these problems, this study proposes an autoencoder spectral clustering algorithm that improves the landmark representation with weighted PageRank. First, the nodes with the highest weights in the data affinity graph are selected as landmark points, and the similarity matrix is approximated by the similarities between the selected landmark points and the other data points; this approximation serves as the input to a stacked autoencoder. Then, a clustering loss is used to update the parameters of the autoencoder and the cluster centers simultaneously, yielding scalable and accurate clustering. Experiments on several typical datasets show that the proposed algorithm achieves better clustering performance than landmark spectral clustering and deep spectral clustering algorithms.

Keywords: machine learning; data mining; cluster analysis; landmark spectral clustering; spectral clustering; weighted PageRank; autoencoder; clustering loss

CLC number: TP18    Document code: A    Article ID: 1673-4785(2020)02-0302-08

Citation format: CHU Derun, ZHOU Zhiping. An autoencoder spectral clustering algorithm for improving landmark representation by weighted PageRank[J]. CAAI Transactions on Intelligent Systems, 2020, 15(2): 302-309.

Clustering is one of the fundamental problems in many research fields, such as data mining and pattern recognition. The goal of cluster analysis is to partition a given dataset into compact clusters, so that data objects within a cluster are more similar to one another than to data objects in different clusters. Among clustering methods, spectral clustering can adapt to a wider range of geometries and can detect non-convex patterns and linearly non-separable clusters, without suffering from the local-optimum problem,

Received: 2019-04-09. Published online: 2019-08-29.
Corresponding author: CHU Derun. E-mail: CDR0727@163.com.
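
The landmark-selection step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the bandwidth `sigma`, the damping factor, the plain power-iteration form of weighted PageRank, and all function names are assumptions made for illustration, since this excerpt does not give the exact formulas. The sketch also builds a full n x n affinity matrix, which is only practical for small data.

```python
import numpy as np


def gaussian_affinity(X, Y, sigma):
    """Pairwise Gaussian similarities exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def weighted_pagerank(W, damping=0.85, tol=1e-10, max_iter=1000):
    """Power iteration for PageRank on a weighted graph.

    Assumption: transition probabilities are proportional to edge weights.
    The weighting scheme used in the paper is not given in this excerpt.
    Every row of W must have a positive sum.
    """
    n = W.shape[0]
    P = W / W.sum(axis=1, keepdims=True)  # row-stochastic transition matrix
    r = np.full(n, 1.0 / n)               # uniform starting scores
    for _ in range(max_iter):
        r_next = (1.0 - damping) / n + damping * (P.T @ r)
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r


def landmark_representation(X, n_landmarks, sigma=1.0):
    """Select the top-scoring nodes as landmarks and return (indices, Z).

    Z[i, j] is the row-normalized similarity between point i and landmark j;
    a matrix of this kind is what would be fed to the autoencoder.
    """
    W = gaussian_affinity(X, X, sigma)
    np.fill_diagonal(W, 0.0)              # remove self-loops
    scores = weighted_pagerank(W)
    landmark_idx = np.argsort(scores)[::-1][:n_landmarks]
    Z = gaussian_affinity(X, X[landmark_idx], sigma)
    Z /= Z.sum(axis=1, keepdims=True)     # each row sums to 1
    return landmark_idx, Z


# Toy data: two Gaussian blobs in the plane (hypothetical example data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(3.0, 0.5, (50, 2))])
idx, Z = landmark_representation(X, n_landmarks=10)
print(Z.shape)  # (100, 10)
```

Each row of `Z` describes one data point through its similarities to the 10 landmarks, so the representation has n x p entries rather than the n x n entries of the full similarity matrix. The stacked autoencoder and the clustering loss are not sketched here.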
