DOI: 10.11992/tis.202012043 Online publication address: h


CAAI Transactions on Intelligent Systems, Vol. 17 No. 2, Mar. 2022
DOI: 10.11992/tis.202012043
Online publication address: https://kns.cnki.net/kcms/detail/23.1538.TP.20210622.0900.002.html

Feature self-representation and graph regularization for robust unsupervised feature selection

CHEN Tong, CHEN Xiuhong
(School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China)

Abstract: To reveal the global structure of data while preserving its local structure, this paper unifies feature self-representation and graph regularization in a single framework and proposes a new unsupervised feature selection (UFS) model and method. The model uses feature self-representation, in which each feature is represented linearly by the remaining features, to preserve the local structure of the features. A graph regularization term based on the L2,1-norm preserves the local geometric structure of the data while reducing the effect of noisy data on feature selection. In addition, a low-rank constraint is imposed on the weight matrix to preserve the global structure of the data. Experiments on six public datasets show that the proposed algorithm clearly outperforms five comparison algorithms, which demonstrates the effectiveness of the proposed UFS framework.

Keywords: feature selection; robust; graph Laplacian; feature self-representation; low-rank constraint; unsupervised; L2,1-norm; dimension reduction

CLC number: TP181 Document code: A Article ID: 1673-4785(2022)02-0286-09

Citation format: CHEN Tong, CHEN Xiuhong. Feature self-representation and graph regularization for robust unsupervised feature selection[J]. CAAI transactions on intelligent systems, 2022, 17(2): 286–294.

With the rapid development of the Internet, large amounts of high-dimensional data have been generated. High-dimensional data usually contain a great deal of redundancy and noise, which degrades the performance of learning algorithms. To address this problem, feature selection and feature extraction have become important means of dimensionality reduction. Feature extraction reduces the dimensionality by projecting high-dimensional data onto a low-dimensional subspace, whereas feature selection directly selects a subset of the original features as the reduced feature set. This paper approaches the problem from the perspective of feature selection.

Feature selection methods include supervised feature selection[1], semi-supervised feature selection[2], and unsupervised feature selection[3-4]. Supervised and semi-supervised feature selection depend on the label information of the data, but in practical applications the cost of class labeling is too high. This calls for unsupervised feature selection methods to select the more valuable features. Common unsupervised feature selection methods fall into three categories: filter methods[3], wrapper methods[5], and embedded methods[6]. Filter methods are independent of any specific learning algorithm and are computationally efficient. Their main idea is to assign a weight to each feature, where the weight represents the importance of that feature, and then rank the features by weight, so that the features of relatively low importance

Received: 2020-12-28. Published online: 2021-06-22.
Funding: Postgraduate Research and Practice Innovation Program of Jiangsu Province (JNKY19_074).
Corresponding author: CHEN Xiuhong. E-mail: xiuhongc@jiangnan.edu.cn.
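
The filter-style procedure outlined in the introduction (assign each feature a weight, then rank the features by weight) can be illustrated with a minimal Python sketch. The variance criterion, the function name filter_select, the parameter k, and the step of keeping the k highest-ranked features are assumptions made here for illustration only; the source does not specify them, and this is not the method proposed in the paper.

```python
from statistics import pvariance


def filter_select(samples, k):
    """Rank features by a per-feature weight and keep the k highest-weighted ones.

    samples: list of rows, each row a list of numeric feature values.
    The weight is the population variance of each feature column, which is
    an illustrative (assumed) criterion, not one prescribed by the source.
    Returns (indices of the k top-ranked features, list of all weights).
    """
    columns = list(zip(*samples))  # transpose rows into feature columns
    weights = [pvariance(col) for col in columns]
    # Sort feature indices by descending weight; the sort is stable,
    # so features with equal weights keep their original index order.
    ranking = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)
    return ranking[:k], weights


# Hypothetical toy data: 4 samples, 3 features; feature 0 is constant.
toy = [
    [1.0, 10.0, 5.0],
    [1.0, 20.0, 6.0],
    [1.0, 30.0, 7.0],
    [1.0, 40.0, 8.0],
]
selected, weights = filter_select(toy, k=2)
print(selected)  # indices of the two highest-variance features
```

In this toy data the constant feature receives weight zero and is ranked last, which shows why a filter method needs no learning algorithm: the score depends only on the data matrix itself.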

