CAAI Transactions on Intelligent Systems (智能系统学报), Vol. 14 No. 1, Jan. 2019

DOI: 10.11992/tis.201806021
Online publication address: http://kns.cnki.net/kcms/detail/23.1538.TP.20180927.1309.008.html

An ensemble classification algorithm based on diversity and accuracy weighting for data streams

ZHANG Bencai¹, WANG Zhihai¹, SUN Yan'ge¹,²
(1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China; 2. School of Computer and Information Technology, Xinyang Normal University, Xinyang 464000, China)

Abstract: To overcome the effect of concept drift on data stream classification, we propose an ensemble classification method based on diversity and accuracy weighting (diversity and accuracy weighting ensemble classification algorithm, DAWE). Unlike other existing ensemble methods, DAWE considers both diversity and accuracy. A classifier's accuracy on the newest data chunk and its diversity within the ensemble are linearly weighted to measure the value of that classifier to the current ensemble, and this value measure drives the replacement strategy for base classifiers. DAWE was compared experimentally with the latest algorithms in Massive Online Analysis (MOA) on both real-world and synthetic datasets. The experiments show that the proposed method is effective: its average accuracy over all datasets is higher than that of the other algorithms, and it can effectively handle concept drift in data stream mining.

Keywords: data stream; concept drift; diversity; accuracy; ensemble learning; data chunk; value measurement; MOA

CLC number: TP391  Document code: A  Article ID: 1673-4785(2019)01-0179-07

Citation (Chinese): 张本才, 王志海, 孙艳歌. 一种多样性和精度加权的数据流集成分类算法[J]. 智能系统学报, 2019, 14(1): 179-185.
Citation (English): ZHANG Bencai, WANG Zhihai, SUN Yan'ge. An ensemble classification algorithm based on diversity and accuracy weighting for data streams[J]. CAAI Transactions on Intelligent Systems, 2019, 14(1): 179-185.

Received: 2018-06-07. Published online: 2018-09-29.
Funding: National Natural Science Foundation of China (61672086, 61702030, 61771058); Beijing Natural Science Foundation (4182052).
Corresponding author: WANG Zhihai. E-mail: zhhwang@bjtu.edu.cn.

In recent years, with the continued development of networks such as social networks and sensor networks, more and more applications are generating large volumes of data streams at very high speed. At the same time, the question of how to quickly build useful models or extract useful information from these streams has attracted many researchers.

Data stream classification is a variant of traditional supervised machine learning. Traditional supervised machine learning always targets feature