·1080· 智能系统学报 (CAAI Transactions on Intelligent Systems), Vol. 14

… distance metrics fused with SMOTE are another popular method for handling categorical imbalanced data. Properly accounting for the intrinsic characteristics of such problems and exploring effective distance metrics is therefore another active research topic.

5 Conclusion

SMOTE oversampling addresses the overfitting problem of random oversampling and is a popular data-level preprocessing technique. This paper describes the research status and working principle of SMOTE oversampling, reviews a number of improved SMOTE algorithms that target SMOTE's known problems, and outlines research on SMOTE in different application settings. Finally, it analyzes the problems that still need further study when SMOTE is applied to imbalanced big data, imbalanced streaming data, and imbalanced data with few labels. This paper may serve as a useful reference for research on and applications of SMOTE.
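The contrast drawn in the conclusion between SMOTE and random oversampling can be illustrated with a minimal sketch. Random oversampling duplicates existing minority samples, whereas SMOTE-style oversampling creates new points by interpolating between a minority sample and one of its nearest minority neighbors. The function name `smote`, its parameters, and the toy data below are illustrative assumptions, not code from the paper, and the Euclidean distance used here would need to be replaced by a suitable metric for categorical data.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points by interpolating between minority neighbors.

    Hypothetical helper for illustration only; assumes numeric features.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest neighbors at random.
        neighbor = rng.choice(others[:k])
        # Place the new point on the segment between base and neighbor.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


# Toy minority class (assumed data).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6, k=3)
```

Because each synthetic point lies on a segment between two existing minority samples, the new points stay inside the region already occupied by the minority class instead of repeating identical samples.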