No. 3    HU Xiaosheng, et al.: SVM imbalanced classification method based on weighted cluster centroids    ·265·

... significantly reducing the number of samples that actually take part in model training while achieving classification performance no lower than that of other sampling methods, which provides a new approach to classification on large-scale imbalanced datasets.

Because datasets are inherently diverse and complex, sample distributions also vary. If the underlying distributions of the positive and negative class samples can be estimated, and a clustering scheme chosen to suit each underlying distribution, the classification performance of the algorithm should improve further.

About the authors:

HU Xiaosheng, male, born in 1978, lecturer. His main research interests are machine learning, data mining, and information retrieval.

ZHONG Yong, male, born in 1970, professor, Ph.D. His main research interests are information retrieval and cloud computing.