

· 858 ·  Journal of University of Science and Technology Beijing, Vol. 29

Table 1  ASE of different test sample sets

              Clustering-based SVM
              incremental learning          SVR
              Sample set    ASE             Sample set    ASE
  Training    S1 [388]      0.009831        1000          0.09928
              S2 [148]      0.009870
              S3 [133]      0.010077
              S4 [64]       0.010149
              S5 [147]      0.010000
              S6 [120]      0.010100
  Test        50            0.010087        50            0.09775
              100           0.010030        100           0.10027
              150           0.010057        150           0.12193
              200           0.010053        200           0.15225

The bracketed numbers for S1 to S6 sum to 1000, the size of the SVR training set.

As Table 1 shows, as the number of test samples increases, the asymptotic standard error sum of squares (ASE) of the SVR-based prediction model grows steadily, from 0.09775 at 50 test samples to 0.15225 at 200, whereas the ASE of the prediction model built with clustering-based SVM incremental learning changes little, staying between 0.010030 and 0.010087. This is because the clustering-based model performs online incremental learning on the test samples, correcting itself each time a new sample is collected. Over time its prediction accuracy becomes markedly higher than that of the SVR model, which indicates better generalization ability.

4 Conclusions

This paper studied a clustering-based incremental learning algorithm for support vector machines. Each time a sample is added to the model for incremental learning, there is no need to retrain on all samples; the prediction model is rebuilt only from the differently weighted support vectors of the relevant cluster subset. The experimental results show that the algorithm outperforms SVR in both accuracy and number of support vectors. Applying the proposed algorithm to predictive modeling of the mechanical properties of steel gave very effective results. Further work is needed on questions such as how to eliminate noisy samples during clustering, so that incremental learning can play a greater role in practical applications.

A sort of support vector machine incremental learning algorithm based on clustering

WANG Ling, MU Zhichun, GUO Hui
Information Engineering School, University of Science and Technology Beijing, Beijing 100083, China

ABSTRACT A sort of incremental learning algorithm for support vector machine based on clustering was proposed. The nearest neighbor clustering algorithm was used to separate a whole training data set into several clusters, and each cluster subset was trained by support vector machine to obtain the support vector subset. A new sample was first assigned to one of the cluster subsets. Then the distances between the new sample and the support vectors of that cluster subset were calculated to weight every support vector. Finally, a new weighted model was formed with these samples. The proposed method was applied to a practical case of modeling the prediction of the mechanical properties of steel materials. Compared with the traditional support vector regression algorithm, the proposed method has the advantages of a smaller number of support vectors and better generalization capability.

KEY WORDS support vector machine; support vector regression; clustering; incremental learning
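The incremental step described in the abstract (assign the new sample to a cluster subset, then weight that subset's support vectors by their distance to the sample) can be illustrated with a small sketch. This page does not give the weighting formula, so the normalised inverse-distance weights below are an assumption, as are the nearest-centroid assignment rule, the function names, and the toy data. It is not the authors' implementation, and it leaves out the SVR training and the rebuilding of the weighted model.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_cluster(sample, centroids):
    """Index of the centroid closest to the sample (assumed assignment rule)."""
    return min(range(len(centroids)), key=lambda i: euclidean(sample, centroids[i]))


def support_vector_weights(sample, support_vectors, eps=1e-12):
    """Normalised inverse-distance weights (assumed form, not from the paper).

    Support vectors closer to the new sample receive larger weights;
    the weights sum to 1. eps avoids division by zero for a zero distance.
    """
    raw = [1.0 / (euclidean(sample, sv) + eps) for sv in support_vectors]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical toy data: two clusters, each with its own support-vector subset.
centroids = [(0.0, 0.0), (10.0, 10.0)]
cluster_svs = {
    0: [(0.0, 1.0), (1.0, 0.0), (-1.0, -1.0)],
    1: [(9.0, 10.0), (12.0, 10.0)],
}

new_sample = (9.5, 10.0)
k = nearest_cluster(new_sample, centroids)                   # pick the cluster subset
weights = support_vector_weights(new_sample, cluster_svs[k])  # weight its support vectors
print(k, weights)
```

Only the support vectors of the selected subset are touched, which reflects the property stressed in the conclusions: adding a sample does not require retraining on all samples.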

