858 Journal of University of Science and Technology Beijing (北京科技大学学报), Vol. 29

Table 1 ASE of different test sample sets

  Clustering-based SVM incremental learning      SVR
  Sample set      ASE                            Sample set      ASE
  S1[388]         0.009831
  S2[148]         0.009870
  S3[133]         0.010077                       1000            0.09928
  S4[64]          0.010149
  S5[147]         0.010000
  S6[120]         0.010100
  50              0.010087                       50              0.09775
  100             0.010030                       100             0.10027
  150             0.010057                       150             0.12193
  200             0.010053                       200             0.15225

Note: S1 to S6 are the cluster subsets of the 1000 training samples, with the subset size in brackets; the rows 50 to 200 give the number of test samples.

Table 1 shows that as the number of test samples grows from 50 to 200, the ASE (asymptotic standard sum of squared errors) of the SVR-based prediction model drifts further and further, from 0.09775 to 0.15225, whereas the ASE of the prediction model built by clustering-based SVM incremental learning barely changes (0.010030 to 0.010087). The reason is that the incremental model learns online from the test samples and corrects itself each time a new sample is collected. As time passes, its prediction accuracy becomes markedly higher than that of the SVR model, which indicates better generalization ability.

4 Conclusions

This paper studied a clustering-based incremental learning algorithm for support vector machines. As time passes, each new sample is added to the model for incremental learning without retraining on all samples; the prediction model is simply rebuilt from the differently weighted support vectors in the sample's cluster subset. Experimental results show that the algorithm outperforms SVR in both accuracy and number of support vectors. Applying the proposed algorithm to prediction modeling of the mechanical properties of steel gave very effective results. Future work should examine, among other issues, how to remove noisy samples during clustering so that incremental learning can play a greater role in practical applications.
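The incremental step summarized in the conclusion (assign the new sample to a cluster subset, then re-weight that subset's support vectors by their distance to the sample) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: this section does not give the weighting formula, so the Gaussian form, the `sigma` parameter, the use of cluster centers for assignment, and all function names are hypothetical. Training the SVR itself is left out; the support vectors are taken as given.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_cluster(sample, centers):
    """Index of the cluster whose center is closest to the new sample."""
    return min(range(len(centers)), key=lambda k: euclidean(sample, centers[k]))

def weight_support_vectors(sample, support_vectors, sigma=1.0):
    """Distance-based weights for one cluster's support vectors.

    ASSUMPTION: a Gaussian kernel of the distance, normalized to sum to 1.
    The paper's actual weighting rule is not stated in this section.
    """
    raw = [math.exp(-euclidean(sample, sv) ** 2 / (2.0 * sigma ** 2))
           for sv in support_vectors]
    total = sum(raw)
    return [w / total for w in raw]

# Toy data (hypothetical): two clusters, each with its own support vectors.
centers = [(0.0, 0.0), (5.0, 5.0)]
support_vectors = {
    0: [(0.0, 1.0), (1.0, 0.0), (-1.0, -1.0)],
    1: [(4.0, 5.0), (6.0, 5.5)],
}

new_sample = (0.5, 0.8)
k = nearest_cluster(new_sample, centers)                      # pick the cluster subset
weights = weight_support_vectors(new_sample, support_vectors[k])
```

Only the support vectors of the selected subset are touched, which is what lets the model be updated without retraining on all samples; support vectors closer to the new sample receive larger weights.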
A sort of support vector machine incremental learning algorithm based on clustering

WANG Ling, MU Zhichun, GUO Hui
Information Engineering School, University of Science and Technology Beijing, Beijing 100083, China

ABSTRACT An incremental learning algorithm for support vector machines based on clustering was proposed. The nearest neighbor clustering algorithm was used to separate the whole training data set into several clusters, and each cluster subset was trained by a support vector machine to obtain its support vector subset. A new sample was first assigned to one of the cluster subsets. The distances between the new sample and the support vectors of that subset were then calculated to weight every support vector. Finally, a new weighted model was formed from these samples. The proposed method was applied to a practical case: prediction modeling of the mechanical properties of steel materials. Compared with the traditional support vector regression algorithm, the proposed method needs fewer support vectors and generalizes better.

KEY WORDS support vector machine; support vector regression; clustering; incremental learning
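The abstract names nearest neighbor clustering as the step that splits the training set before each subset is trained. The sketch below shows one common sequential variant, offered only as an illustration: the distance threshold `radius`, the running-mean center update, and the function names are assumptions, since the exact variant used in the paper is not stated in this section.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbor_clustering(samples, radius):
    """Sequential nearest neighbor clustering (assumed variant).

    Each sample joins the cluster with the closest center if that center
    lies within `radius`; otherwise it starts a new cluster.
    """
    centers = []   # one center per cluster
    members = []   # samples assigned to each cluster
    for x in samples:
        if centers:
            k = min(range(len(centers)), key=lambda i: euclidean(x, centers[i]))
            if euclidean(x, centers[k]) <= radius:
                members[k].append(x)
                n = len(members[k])
                # Running-mean update of the cluster center.
                centers[k] = tuple(c + (xi - c) / n for c, xi in zip(centers[k], x))
                continue
        centers.append(tuple(x))
        members.append([x])
    return centers, members

# Toy data (hypothetical): two well-separated groups of points.
samples = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.1, 0.3)]
centers, members = nearest_neighbor_clustering(samples, radius=1.0)
```

Each resulting subset in `members` would then be passed to an SVR trainer to obtain that subset's support vectors; that training step is not shown here.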