Journal of Statistical Software 7

art results on most classification and regression problems (Meyer, Leisch, and Hornik 2003), SVMlight (Joachims 1999), SVMTorch (Collobert, Bengio, and Mariéthoz 2002), Royal Holloway Support Vector Machines (Gammerman, Bozanic, Schölkopf, Vovk, Vapnik, Bottou, Smola, Watkins, LeCun, Saunders, Stitson, and Weston 2001), mySVM (Rüping 2004), and M-SVM (Guermeur 2004). Many packages (such as libsvm) provide interfaces to MATLAB (The MathWorks 2005), and there are some native MATLAB toolboxes as well, such as the SVM and Kernel Methods Matlab Toolbox (Canu, Grandvalet, and Rakotomamonjy 2003), the MATLAB Support Vector Machine Toolbox (Gunn 1998), and the SVM toolbox for Matlab (Schwaighofer 2005).

2.6. R software overview

The first implementation of SVM in R (R Development Core Team 2005) was introduced in the e1071 (Dimitriadou, Hornik, Leisch, Meyer, and Weingessel 2005) package. The svm() function in e1071 provides a rigid interface to libsvm along with visualization and parameter tuning methods.

Package kernlab features a variety of kernel-based methods and includes an SVM method based on the optimizers used in libsvm and bsvm (Hsu and Lin 2002c). It aims to provide a flexible and extensible SVM implementation.

Package klaR (Roever, Raabe, Luebke, and Ligges 2005) includes an interface to SVMlight, a popular SVM implementation, and additionally offers classification tools such as Regularized Discriminant Analysis.

Finally, package svmpath (Hastie 2004) provides an algorithm that fits the entire path of the SVM solution (i.e., for any value of the cost parameter).

In the remainder of the paper we will extensively review and compare these four SVM implementations.

3. Data

Throughout the paper, we will use the following data sets accessible through R (see Table 1), most of them originating from the UCI machine learning database (Blake and Merz 1998):

iris: This famous (Fisher's or Anderson's) iris data set gives the measurements in centimeters of the variables sepal length and width and petal length and width, respectively, for 50 flowers from each of 3 species of iris. The species are Iris setosa, versicolor, and virginica. The data set is provided by base R.

spam: A data set collected at Hewlett-Packard Labs which classifies 4601 e-mails as spam or non-spam. In addition to this class label there are 57 variables indicating the frequency of certain words and characters in the e-mail. The data set is provided by the kernlab package.

musk: This data set in package kernlab describes a set of 476 molecules, of which 207 are judged by human experts to be musks and the remaining 269 are judged to be non-musks. The data set has 167 variables which describe the geometry of the molecules.
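
As a minimal sketch, the three data sets described above can be loaded and summarised in R as follows. The helper describe() is a hypothetical name introduced here for illustration only, and the sketch assumes that the class label is stored in the last column of each data frame. The kernlab data sets are skipped if the package is not installed.

```r
# Hypothetical helper (not part of any package): report the number of rows,
# the number of columns, and the class distribution of a data frame.
# Assumption: the class label is the last column unless stated otherwise.
describe <- function(df, label = names(df)[ncol(df)]) {
  list(n = nrow(df), p = ncol(df), classes = table(df[[label]]))
}

# iris ships with base R (package datasets); its label column is "Species".
data(iris)
iris_info <- describe(iris, "Species")
print(iris_info)

# spam and musk ship with kernlab; only load them if kernlab is available.
if (requireNamespace("kernlab", quietly = TRUE)) {
  data(spam, package = "kernlab")
  data(musk, package = "kernlab")
  spam_info <- describe(spam)  # assumes the label is the last column
  musk_info <- describe(musk)  # assumes the label is the last column
  print(spam_info)
  print(musk_info)
}
```

Printing the summaries is a quick way to check the row counts and class sizes quoted in the descriptions before fitting any model.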