Support Matrix Machines

Luo Luo RICKY@SJTU.EDU.CN
Yubo Xie YUBOTSE@SJTU.EDU.CN
Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China

Zhihua Zhang ZHIHUA@SJTU.EDU.CN
Institute of Data Science, Department of Computer Science and Engineering, Shanghai Jiao Tong University, China

Wu-Jun Li LIWUJUN@NJU.EDU.CN
National Key Laboratory for Novel Software Technology, Collaborative Innovation Center of Novel Software Technology and Industrialization, Department of Computer Science and Technology, Nanjing University, China

Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 2015. JMLR: W&CP volume 37. Copyright 2015 by the author(s).

Abstract

In many classification problems such as electroencephalogram (EEG) classification and image classification, the input features are naturally represented as matrices rather than vectors or scalars. In general, the structure information of the original feature matrix is useful and informative for data analysis tasks such as classification. One typical kind of structure information is the correlation between columns or rows in the feature matrix. To leverage this kind of structure information, we propose a new classification method that we call the support matrix machine (SMM). Specifically, SMM is defined as a hinge loss plus a so-called spectral elastic net penalty, which is a spectral extension of the conventional elastic net over a matrix. The spectral elastic net enjoys a grouping-effect property, i.e., strongly correlated columns or rows tend to be selected or discarded together. Since the optimization problem for SMM is convex, this encourages us to devise an alternating direction method of multipliers (ADMM) algorithm for solving the problem. Experimental results on EEG and image classification data show that our model is more robust and efficient than the state-of-the-art methods.

1. Introduction

Classical classification methods such as support vector machines (SVMs) (Cortes & Vapnik, 1995) and logistic regression (Hastie et al., 2001) were originally built for the case that input samples are represented as vectors or scalars. However, it is also common that input samples are naturally represented as two-dimensional matrices or tensors. When applying classical classification methods to data with a matrix representation, we have to reshape the input matrices into vectors. However, this would destroy the structure information of the data matrix, e.g., the correlation between different channels of electroencephalogram (EEG) data (Zhou & Li, 2014) and the spatial relationship of nearby pixels in image data (Wolf et al., 2007). Moreover, if the data matrix is stacked (reshaped) into a vector, the dimensionality of the resulting vector typically becomes very high, which in turn leads to the curse of dimensionality.

There has been some work on classification methods which attempt to exploit the correlation between the columns or rows of the data matrix. Usually, such a classification method introduces a matrix of regression coefficients to leverage the correlation within the data matrix. For example, Wolf et al. (2007) proposed a rank-k SVM, which models the regression matrix as the sum of k rank-one orthogonal matrices. Pirsiavash et al. (2009) devised a bilinear SVM by factorizing the regression matrix into two low-rank matrices. Cai et al. (2006) proposed a similar bilinear framework called support tensor machines for text categorization. These methods essentially take advantage of the low-rank assumption, which can be used for describing the correlation within a matrix. However, their treatments result in non-convex optimization problems.

In this paper we are also concerned with classification problems on a set of data matrices. Our work is motivated by the use of the nuclear norm (a.k.a. the trace norm) in low-rank matrix approximation (Srebro & Shraibman, 2005), matrix completion (Candès & Recht, 2009; Liu et al., 2013; Salakhutdinov & Srebro, 2010; Huang et al., 2013), and multi-task learning problems (Pong et al., 2010;
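To make the formulation in the abstract concrete, the following is a sketch of the kind of objective it describes; the notation (W, b, τ, C, and the data pairs) is our assumption for illustration, not quoted from this excerpt. The conventional elastic net combines an ℓ1 and a squared ℓ2 penalty on a coefficient vector; its spectral analogue applies the same combination to the singular values of the coefficient matrix W, i.e., the nuclear norm ‖W‖_* (ℓ1 on singular values) plus the squared Frobenius norm ‖W‖_F² (ℓ2 on singular values), coupled with the hinge loss:

```latex
% Sketch of an SMM-style objective (notation assumed, not quoted from the excerpt).
% Training pairs (X_i, y_i), i = 1, ..., n, where X_i is a p x q feature matrix
% and y_i \in \{-1, +1\}. \|W\|_F is the Frobenius norm and
% \|W\|_* = \sum_j \sigma_j(W) is the nuclear norm.
\min_{W,\,b}\;
  \underbrace{\tfrac{1}{2}\lVert W\rVert_F^{2} + \tau\lVert W\rVert_{*}}_{\text{spectral elastic net}}
  \;+\; C \sum_{i=1}^{n}
  \Bigl\{\, 1 - y_i\bigl[\operatorname{tr}(W^{\top} X_i) + b\bigr] \,\Bigr\}_{+}
```

Since the hinge loss and both penalty terms are convex in (W, b), any objective of this shape is convex, which is consistent with the abstract's remark that convexity is what enables an ADMM solver.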