Support Vector Machines
Andrew W. Moore
Professor, School of Computer Science, Carnegie Mellon University
www.cs.cmu.edu/~awm
awm@cs.cmu.edu
412-268-7599
Nov 23rd, 2001
Copyright © 2001, 2003, Andrew W. Moore
Note to other teachers and users of these slides: Andrew would be delighted if you found this source material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit your own needs. PowerPoint originals are available. If you make use of a significant portion of these slides in your own lecture, please include this message, or the following link to the source repository of Andrew's tutorials: http://www.cs.cmu.edu/~awm/tutorials . Comments and corrections gratefully received.
Slides Modified for Comp537, Spring, 2006, HKUST
History
• SVM is a classifier derived from statistical learning theory by Vapnik and Chervonenkis
• SVMs introduced by Boser, Guyon, and Vapnik in COLT-92
• Initially popularized in the NIPS community, now an important and active field of Machine Learning research
• Special issues of Machine Learning Journal, and Journal of Machine Learning Research
Roadmap
• Hard-Margin Linear Classifier
  • Maximize Margin
  • Support Vector
  • Quadratic Programming
• Soft-Margin Linear Classifier
  • Maximize Margin
  • Support Vector
  • Quadratic Programming
• Non-Linearly Separable Problem
  • XOR
  • Transform to Non-Linear by Kernels
• Reference
Linear Classifiers
f(x, w, b) = sign(w · x − b)
[Figure: scatter plot of datapoints, labeled +1 and −1]
How would you classify this data?
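The slide's decision rule can be sketched directly in NumPy. The weight vector and bias below are illustrative values chosen for a 2-D example, not taken from the slides:

```python
import numpy as np

def linear_classify(x, w, b):
    """The slide's decision rule: f(x, w, b) = sign(w . x - b)."""
    return 1 if np.dot(w, x) - b >= 0 else -1

# Hypothetical weights and bias for a 2-D example (illustrative only).
w = np.array([1.0, 1.0])
b = 1.0
print(linear_classify(np.array([2.0, 2.0]), w, b))    # a point on the +1 side
print(linear_classify(np.array([-1.0, -1.0]), w, b))  # a point on the -1 side
```

Every choice of w and b defines a different separating line, which is exactly why the next slides ask which of the many valid lines is best.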
Linear Classifiers
f(x, w, b) = sign(w · x − b)
[Figure: the same datapoints with several candidate separating lines]
Any of these would be fine... but which is best?
Classifier Margin
f(x, w, b) = sign(w · x − b)
[Figure: a separating line with its margin band]
Define the margin of a linear classifier as the width that the boundary could be increased by before hitting a datapoint.
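That width can be computed for a given boundary: it is twice the distance from the hyperplane to the nearest datapoint. A minimal sketch, with hypothetical datapoints and boundary chosen for illustration:

```python
import numpy as np

def margin_width(X, w, b):
    """Width the boundary w . x - b = 0 could be grown before hitting a
    datapoint: twice the distance from the hyperplane to the nearest point."""
    distances = np.abs(X @ w - b) / np.linalg.norm(w)
    return 2.0 * distances.min()

# Hypothetical datapoints and boundary (illustrative only).
X = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0]])
w = np.array([1.0, 1.0])
b = 0.0
print(margin_width(X, w, b))  # 2*sqrt(2), set by the nearest point (-1, -1)
```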
Maximum Margin
f(x, w, b) = sign(w · x − b)
[Figure: the maximum-margin separating line]
The maximum margin linear classifier is the linear classifier with the, um, maximum margin.
This is the simplest kind of SVM, called an LSVM (Linear SVM).
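One simple way to approximate the maximum-margin classifier is subgradient descent on the SVM's hinge-loss objective. This is an illustrative sketch under that assumption, not the quadratic-programming formulation the roadmap says the slides go on to develop; the toy data is hypothetical:

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=2000):
    """Subgradient descent on the soft-margin SVM primal:
    minimize lam/2 * ||w||^2 + mean(max(0, 1 - y_i * (w . x_i - b))).
    With small lam and separable data this approximates the
    maximum-margin linear classifier."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        viol = y * (X @ w - b) < 1            # points inside or past the margin
        grad_w = lam * w - (y[viol, None] * X[viol]).sum(axis=0) / n
        grad_b = y[viol].sum() / n            # d/db of the hinge term is +y_i
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical linearly separable toy data (illustrative, not from the slides).
X_demo = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
y_demo = np.array([1.0, 1.0, -1.0, -1.0])
w, b = train_linear_svm(X_demo, y_demo)
print("learned w:", w, "b:", b)
```

The regularizer lam/2 * ||w||^2 is what pushes the solution toward the widest margin: among all separating hyperplanes with functional margin at least 1, the one with the smallest ||w|| has the largest geometric margin.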