
Download: "Machine Learning" course teaching resources (books and literature), SVM Tutorial


Zoya Gavrilov

The reason for this labelling scheme is that it lets us condense the formulation for the decision function to $f(x) = \operatorname{sign}(w^T x + b)$, since $f(x) = +1$ for all $x$ above the boundary and $f(x) = -1$ for all $x$ below the boundary. Thus, we can figure out if an instance has been classified properly by checking that $y(w^T x + b) > 0$, which holds exactly when $y$ and $w^T x + b$ have the same sign (both positive or both negative). Once the data are rescaled to leave a margin, as described next, this condition tightens to $y(w^T x + b) \geq 1$.
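The decision rule and the correctness check can be sketched in a few lines of plain Python. This is only an illustration: the weight vector, bias, and sample points are made-up values, the function names are hypothetical, and treating a score of exactly zero as class +1 is an assumption, since the text does not say how points on the boundary are labelled.

```python
def decision_value(w, x, b):
    """Return the raw score w^T x + b for one instance x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, x, b):
    """Decision function f(x) = sign(w^T x + b), returned as +1 or -1.

    Assumption: a score of exactly 0 (a point on the boundary) maps to +1.
    """
    return 1 if decision_value(w, x, b) >= 0 else -1


def is_correct(w, x, b, y):
    """True when the label y and the score w^T x + b share a sign."""
    return y * decision_value(w, x, b) > 0


# Hypothetical 2-D classifier and labelled samples, for illustration only.
w = [1.0, -1.0]
b = 0.5
samples = [
    ([2.0, 0.0], 1),   # score  2.5, label +1: classified correctly
    ([0.0, 2.0], -1),  # score -1.5, label -1: classified correctly
    ([0.0, 1.0], 1),   # score -0.5, label +1: misclassified
]

for x, y in samples:
    print(x, y, predict(w, x, b), is_correct(w, x, b, y))
```

Because the labels are ±1, a single product $y(w^T x + b)$ covers both classes, so no branch on the class is needed.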
You'll notice that we will now have some space between our decision boundary and the nearest data points of either class. Thus, let's rescale the data such that anything on or above the boundary $w^T x + b = 1$ is of one class (with label 1), and anything on or below the boundary $w^T x + b = -1$ is of the other class (with label −1).

What is the distance between these newly added boundaries? First note that the two lines are parallel, and thus share their parameters $w, b$. Pick an arbitrary point $x_1$ to lie on the line $w^T x + b = -1$. Then the closest point on the line $w^T x + b = 1$ to $x_1$ is the point $x_2 = x_1 + \lambda w$ (since the closest point will always lie on the perpendicular; recall that the vector $w$ is perpendicular to both lines). Using this formulation, $\lambda w$ will be the line segment connecting $x_1$ and $x_2$, and thus $\lambda \|w\|$, the distance between $x_1$ and $x_2$, is the shortest distance between the two lines/boundaries. Solving for $\lambda$:

$$
\begin{aligned}
&\Rightarrow\; w^T x_2 + b = 1 \quad \text{where } x_2 = x_1 + \lambda w \\
&\Rightarrow\; w^T (x_1 + \lambda w) + b = 1 \\
&\Rightarrow\; w^T x_1 + b + \lambda w^T w = 1 \quad \text{where } w^T x_1 + b = -1 \\
&\Rightarrow\; -1 + \lambda w^T w = 1 \\
&\Rightarrow\; \lambda w^T w = 2 \\
&\Rightarrow\; \lambda = \frac{2}{w^T w} = \frac{2}{\|w\|^2}
\end{aligned}
$$

And so, the distance $\lambda \|w\|$ is $\frac{2}{\|w\|^2} \|w\| = \frac{2}{\|w\|} = \frac{2}{\sqrt{w^T w}}$.

It's intuitive that we would want to maximize the distance between the two

