To construct this Optimal Hyperplane (cf. figure 1.1), one solves the following optimization problem:

    minimize    \tau(w) = \frac{1}{2} \|w\|^2    (1.9)

    subject to    y_i \cdot ((w \cdot x_i) + b) \ge 1, \quad i = 1, \ldots, \ell.    (1.10)

This constrained optimization problem is dealt with by introducing Lagrange multipliers α_i ≥ 0 and a Lagrangian

    L(w, b, \alpha) = \frac{1}{2} \|w\|^2 - \sum_{i=1}^{\ell} \alpha_i \left( y_i \cdot ((x_i \cdot w) + b) - 1 \right).    (1.11)

The Lagrangian L has to be minimized with respect to the primal variables w and b and maximized with respect to the dual variables α_i (i.e. a saddle point has to be found). Let us try to get some intuition for this. If a constraint (1.10) is violated, then y_i · ((w · x_i) + b) − 1 < 0, in which case L can be increased by increasing the corresponding α_i. At the same time, w and b will have to change such that L decreases. To prevent −α_i (y_i · ((w · x_i) + b) − 1) from becoming arbitrarily large, the change in w and b will ensure that, provided the problem is separable, the constraint will eventually be satisfied. Similarly, one can understand that for all constraints which are not precisely met as equalities, i.e. for which y_i · ((w · x_i) + b) − 1 > 0, the corresponding α_i must be 0: this is the value of α_i that maximizes L. The latter is the statement of the Karush-Kuhn-Tucker complementarity conditions of optimization theory (Karush, 1939; Kuhn and Tucker, 1951; Bertsekas, 1995).

The condition that at the saddle point the derivatives of L with respect to the primal variables must vanish,

    \frac{\partial}{\partial b} L(w, b, \alpha) = 0, \qquad \frac{\partial}{\partial w} L(w, b, \alpha) = 0,    (1.12)

leads to

    \sum_{i=1}^{\ell} \alpha_i y_i = 0    (1.13)

and

    w = \sum_{i=1}^{\ell} \alpha_i y_i x_i.    (1.14)

The solution vector thus has an expansion in terms of a subset of the training patterns, namely those patterns whose α_i is non-zero, called Support Vectors. By the Karush-Kuhn-Tucker complementarity conditions

    \alpha_i \cdot \left( y_i \cdot ((x_i \cdot w) + b) - 1 \right) = 0, \quad i = 1, \ldots, \ell,    (1.15)

the Support Vectors lie on the margin (cf. figure 1.1). All remaining examples of the training set are irrelevant: their constraint (1.10) does not play a role in the optimization, and they do not appear in the expansion (1.14). This nicely captures our intuition of the problem: since the hyperplane is determined entirely by the patterns closest to it, the solution should not depend on the other examples.
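To make the primal problem (1.9)-(1.10) concrete, here is a minimal numerical sketch (not part of the original text). It assumes NumPy and SciPy are available; the toy data, the packing of (w, b) into a single vector theta, and names such as margin_constraints are illustrative choices, and a general-purpose constrained solver stands in for a dedicated quadratic-programming routine.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
# Two linearly separable clusters, labels y_i in {-1, +1}.
X = np.vstack([rng.normal((-2.0, -2.0), 0.5, (20, 2)),
               rng.normal((+2.0, +2.0), 0.5, (20, 2))])
y = np.array([-1.0] * 20 + [+1.0] * 20)

def objective(theta):
    # tau(w) = 1/2 ||w||^2, eq. (1.9); theta packs (w_1, w_2, b).
    w = theta[:2]
    return 0.5 * np.dot(w, w)

def margin_constraints(theta):
    # y_i ((w . x_i) + b) - 1 >= 0 for every i, eq. (1.10).
    w, b = theta[:2], theta[2]
    return y * (X @ w + b) - 1.0

result = minimize(objective,
                  x0=np.zeros(3),
                  method="SLSQP",
                  constraints=[{"type": "ineq", "fun": margin_constraints}])

w_opt, b_opt = result.x[:2], result.x[2]
print("w =", w_opt, "b =", b_opt)
# At the optimum every functional margin is >= 1 (up to solver tolerance).
print("smallest margin:", (margin_constraints(result.x) + 1.0).min())
```

The solver only returns the primal pair (w, b); the role of the multipliers α_i and the saddle-point view is illustrated separately below.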
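The consequences of the saddle-point conditions, namely the constraint (1.13), the expansion (1.14), and the complementarity conditions (1.15), can also be checked numerically with an off-the-shelf solver. The sketch below (again not part of the original text) assumes scikit-learn and NumPy; in scikit-learn's SVC the attribute dual_coef_ stores the products α_i y_i for the support vectors, and a very large C approximates the hard-margin problem.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal((-2.0, -2.0), 0.5, (20, 2)),
               rng.normal((+2.0, +2.0), 0.5, (20, 2))])
y = np.array([-1] * 20 + [+1] * 20)

# Large C ~ hard-margin Optimal Hyperplane of (1.9)-(1.10).
svm = SVC(kernel="linear", C=1e6).fit(X, y)

w = svm.coef_.ravel()              # primal solution vector
b = svm.intercept_[0]
alpha_y = svm.dual_coef_.ravel()   # alpha_i * y_i, support vectors only
sv = svm.support_vectors_

# Constraint (1.13): sum_i alpha_i y_i = 0.
print(np.isclose(alpha_y.sum(), 0.0))

# Expansion (1.14): w = sum_i alpha_i y_i x_i; only support vectors contribute.
print(np.allclose(w, alpha_y @ sv))

# Complementarity (1.15): support vectors lie on the margin,
# i.e. y_i ((w . x_i) + b) is approximately 1 whenever alpha_i > 0.
margins = y[svm.support_] * (sv @ w + b)
print(np.allclose(margins, 1.0, atol=1e-2))
```

Patterns that are not support vectors do not appear in alpha_y or support_vectors_ at all, mirroring the remark above that they drop out of the expansion (1.14).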