Zoya Gavrilov

$$\min_{w,b,\xi} \; \frac{w^T w}{2} + C \sum_i \xi_i$$

subject to: $y_i(w^T \phi(x_i) + b) \ge 1 - \xi_i$ and $\xi_i \ge 0$ ($\forall$ data points $x_i$).

4 Reformulating as a Lagrangian

We can introduce Lagrange multipliers to represent the condition: $y_i(w^T \phi(x_i) + b)$ must be at least 1. This condition is captured by:

$$\max_{\alpha_i \ge 0} \; \alpha_i \left[ 1 - y_i(w^T \phi(x_i) + b) \right]$$

This ensures that when $y_i(w^T \phi(x_i) + b) \ge 1$, the expression above is maximal when $\alpha_i = 0$ (since $[1 - y_i(w^T \phi(x_i) + b)]$ ends up being non-positive). Otherwise, $y_i(w^T \phi(x_i) + b) < 1$, so $[1 - y_i(w^T \phi(x_i) + b)]$ is a positive value, and the expression is maximal when $\alpha_i \to \infty$. This has the effect of penalizing any misclassified data points, while assigning 0 penalty to properly classified instances. We thus have the following formulation:

$$\min_{w,b} \left[ \frac{w^T w}{2} + \sum_i \max_{\alpha_i \ge 0} \alpha_i \left[ 1 - y_i(w^T \phi(x_i) + b) \right] \right]$$

To allow for slack (soft-margin), preventing the $\alpha$ variables from going to $\infty$, we can impose constraints on the Lagrange multipliers to lie within: $0 \le \alpha_i \le C$.

We can define the dual problem by interchanging the max and min as follows (i.e. we minimize after fixing $\alpha$):

$$\max_{\alpha \ge 0} \left[ \min_{w,b} J(w, b; \alpha) \right] \quad \text{where} \quad J(w, b; \alpha) = \frac{w^T w}{2} + \sum_i \alpha_i \left[ 1 - y_i(w^T \phi(x_i) + b) \right]$$

To carry out the inner minimization, we set $\frac{\partial J}{\partial w} = 0$ and discover that the optimal setting of $w$ is $\sum_i \alpha_i y_i \phi(x_i)$, while setting $\frac{\partial J}{\partial b} = 0$ yields the constraint $\sum_i \alpha_i y_i = 0$.

Substituting this $w$ back into $J$, the term $\sum_i \alpha_i y_i w^T \phi(x_i)$ equals $w^T w$, and the term in $b$ vanishes because $\sum_i \alpha_i y_i = 0$. Thus, after simplifying, we get:

$$\min_{w,b} J(w, b; \alpha) = \sum_i \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \phi(x_i)^T \phi(x_j)$$

And thus our dual is:

$$\max_{\alpha \ge 0} \left[ \sum_i \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \phi(x_i)^T \phi(x_j) \right]$$

Subject to: $\sum_i \alpha_i y_i = 0$ and $0 \le \alpha_i \le C$
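The substitution step can be checked numerically. The sketch below is not part of the original text: it assumes the identity feature map $\phi(x) = x$, a small made-up data set, and hand-picked multipliers that satisfy $\sum_i \alpha_i y_i = 0$. The function names `lagrangian` and `dual_objective` are illustrative. It evaluates $J(w, b; \alpha)$ at $w = \sum_i \alpha_i y_i \phi(x_i)$ and compares the result with the dual objective.

```python
import numpy as np

def lagrangian(w, b, alpha, X, y):
    """J(w, b; alpha) = w.w/2 + sum_i alpha_i * [1 - y_i (w.x_i + b)]."""
    margins = y * (X @ w + b)
    return 0.5 * (w @ w) + np.sum(alpha * (1.0 - margins))

def dual_objective(alpha, X, y):
    """sum_i alpha_i - 1/2 sum_{i,j} alpha_i alpha_j y_i y_j x_i.x_j."""
    K = X @ X.T          # Gram matrix for the assumed identity feature map
    ay = alpha * y
    return np.sum(alpha) - 0.5 * (ay @ K @ ay)

# Made-up toy data (assumption): two points per class in 2-D.
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-3.0, 0.5]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# Hand-picked multipliers chosen so that sum_i alpha_i y_i = 0.
alpha = np.array([0.3, 0.2, 0.4, 0.1])

# Stationary point in w obtained from dJ/dw = 0.
w_star = (alpha * y) @ X

# With sum_i alpha_i y_i = 0 the value of b does not affect J.
j_value = lagrangian(w_star, b=0.7, alpha=alpha, X=X, y=y)
d_value = dual_objective(alpha, X, y)

# Moving w away from w_star should only increase J (J is convex in w).
j_perturbed = lagrangian(w_star + np.array([0.1, -0.2]), 0.7, alpha, X, y)

print(j_value, d_value, j_perturbed)
```

The first two printed values should agree up to floating-point error, and the third should be larger, which matches the claim that $w = \sum_i \alpha_i y_i \phi(x_i)$ minimizes $J$ for fixed $\alpha$. Solving the dual itself (maximizing over $\alpha$ under the box and equality constraints) would need a quadratic programming solver and is not attempted here.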