vector x.

In any event, SMO will work even when η is not positive, in which case the objective function Ψ should be evaluated at each end of the line segment:

\[
\begin{aligned}
f_1 &= y_1(E_1 + b) - \alpha_1 K(\vec{x}_1, \vec{x}_1) - s\alpha_2 K(\vec{x}_1, \vec{x}_2), \\
f_2 &= y_2(E_2 + b) - s\alpha_1 K(\vec{x}_1, \vec{x}_2) - \alpha_2 K(\vec{x}_2, \vec{x}_2), \\
L_1 &= \alpha_1 + s(\alpha_2 - L), \\
H_1 &= \alpha_1 + s(\alpha_2 - H), \\
\Psi_L &= L_1 f_1 + L f_2 + \tfrac{1}{2} L_1^2 K(\vec{x}_1, \vec{x}_1) + \tfrac{1}{2} L^2 K(\vec{x}_2, \vec{x}_2) + s L L_1 K(\vec{x}_1, \vec{x}_2), \\
\Psi_H &= H_1 f_1 + H f_2 + \tfrac{1}{2} H_1^2 K(\vec{x}_1, \vec{x}_1) + \tfrac{1}{2} H^2 K(\vec{x}_2, \vec{x}_2) + s H H_1 K(\vec{x}_1, \vec{x}_2).
\end{aligned}
\tag{19}
\]

SMO will move the Lagrange multipliers to the end point that has the lowest value of the objective function. If the objective function is the same at both ends (within a small ε for round-off error) and the kernel obeys Mercer's conditions, then the joint minimization cannot make progress. That scenario is described below.
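As a minimal sketch of this endpoint evaluation, the following code computes Ψ_L and Ψ_H from equation (19) and moves to the endpoint with the lower objective. All names here (k11, k12, k22 for the three kernel values, e1/e2 for the errors, a1/a2 for the current multipliers, lo/hi for the clipping bounds, and the tie tolerance eps) are illustrative assumptions, not the paper's own pseudocode.

```python
def objective_at_endpoints(k11, k12, k22, y1, y2, e1, e2, b, a1, a2, lo, hi):
    """Evaluate equation (19): the objective at the two clipped endpoints of the line segment."""
    s = y1 * y2
    f1 = y1 * (e1 + b) - a1 * k11 - s * a2 * k12
    f2 = y2 * (e2 + b) - s * a1 * k12 - a2 * k22
    l1 = a1 + s * (a2 - lo)   # value of alpha1 when alpha2 is clipped to lo
    h1 = a1 + s * (a2 - hi)   # value of alpha1 when alpha2 is clipped to hi
    psi_lo = l1 * f1 + lo * f2 + 0.5 * l1**2 * k11 + 0.5 * lo**2 * k22 + s * lo * l1 * k12
    psi_hi = h1 * f1 + hi * f2 + 0.5 * h1**2 * k11 + 0.5 * hi**2 * k22 + s * hi * h1 * k12
    return psi_lo, psi_hi


def choose_endpoint(psi_lo, psi_hi, lo, hi, eps=1e-3):
    """Move alpha2 to the endpoint with the lower objective; a tie (within eps) means no progress."""
    if psi_lo < psi_hi - eps:
        return lo
    if psi_hi < psi_lo - eps:
        return hi
    return None  # objective equal at both ends within round-off: this pair cannot make progress
```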
2.2 Heuristics for Choosing Which Multipliers To Optimize

As long as SMO always optimizes and alters two Lagrange multipliers at every step and at least one of the Lagrange multipliers violated the KKT conditions before the step, then each step will decrease the objective function according to Osuna's theorem [16]. Convergence is thus guaranteed. In order to speed convergence, SMO uses heuristics to choose which two Lagrange multipliers to jointly optimize.

There are two separate choice heuristics: one for the first Lagrange multiplier and one for the second. The choice of the first heuristic provides the outer loop of the SMO algorithm. The outer loop first iterates over the entire training set, determining whether each example violates the KKT conditions (12). If an example violates the KKT conditions, it is then eligible for optimization. After one pass through the entire training set, the outer loop iterates over all examples whose Lagrange multipliers are neither 0 nor C (the non-bound examples). Again, each example is checked against the KKT conditions and violating examples are eligible for optimization. The outer loop makes repeated passes over the non-bound examples until all of the non-bound examples obey the KKT conditions within ε. The outer loop then goes back and iterates over the entire training set. The outer loop keeps alternating between single passes over the entire training set and multiple passes over the non-bound subset until the entire training set obeys the KKT conditions within ε, whereupon the algorithm terminates.

The first choice heuristic concentrates the CPU time on the examples that are most likely to violate the KKT conditions: the non-bound subset. As the SMO algorithm progresses, examples that are at the bounds are likely to stay at the bounds, while examples that are not at the bounds will move as other examples are optimized. The SMO algorithm will thus iterate over the non-bound subset until that subset is self-consistent, then SMO will scan the entire data set to search for any bound examples that have become KKT violated due to optimizing the non-bound subset.

Notice that the KKT conditions are checked to be within ε of fulfillment. Typically, ε is set to be 10⁻³. Recognition systems typically do not need to have the KKT conditions fulfilled to high accuracy: it is acceptable for examples on the positive margin to have outputs between 0.999 and 1.001. The SMO algorithm (and other SVM algorithms) will not converge as quickly if it is required to produce very high accuracy output.
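The outer-loop alternation described above can be summarized in code. The sketch below assumes a helper examine_example(i) that chooses the second multiplier and attempts the joint optimization for example i, returning 1 if a pair of multipliers changed and 0 otherwise; alpha, C, and n_samples are likewise illustrative names rather than the paper's pseudocode.

```python
def outer_loop(alpha, C, n_samples, examine_example):
    """First-choice heuristic: alternate full passes with passes over the non-bound subset."""
    num_changed = 0
    examine_all = True                      # begin with one pass over the entire training set
    while num_changed > 0 or examine_all:
        num_changed = 0
        if examine_all:
            candidates = range(n_samples)   # single pass over every example
        else:
            # repeated passes over the non-bound examples (0 < alpha_i < C)
            candidates = [i for i in range(n_samples) if 0 < alpha[i] < C]
        for i in candidates:
            num_changed += examine_example(i)
        if examine_all:
            examine_all = False             # switch to passes over the non-bound subset
        elif num_changed == 0:
            examine_all = True              # non-bound subset is self-consistent: rescan everything
    # the loop exits after a full pass finds no example violating the KKT conditions within epsilon
```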