where $x_i$ is the $i$th training example, and $y_i$ is the correct output of the SVM for the $i$th training example. The value $y_i$ is $+1$ for the positive examples in a class and $-1$ for the negative examples. Using a Lagrangian, this optimization problem can be converted into a dual form which is a QP problem where the objective function $\Psi$ is solely dependent on a set of Lagrange multipliers $\alpha_i$,

$$
\min_{\vec{\alpha}} \Psi(\vec{\alpha}) = \min_{\vec{\alpha}} \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} y_i y_j \left( \vec{x}_i \cdot \vec{x}_j \right) \alpha_i \alpha_j - \sum_{i=1}^{N} \alpha_i \qquad (4)
$$

(where $N$ is the number of training examples), subject to the inequality constraints,

$$
\alpha_i \ge 0, \quad \forall i, \qquad (5)
$$

and one linear equality constraint,

$$
\sum_{i=1}^{N} y_i \alpha_i = 0. \qquad (6)
$$

There is a one-to-one relationship between each Lagrange multiplier and each training example. Once the Lagrange multipliers are determined, the normal vector $\vec{w}$ and the threshold $b$ can be derived from the Lagrange multipliers:

$$
\vec{w} = \sum_{i=1}^{N} y_i \alpha_i \vec{x}_i, \qquad b = \vec{w} \cdot \vec{x}_k - y_k \ \text{ for some } \alpha_k > 0. \qquad (7)
$$

Because $\vec{w}$ can be computed via equation (7) from the training data before use, the amount of computation required to evaluate a linear SVM is constant in the number of non-zero support vectors.

Of course, not all data sets are linearly separable. There may be no hyperplane that splits the positive examples from the negative examples. In the formulation above, the non-separable case would correspond to an infinite solution. However, in 1995, Cortes & Vapnik [7] suggested a modification to the original optimization statement (3) which allows, but penalizes, the failure of an example to reach the correct margin. That modification is:

$$
\min_{\vec{w}, b, \vec{\xi}} \frac{1}{2} \|\vec{w}\|^2 + C \sum_{i=1}^{N} \xi_i, \quad \text{subject to } y_i \left( \vec{w} \cdot \vec{x}_i - b \right) \ge 1 - \xi_i, \ \forall i, \qquad (8)
$$

where the $\xi_i$ are slack variables that permit margin failure and $C$ is a parameter which trades off a wide margin against a small number of margin failures. When this new optimization problem is transformed into the dual form, it simply changes the constraint (5) into a box constraint:

$$
0 \le \alpha_i \le C, \quad \forall i. \qquad (9)
$$

The variables $\xi_i$ do not appear in the dual formulation at all.

SVMs can be even further generalized to non-linear classifiers [2]. The output of a non-linear SVM is explicitly computed from the Lagrange multipliers:
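The recovery of $\vec{w}$ and $b$ from the dual solution via equation (7) can be illustrated on a tiny problem solved by hand. The sketch below is an illustration, not part of the original text: the two-point data set and the multiplier value $\alpha_i = 1/2$ are assumptions (the equality constraint (6) forces $\alpha_1 = \alpha_2$, and minimizing the dual objective (4) then gives $\alpha_i = 1/2$).

```python
import numpy as np

# Toy linearly separable set: one positive, one negative example (assumed).
X = np.array([[1.0], [-1.0]])   # x_1, x_2
y = np.array([1.0, -1.0])       # y_1 = +1, y_2 = -1

# Dual multipliers worked out by hand for this data set (an assumption):
# constraint (6) forces alpha_1 = alpha_2, and the dual objective (4)
# is minimized at alpha_i = 1/2.
alpha = np.array([0.5, 0.5])

# Check the linear equality constraint (6): sum_i y_i alpha_i = 0.
assert abs(np.dot(y, alpha)) < 1e-12

# Normal vector and threshold from equation (7).
w = (y * alpha) @ X                # w = sum_i y_i alpha_i x_i
k = int(np.argmax(alpha > 0))      # any example k with alpha_k > 0
b = w @ X[k] - y[k]

# The linear SVM output u = w . x - b classifies both points correctly.
u = X @ w - b
print(w, b, np.sign(u))
```

For this data set the sketch yields $\vec{w} = (1)$ and $b = 0$, and $\mathrm{sign}(u)$ reproduces the labels, consistent with the claim that evaluating a linear SVM needs only the precomputed $\vec{w}$ and $b$, not the support vectors themselves.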