the entire set of non-zero Lagrange multipliers has been identified; hence the last step solves the large QP problem. Chunking seriously reduces the size of the matrix from the number of training examples squared to approximately the number of non-zero Lagrange multipliers squared. However, chunking still cannot handle large-scale training problems, since even this reduced matrix cannot fit into memory.

Figure 2. Three alternative methods for training SVMs: Chunking, Osuna's algorithm, and SMO. For each method, three steps are illustrated. The horizontal thin line at every step represents the training set, while the thick boxes represent the Lagrange multipliers being optimized at that step. For chunking, a fixed number of examples are added every step, while the zero Lagrange multipliers are discarded at every step. Thus, the number of examples trained per step tends to grow. For Osuna's algorithm, a fixed number of examples are optimized every step: the same number of examples is added to and discarded from the problem at every step. For SMO, only two examples are analytically optimized at every step, so that each step is very fast.

In 1997, Osuna, et al. [16] proved a theorem which suggests a whole new set of QP algorithms for SVMs. The theorem proves that the large QP problem can be broken down into a series of smaller QP sub-problems. As long as at least one example that violates the KKT conditions is added to the examples for the previous sub-problem, each step will reduce the overall objective function and maintain a feasible point that obeys all of the constraints. Therefore, a sequence of QP sub-problems that always add at least one violator will be guaranteed to converge. Notice that the chunking algorithm obeys the conditions of the theorem, and hence will converge.
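To make the decomposition idea concrete, the loop implied by the theorem can be sketched as follows. This is a minimal illustration only, not the algorithm of [16]; kkt_violators and solve_qp_subproblem are hypothetical placeholders standing in for a KKT-condition check and a numerical QP solver over the working set.

import numpy as np

# Minimal sketch of the decomposition loop implied by the theorem above.
# kkt_violators and solve_qp_subproblem are hypothetical placeholders for a
# KKT-condition check and a numerical QP solver; they are not from [16].
def decomposition_training(K, y, C, max_working_set=100, tol=1e-3, max_iter=1000):
    """K: (n, n) kernel matrix, y: (n,) labels in {-1, +1}, C: box constraint."""
    n = K.shape[0]
    alpha = np.zeros(n)                      # all multipliers zero: a feasible start
    for _ in range(max_iter):
        violators = kkt_violators(alpha, K, y, C, tol)   # hypothetical KKT check
        if len(violators) == 0:
            break                            # no violators left: the full QP is solved
        # Build the next sub-problem from at least one violator (required by the
        # theorem) plus the current non-zero multipliers, capped at a fixed size.
        support = np.flatnonzero(alpha > 0)
        working = list(dict.fromkeys(list(violators) + list(support)))[:max_working_set]
        # Solve the small QP over the working set; all other multipliers stay fixed.
        alpha[working] = solve_qp_subproblem(K, y, C, alpha, working)   # hypothetical
    return alpha

Because each sub-problem contains at least one KKT violator, every iteration of this loop decreases the overall objective while staying feasible, which is exactly the convergence argument above.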
Osuna, et al. suggest keeping a constant size matrix for every QP sub-problem, which implies adding and deleting the same number of examples at every step [16] (see figure 2). Using a constant-size matrix allows training on arbitrarily sized data sets. The algorithm given in Osuna's paper [16] suggests adding one example and subtracting one example every step. Clearly this would be inefficient, because it would use an entire numerical QP optimization step to cause one training example to obey the KKT conditions. In practice, researchers add and subtract multiple examples according to unpublished heuristics [17]. In any event, a numerical QP solver is required for all of these methods. Numerical QP is notoriously tricky to get right; there are many numerical precision issues that need to be addressed.
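A single step of this constant-size working-set strategy might look like the sketch below. The swap size and the selection helpers are assumptions for illustration (the actual selection heuristics are, as noted, unpublished), and solve_qp_subproblem is the same hypothetical numerical QP solver as in the previous sketch.

# Sketch of one step of a constant-size working-set update. select_to_remove and
# select_violators_to_add are hypothetical stand-ins for the unpublished heuristics.
def constant_size_step(working_set, alpha, K, y, C, swap_count):
    # Drop swap_count examples whose multipliers are unlikely to change (e.g. those
    # at zero) and add the same number of current KKT violators, so the sub-problem
    # keeps a fixed size and always contains at least one violator.
    leaving = select_to_remove(working_set, alpha, swap_count)          # hypothetical
    entering = select_violators_to_add(alpha, K, y, C, swap_count)      # hypothetical
    working_set = (working_set - leaving) | entering
    # Re-optimize only the working-set multipliers with a numerical QP solver;
    # all multipliers outside the working set stay frozen at their current values.
    idx = sorted(working_set)
    alpha = alpha.copy()
    alpha[idx] = solve_qp_subproblem(K, y, C, alpha, idx)               # hypothetical
    return working_set, alpha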