Introduction: Machine Learning for Big Data

For big data applications, first-order methods have become much more popular than higher-order methods for learning (optimization). Gradient descent methods are the most representative first-order methods.

(Deterministic) gradient descent (GD):
$$w_{t+1} \leftarrow w_t - \eta_t \left[ \frac{1}{n} \sum_{i=1}^{n} \nabla f_i(w_t) \right],$$
where $t$ is the iteration number.
- Linear convergence rate: $O(\rho^t)$
- Iteration cost is $O(n)$

Stochastic gradient descent (SGD): in the $t$-th iteration, randomly choose an example $i_t \in \{1, 2, \ldots, n\}$, then update
$$w_{t+1} \leftarrow w_t - \eta_t \nabla f_{i_t}(w_t)$$
- Iteration cost is $O(1)$
- The convergence rate is sublinear: $O(1/t)$

Wu-Jun Li (http://cs.nju.edu.cn/lwj) PDSL CS, NJU 4/36
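The GD/SGD contrast above can be sketched in code. The following is a minimal illustration (not from the slides) on a least-squares problem with $f_i(w) = \frac{1}{2}(x_i^\top w - y_i)^2$; all names, step sizes, and iteration counts are illustrative assumptions. GD averages all $n$ gradients per step ($O(n)$ cost), while SGD uses a single randomly chosen example ($O(1)$ cost).

```python
import numpy as np

# Synthetic least-squares problem: y = X @ w_true (noise-free, for illustration).
rng = np.random.default_rng(0)
n, d = 100, 5
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true

def full_gradient(w):
    # (1/n) * sum_i grad f_i(w): one full pass over the data -> O(n) per iteration
    return X.T @ (X @ w - y) / n

def gd(eta=0.1, iters=500):
    # Deterministic gradient descent with a constant step size (assumed).
    w = np.zeros(d)
    for _ in range(iters):
        w = w - eta * full_gradient(w)
    return w

def sgd(eta=0.01, iters=5000):
    # Stochastic gradient descent: one random example per step -> O(1) per iteration.
    w = np.zeros(d)
    for _ in range(iters):
        i = rng.integers(n)
        w = w - eta * X[i] * (X[i] @ w - y[i])
    return w

print("GD error: ", np.linalg.norm(gd() - w_true))
print("SGD error:", np.linalg.norm(sgd() - w_true))
```

Each SGD iteration is $n$ times cheaper than a GD iteration, which is what makes it attractive at big-data scale, at the price of the slower sublinear convergence rate noted above.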