[Figure 5 plots test RMSE versus training time for DSGD and for DS-ADMM with different values of ρ: (a) Netflix (ρ ∈ {0.01, 0.03, 0.1, 0.2, 0.3}); (b) Yahoo!Music R1 (ρ ∈ {0.01, 0.02, 0.03, 0.05, 0.2}); (c) Yahoo!Music R2 (ρ ∈ {0.03, 0.05, 0.1, 0.2, 0.5}).]

Figure 5: The effect of ρ in DS-ADMM on three data sets.

APPENDIX

A. PROOF OF LEMMA 1

Proof. We can rewrite
$$g^p(\mathbf{U}, \mathbf{V}^p, \tau_t \mid \mathbf{U}_t, \mathbf{V}^p_t) = \sum_{(i,j)\in\Omega^p} g^p_{i,j}(\mathbf{U}_{*i}, \mathbf{V}^p_{*j}, \tau_t \mid \mathbf{U}_t, \mathbf{V}^p_t),$$
where
$$\begin{aligned}
g^p_{i,j}(\mathbf{U}_{*i}, \mathbf{V}^p_{*j}, \tau_t \mid \mathbf{U}_t, \mathbf{V}^p_t)
={}& \hat{f}_{i,j}([\mathbf{U}_{*i}]_t, [\mathbf{V}^p_{*j}]_t) \\
&+ \nabla^T_{\mathbf{U}_{*i}} \hat{f}_{i,j}([\mathbf{U}_{*i}]_t, [\mathbf{V}^p_{*j}]_t)\,(\mathbf{U}_{*i} - [\mathbf{U}_{*i}]_t) \\
&+ \nabla^T_{\mathbf{V}^p_{*j}} \hat{f}_{i,j}([\mathbf{U}_{*i}]_t, [\mathbf{V}^p_{*j}]_t)\,(\mathbf{V}^p_{*j} - [\mathbf{V}^p_{*j}]_t) \\
&+ \frac{1}{2 m_i \tau_t}\,\|\mathbf{U}_{*i} - [\mathbf{U}_{*i}]_t\|_F^2 + \frac{1}{2 n_j \tau_t}\,\|\mathbf{V}^p_{*j} - [\mathbf{V}^p_{*j}]_t\|_F^2.
\end{aligned}$$
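The per-rating surrogate above is a first-order Taylor expansion of the loss at the current iterate plus two proximal terms. The following is a minimal numerical sketch of that structure; the squared-error form of f̂ᵢⱼ and the concrete values of mᵢ, nⱼ, and τₜ are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def f_hat(u, v, r):
    """Single-rating loss (assumed squared-error form for illustration)."""
    return 0.5 * (r - u @ v) ** 2

def grad_f_hat(u, v, r):
    """Gradients of f_hat with respect to u and v at the point (u, v)."""
    err = u @ v - r
    return err * v, err * u

def surrogate(u, v, u_t, v_t, r, m_i, n_j, tau_t):
    """Linearization of f_hat at (u_t, v_t) plus the two proximal terms,
    mirroring the per-rating term g^p_{i,j} in the proof of Lemma 1."""
    gu, gv = grad_f_hat(u_t, v_t, r)
    return (f_hat(u_t, v_t, r)
            + gu @ (u - u_t)
            + gv @ (v - v_t)
            + np.sum((u - u_t) ** 2) / (2 * m_i * tau_t)
            + np.sum((v - v_t) ** 2) / (2 * n_j * tau_t))

# At the expansion point the linear and proximal terms vanish,
# so the surrogate coincides with f_hat.
rng = np.random.default_rng(0)
u_t, v_t = rng.standard_normal(4), rng.standard_normal(4)
assert np.isclose(surrogate(u_t, v_t, u_t, v_t, 3.0, 5, 7, 0.1),
                  f_hat(u_t, v_t, 3.0))
```

Because the proximal coefficients 1/(2mᵢτₜ) and 1/(2nⱼτₜ) are positive, each g^p_{i,j} is a strongly convex local model of f̂ᵢⱼ, which is what makes the decomposed update in the lemma well defined.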