6.CONCLUSION Conference,Budapest,Hungary,September 2-6,1985 In this paper,we present an algorithmic study of the task pages 153-169.Springer,1986. assignment problem in the Hadoop MapReduce framework [6]T.Cormen,C.Leiserson,R.Rivest,and C.Stein. and propose a mathematical model to evaluate the cost of Introduction to algorithms,2nd ed.MIT press task assignments.Based on this model,we show that it is Cambridge,MA.2001. infeasible to find the optimal assignment unless P =AP [7]M.de Kruijf and K.Sankaralingam.MapReduce for Theorem 3.1 shows that the task assignment problem in the Cell B.E.architecture.University of Wisconsin Hadoop remains hard even if all servers have equal capacity Computer Sciences Technical Report CS-TR-2007. of 3,the cost function only has 2 values in its range,and 1625,2007 each data block has at most 2 replicas. [8]J.Dean.Experiences with MapReduce,an abstraction Second,we analyze the simple round robin algorithm for for large-scale computation.In Proceedings of the 15th the UHTA problem.Theorem 4.1 reveals that the intuition International Conference on Parallel Architectures and is wrong that increasing the number of replicas always helps Compilation Techniques.ACM New York,NY,USA. load balancing.Using round robin task assignment,adding 2006. more replicas into the system can sometimes result in worse [9]J.Dean and S.Ghemawat.MapReduce:Simplified maximum load.Theorems 4.2 and 4.3 show there could be data processing on large clusters.Proceedings of the a multiplicative gap in maximum load between the optimal 6th Symposium on Operating Systems Design and assignment and the assignment computed by Algorithm 1. Implementation,San Francisco,CA,pages 137-150, Third,we present Algorithm 2 for the general HTA prob- 2004. lem.This algorithm employs maximum flow and increasing 10 L.R.Ford and D.R.Fulkerson.Maximal flow through threshold techniques.Theorem 5.4 shows that the assign- a network.Canadian Journal of Mathematics ments computed by Algorithm 2 are optimal to within an 8(3):399-404.1956. additive constant that depends only on the number of servers [11 M.R.Garey,D.S.Johnson,et al.Computers and and the remote cost function. Intractability:A Guide to the Theory of There are many interesting directions for future work.We NP-completeness.Freeman San Francisco,1979. have sketched a proof of a matching lower bound to Theo- [12 R.L.Graham.Bounds for certain multiprocessing rem 5.4 for a class of Hadoop cost functions.We plan to anomalies.Bell System Technical Journal. present this result in followup work.Sharing a MapReduce 45(9):1563-1581,1966. cluster between multiple users is becoming popular and has led to recent development of multi-user multi-job schedulers 13 R.L.Graham.Bounds on multiprocessing timing such as fair scheduler and capacity scheduler.We plan to anomalies.SIAM Journal on Applied Mathematics, analyze the performance of such schedulers and see if the pages416-429,1969. optimization techniques from this paper can be applied to [14]B.He,W.Fang,Q.Luo,N.K.Govindaraju,and improve them. T.Wang.Mars:A MapReduce framework on graphics processors.In Proceedings of the 17th International Conference on Parallel Architectures and Compilation 7.ACKNOWLEDGMENTS Techniques,pages 260-269.ACM New York,NY, We would like to thank Avi Silberschatz.Daniel Abadi. USA.2008. Kamil Bajda-Pawlikowski,and Azza Abouzeid for their in- [15]H.W.Kuhn.The Hungarian method for the spiring discussions.We are also grateful to the anonymous assignment problem.Naval Research Logistics,52(1), referees for providing many useful suggestions that signifi- 2005.Originally appeared in Naval Research Logistics cantly improved the quality of our presentation. Quarterly,2,1955,83-97. 16 R.Lammel.Google's MapReduce programming 8.REFERENCES model-Revisited.Science of Computer Programming. 68(3):208-237,2007. [1]J.Aspnes,Y.Azar,A.Fiat,S.Plotkin,and [17]J.K.Lenstra,D.B.Shmoys,and E.Tardos. O.Waarts.On-line routing of virtual circuits with Approximation algorithms for scheduling unrelated applications to load balancing and machine parallel machines.Mathematical Programming, scheduling.Journal of the ACM.44(3):486-504.1997. 46(1):259-271,1990. [2]Y.Azar,J.S.Naor,and R.Rom.The competitiveness 18 C.Ranger,R.Raghuraman,A.Penmetsa,G.Bradski of on-line assignments.In Proceedings of the 3rd and C.Kozyrakis.Evaluating MapReduce for Annual ACM-SIAM symposium on Discrete multi-core and multiprocessor systems.In Proceedings algorithms,pages 203-210.SIAM Philadelphia,PA, of the 2007 IEEE 13th International Symposium on USA.1992. High Performance Computer Architecture,pages [3]K.Birman,G.Chockler,and R.van Renesse.Towards 13-24.IEEE Computer Society Washington,DC a cloud computing research agenda.SIGACT News, USA,2007 40(2):68-80.2009. 19 M.Zaharia,A.Konwinski,A.D.Joseph,R.Katz,and [4]E.Bortnikov.Open-source grid technologies for I.Stoica.Improving MapReduce performance in web-scale computing.SIGACT News,40(2):87-93. heterogeneous environments.In Proceedings of the 8th 2009. Symposium on Operating Systems Design and [5]R.E.Burkard.Assignment problems:Recent solution Implementation,San Diego,CA,2008. methods and applications.In System Modelling and Optimization:Proceedings of the 12th IFIP6. CONCLUSION In this paper, we present an algorithmic study of the task assignment problem in the Hadoop MapReduce framework and propose a mathematical model to evaluate the cost of task assignments. Based on this model, we show that it is infeasible to find the optimal assignment unless P = N P. Theorem 3.1 shows that the task assignment problem in Hadoop remains hard even if all servers have equal capacity of 3, the cost function only has 2 values in its range, and each data block has at most 2 replicas. Second, we analyze the simple round robin algorithm for the UHTA problem. Theorem 4.1 reveals that the intuition is wrong that increasing the number of replicas always helps load balancing. Using round robin task assignment, adding more replicas into the system can sometimes result in worse maximum load. Theorems 4.2 and 4.3 show there could be a multiplicative gap in maximum load between the optimal assignment and the assignment computed by Algorithm 1. Third, we present Algorithm 2 for the general HTA problem. This algorithm employs maximum flow and increasing threshold techniques. Theorem 5.4 shows that the assignments computed by Algorithm 2 are optimal to within an additive constant that depends only on the number of servers and the remote cost function. There are many interesting directions for future work. We have sketched a proof of a matching lower bound to Theorem 5.4 for a class of Hadoop cost functions. We plan to present this result in followup work. Sharing a MapReduce cluster between multiple users is becoming popular and has led to recent development of multi-user multi-job schedulers such as fair scheduler and capacity scheduler. We plan to analyze the performance of such schedulers and see if the optimization techniques from this paper can be applied to improve them. 7. ACKNOWLEDGMENTS We would like to thank Avi Silberschatz, Daniel Abadi, Kamil Bajda-Pawlikowski, and Azza Abouzeid for their inspiring discussions. We are also grateful to the anonymous referees for providing many useful suggestions that signifi- cantly improved the quality of our presentation. 8. REFERENCES [1] J. Aspnes, Y. Azar, A. Fiat, S. Plotkin, and O. Waarts. On-line routing of virtual circuits with applications to load balancing and machine scheduling. Journal of the ACM, 44(3):486–504, 1997. [2] Y. Azar, J. S. Naor, and R. Rom. The competitiveness of on-line assignments. In Proceedings of the 3rd Annual ACM-SIAM symposium on Discrete algorithms, pages 203–210. SIAM Philadelphia, PA, USA, 1992. [3] K. Birman, G. Chockler, and R. van Renesse. Towards a cloud computing research agenda. SIGACT News, 40(2):68–80, 2009. [4] E. Bortnikov. Open-source grid technologies for web-scale computing. SIGACT News, 40(2):87–93, 2009. [5] R. E. Burkard. Assignment problems: Recent solution methods and applications. In System Modelling and Optimization: Proceedings of the 12th IFIP Conference, Budapest, Hungary, September 2-6, 1985, pages 153–169. Springer, 1986. [6] T. Cormen, C. Leiserson, R. Rivest, and C. Stein. Introduction to algorithms, 2nd ed. MIT press Cambridge, MA, 2001. [7] M. de Kruijf and K. Sankaralingam. MapReduce for the Cell B. E. architecture. University of Wisconsin Computer Sciences Technical Report CS-TR-2007, 1625, 2007. [8] J. Dean. Experiences with MapReduce, an abstraction for large-scale computation. In Proceedings of the 15th International Conference on Parallel Architectures and Compilation Techniques. ACM New York, NY, USA, 2006. [9] J. Dean and S. Ghemawat. MapReduce: Simplified data processing on large clusters. Proceedings of the 6th Symposium on Operating Systems Design and Implementation, San Francisco, CA, pages 137–150, 2004. [10] L. R. Ford and D. R. Fulkerson. Maximal flow through a network. Canadian Journal of Mathematics, 8(3):399–404, 1956. [11] M. R. Garey, D. S. Johnson, et al. Computers and Intractability: A Guide to the Theory of NP-completeness. Freeman San Francisco, 1979. [12] R. L. Graham. Bounds for certain multiprocessing anomalies. Bell System Technical Journal, 45(9):1563–1581, 1966. [13] R. L. Graham. Bounds on multiprocessing timing anomalies. SIAM Journal on Applied Mathematics, pages 416–429, 1969. [14] B. He, W. Fang, Q. Luo, N. K. Govindaraju, and T. Wang. Mars: A MapReduce framework on graphics processors. In Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, pages 260–269. ACM New York, NY, USA, 2008. [15] H. W. Kuhn. The Hungarian method for the assignment problem. Naval Research Logistics, 52(1), 2005. Originally appeared in Naval Research Logistics Quarterly, 2, 1955, 83–97. [16] R. L¨ammel. Google’s MapReduce programming model—Revisited. Science of Computer Programming, 68(3):208–237, 2007. [17] J. K. Lenstra, D. B. Shmoys, and E. Tardos. Approximation algorithms for scheduling unrelated parallel machines. Mathematical Programming, 46(1):259–271, 1990. [18] C. Ranger, R. Raghuraman, A. Penmetsa, G. Bradski, and C. Kozyrakis. Evaluating MapReduce for multi-core and multiprocessor systems. In Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture, pages 13–24. IEEE Computer Society Washington, DC, USA, 2007. [19] M. Zaharia, A. Konwinski, A. D. Joseph, R. Katz, and I. Stoica. Improving MapReduce performance in heterogeneous environments. In Proceedings of the 8th Symposium on Operating Systems Design and Implementation, San Diego, CA, 2008