the execution of jobs across a wide-area network. However, there are two fundamental similarities. (1) Both systems use redundant execution to recover from data loss caused by failures. (2) Both use locality-aware scheduling to reduce the amount of data sent across congested network links.

TACC [7] is a system designed to simplify construction of highly-available networked services. Like MapReduce, it relies on re-execution as a mechanism for implementing fault-tolerance.

8 Conclusions

The MapReduce programming model has been successfully used at Google for many different purposes. We attribute this success to several reasons. First, the model is easy to use, even for programmers without experience with parallel and distributed systems, since it hides the details of parallelization, fault-tolerance, locality optimization, and load balancing. Second, a large variety of problems are easily expressible as MapReduce computations. For example, MapReduce is used for the generation of data for Google's production web search service, for sorting, for data mining, for machine learning, and many other systems. Third, we have developed an implementation of MapReduce that scales to large clusters comprising thousands of machines. The implementation makes efficient use of these machine resources and therefore is suitable for use on many of the large computational problems encountered at Google.

We have learned several things from this work. First, restricting the programming model makes it easy to parallelize and distribute computations and to make such computations fault-tolerant. Second, network bandwidth is a scarce resource. A number of optimizations in our system are therefore targeted at reducing the amount of data sent across the network: the locality optimization allows us to read data from local disks, and writing a single copy of the intermediate data to local disk saves network bandwidth. Third, redundant execution can be used to reduce the impact of slow machines, and to handle machine failures and data loss.
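To make the "easily expressible" claim concrete, the classic word-count computation can be phrased as a pair of user-supplied functions. The sketch below is illustrative Python, not the paper's actual C++ library interface; the names map_fn, reduce_fn, and run_mapreduce are hypothetical, and the sequential driver stands in for the distributed runtime's shuffle and scheduling.

```python
# Minimal sketch of the MapReduce programming model (word count).
# Illustration only: the real system is a C++ library, and the
# shuffle/scheduling here is replaced by a sequential in-memory loop.
from collections import defaultdict
from typing import Iterable, Iterator

def map_fn(key: str, value: str) -> Iterator[tuple[str, int]]:
    """Map: for each word in the document body, emit (word, 1)."""
    for word in value.split():
        yield (word, 1)

def reduce_fn(key: str, values: Iterable[int]) -> int:
    """Reduce: sum all counts emitted for a given word."""
    return sum(values)

def run_mapreduce(inputs: dict[str, str]) -> dict[str, int]:
    """Sequential stand-in for the runtime: apply map to every input
    record, group intermediate pairs by key, then apply reduce."""
    intermediate: dict[str, list[int]] = defaultdict(list)
    for key, value in inputs.items():
        for k, v in map_fn(key, value):
            intermediate[k].append(v)
    return {k: reduce_fn(k, vs) for k, vs in intermediate.items()}

if __name__ == "__main__":
    docs = {"doc1": "the quick brown fox", "doc2": "the lazy dog"}
    print(run_mapreduce(docs))  # {'the': 2, 'quick': 1, ...}
```

Because the user supplies only the two pure functions, the runtime is free to partition the map calls across machines, re-execute them on failure, and group keys however it likes, which is exactly the flexibility the restricted model buys.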
Acknowledgements

Josh Levenberg has been instrumental in revising and extending the user-level MapReduce API with a number of new features based on his experience with using MapReduce and other people's suggestions for enhancements. MapReduce reads its input from and writes its output to the Google File System [8]. We would like to thank Mohit Aron, Howard Gobioff, Markus Gutschke, David Kramer, Shun-Tak Leung, and Josh Redstone for their work in developing GFS. We would also like to thank Percy Liang and Olcan Sercinoglu for their work in developing the cluster management system used by MapReduce. Mike Burrows, Wilson Hsieh, Josh Levenberg, Sharon Perl, Rob Pike, and Debby Wallach provided helpful comments on earlier drafts of this paper. The anonymous OSDI reviewers, and our shepherd, Eric Brewer, provided many useful suggestions of areas where the paper could be improved. Finally, we thank all the users of MapReduce within Google's engineering organization for providing helpful feedback, suggestions, and bug reports.

References

[1] Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, David E. Culler, Joseph M. Hellerstein, and David A. Patterson. High-performance sorting on networks of workstations. In Proceedings of the 1997 ACM SIGMOD International Conference on Management of Data, Tucson, Arizona, May 1997.

[2] Remzi H. Arpaci-Dusseau, Eric Anderson, Noah Treuhaft, David E. Culler, Joseph M. Hellerstein, David Patterson, and Kathy Yelick. Cluster I/O with River: Making the fast case common. In Proceedings of the Sixth Workshop on Input/Output in Parallel and Distributed Systems (IOPADS '99), pages 10-22, Atlanta, Georgia, May 1999.

[3] Arash Baratloo, Mehmet Karaul, Zvi Kedem, and Peter Wyckoff. Charlotte: Metacomputing on the web. In Proceedings of the 9th International Conference on Parallel and Distributed Computing Systems, 1996.

[4] Luiz A. Barroso, Jeffrey Dean, and Urs Hölzle. Web search for a planet: The Google cluster architecture. IEEE Micro, 23(2):22-28, April 2003.

[5] John Bent, Douglas Thain, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, and Miron Livny. Explicit control in a batch-aware distributed file system. In Proceedings of the 1st USENIX Symposium on Networked Systems Design and Implementation (NSDI), March 2004.

[6] Guy E. Blelloch. Scans as primitive parallel operations. IEEE Transactions on Computers, C-38(11), November 1989.

[7] Armando Fox, Steven D. Gribble, Yatin Chawathe, Eric A. Brewer, and Paul Gauthier. Cluster-based scalable network services. In Proceedings of the 16th ACM Symposium on Operating System Principles, pages 78-91, Saint-Malo, France, 1997.

[8] Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. The Google file system. In 19th Symposium on Operating Systems Principles, pages 29-43, Lake George, New York, 2003.