    Counter* uppercase;
    uppercase = GetCounter("uppercase");

    map(String name, String contents):
      for each word w in contents:
        if (IsCapitalized(w)):
          uppercase->Increment();
        EmitIntermediate(w, "1");

The counter values from individual worker machines are periodically propagated to the master (piggybacked on the ping response). The master aggregates the counter values from successful map and reduce tasks and returns them to the user code when the MapReduce operation is completed. The current counter values are also displayed on the master status page so that a human can watch the progress of the live computation. When aggregating counter values, the master eliminates the effects of duplicate executions of the same map or reduce task to avoid double counting. (Duplicate executions can arise from our use of backup tasks and from re-execution of tasks due to failures.)
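The paper does not show the master's aggregation logic. The following is a minimal sketch of the duplicate-elimination idea only, assuming a hypothetical CounterAggregator class and string task IDs (both illustrative; not part of the MapReduce API described here):

    #include <cstdint>
    #include <iostream>
    #include <string>
    #include <unordered_map>
    #include <unordered_set>

    // Hypothetical master-side aggregator. Counter values from a task are
    // folded into the global totals only the first time that task completes
    // successfully; keying on the task ID discards the values produced by
    // backup tasks and by re-executions after failures.
    class CounterAggregator {
     public:
      void TaskCompleted(const std::string& task_id,
                         const std::unordered_map<std::string, int64_t>& counters) {
        // insert().second is false if this task was already counted,
        // i.e. a duplicate execution finished later; ignore it.
        if (!completed_tasks_.insert(task_id).second) return;
        for (const auto& [name, value] : counters) totals_[name] += value;
      }

      int64_t Total(const std::string& name) const {
        auto it = totals_.find(name);
        return it == totals_.end() ? 0 : it->second;
      }

     private:
      std::unordered_set<std::string> completed_tasks_;
      std::unordered_map<std::string, int64_t> totals_;
    };

    int main() {
      CounterAggregator agg;
      agg.TaskCompleted("map-0042", {{"uppercase", 1234}});
      agg.TaskCompleted("map-0042", {{"uppercase", 1234}});  // backup task: ignored
      std::cout << agg.Total("uppercase") << "\n";            // prints 1234
    }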
Some counter values are automatically maintained by the MapReduce library, such as the number of input key/value pairs processed and the number of output key/value pairs produced.

Users have found the counter facility useful for sanity checking the behavior of MapReduce operations. For example, in some MapReduce operations, the user code may want to ensure that the number of output pairs produced exactly equals the number of input pairs processed, or that the fraction of German documents processed is within some tolerable fraction of the total number of documents processed.

5 Performance

In this section we measure the performance of MapReduce on two computations running on a large cluster of machines. One computation searches through approximately one terabyte of data looking for a particular pattern. The other computation sorts approximately one terabyte of data.

These two programs are representative of a large subset of the real programs written by users of MapReduce – one class of programs shuffles data from one representation to another, and another class extracts a small amount of interesting data from a large data set.

5.1 Cluster Configuration

All of the programs were executed on a cluster that consisted of approximately 1800 machines. Each machine had two 2GHz Intel Xeon processors with Hyper-Threading enabled, 4GB of memory, two 160GB IDE disks, and a gigabit Ethernet link. The machines were arranged in a two-level tree-shaped switched network with approximately 100-200 Gbps of aggregate bandwidth available at the root. All of the machines were in the same hosting facility and therefore the round-trip time between any pair of machines was less than a millisecond.

Out of the 4GB of memory, approximately 1-1.5GB was reserved by other tasks running on the cluster. The programs were executed on a weekend afternoon, when the CPUs, disks, and network were mostly idle.

5.2 Grep

The grep program scans through 10^10 100-byte records, searching for a relatively rare three-character pattern (the pattern occurs in 92,337 records). The input is split into approximately 64MB pieces (M = 15000), and the entire output is placed in one file (R = 1).

[Figure 2: Data transfer rate over time. X-axis: seconds (0-100); Y-axis: input scan rate (MB/s, 0-30000).]

Figure 2 shows the progress of the computation over time. The Y-axis shows the rate at which the input data is scanned. The rate gradually picks up as more machines are assigned to this MapReduce computation, and peaks at over 30 GB/s when 1764 workers have been assigned. As the map tasks finish, the rate starts dropping and hits zero about 80 seconds into the computation. The entire computation takes approximately 150 seconds from start to finish. This includes about a minute of startup overhead. The overhead is due to the propagation of the program to all worker machines, and delays interacting with GFS to open the set of 1000 input files and to get the information needed for the locality optimization.
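The grep Map function itself is not reproduced in this excerpt. A minimal sketch of the idea, assuming a hypothetical Emit stub in the style of the paper's earlier pseudocode (the pattern below is illustrative, not the one actually used):

    #include <iostream>
    #include <string>

    // Stub standing in for the MapReduce library's emit call (hypothetical;
    // this excerpt does not show the real API).
    void Emit(const std::string& value) { std::cout << value << "\n"; }

    // Sketch of a grep-style Map function: emit each record that contains
    // the search pattern. With R = 1, all matching records are collected
    // into a single output file; an identity Reduce would pass them through.
    // 'name' mirrors map(String name, String contents) in the pseudocode.
    void GrepMap(const std::string& name, const std::string& record) {
      const std::string pattern = "abc";  // illustrative three-character pattern
      if (record.find(pattern) != std::string::npos) {
        Emit(record);
      }
    }

    int main() {
      GrepMap("input-000", "....abc.... rest of the 100-byte record");  // emitted
      GrepMap("input-001", "no match in this record");                  // ignored
    }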
5.3 Sort

The sort program sorts 10^10 100-byte records (approximately 1 terabyte of data). This program is modeled after the TeraSort benchmark [10].

The sorting program consists of less than 50 lines of user code. A three-line Map function extracts a 10-byte sorting key from a text line and emits the key and the original text line as the intermediate key/value pair.
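The three lines themselves are not shown in this excerpt. A minimal sketch under the same assumptions as above (hypothetical EmitIntermediate stub; not the real library API):

    #include <iostream>
    #include <string>

    // Stub standing in for the MapReduce library's emit call (hypothetical).
    void EmitIntermediate(const std::string& key, const std::string& value) {
      std::cout << key << "\t" << value << "\n";
    }

    // Sketch of the sort benchmark's Map function: each 100-byte record is a
    // text line whose first 10 bytes serve as the sorting key. Emitting
    // (key, line) lets the library's partitioning and per-partition ordering
    // guarantees produce the sorted output.
    void SortMap(const std::string& name, const std::string& line) {
      EmitIntermediate(line.substr(0, 10), line);
    }

    int main() {
      SortMap("input-000", "0123456789 remainder of the 100-byte record ...");
    }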