Furthermore, R is often constrained by users because the output of each reduce task ends up in a separate output file. In practice, we tend to choose M so that each individual task is roughly 16 MB to 64 MB of input data (so that the locality optimization described above is most effective), and we make R a small multiple of the number of worker machines we expect to use.
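As a rough illustration of this sizing rule (the input size, split size, and worker count below are hypothetical, not taken from the paper):

```python
# Sketch: choose M so each map task covers ~16-64 MB of input,
# and R as a small multiple of the expected worker count.
# All concrete numbers here are hypothetical illustrations.

def choose_m_and_r(input_bytes, num_workers, split_bytes=64 * 2**20, r_multiple=2):
    # One map task per ~split_bytes of input (ceiling division).
    m = -(-input_bytes // split_bytes)
    # A small multiple of the worker count for reduce tasks.
    r = r_multiple * num_workers
    return m, r

# E.g. a 10 TB input on 2,000 workers:
m, r = choose_m_and_r(10 * 2**40, 2000)
```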
We often perform MapReduce computations with M = 200,000 and R = 5,000, using 2,000 worker machines.

3.6 Backup Tasks

One of the common causes that lengthens the total time taken for a MapReduce operation is a "straggler": a machine that takes an unusually long time to complete one of the last few map or reduce tasks in the computation. Stragglers can arise for a whole host of reasons. For example, a machine with a bad disk may experience frequent correctable errors that slow its read performance from 30 MB/s to 1 MB/s. The cluster scheduling system may have scheduled other tasks on the machine, causing it to execute the MapReduce code more slowly due to competition for CPU, memory, local disk, or network bandwidth. A recent problem we experienced was a bug in machine initialization code that caused processor caches to be disabled: computations on affected machines slowed down by over a factor of one hundred.

We have a general mechanism to alleviate the problem of stragglers. When a MapReduce operation is close to completion, the master schedules backup executions of the remaining in-progress tasks. The task is marked as completed whenever either the primary or the backup execution completes. We have tuned this mechanism so that it typically increases the computational resources used by the operation by no more than a few percent. We have found that this significantly reduces the time to complete large MapReduce operations. As an example, the sort program described in Section 5.3 takes 44% longer to complete when the backup task mechanism is disabled.

4 Refinements

Although the basic functionality provided by simply writing Map and Reduce functions is sufficient for most needs, we have found a few extensions useful. These are described in this section.

4.1 Partitioning Function

The users of MapReduce specify the number of reduce tasks/output files that they desire (R).
Data gets partitioned across these tasks using a partitioning function on the intermediate key. A default partitioning function is provided that uses hashing (e.g. "hash(key) mod R"). This tends to result in fairly well-balanced partitions. In some cases, however, it is useful to partition data by some other function of the key. For example, sometimes the output keys are URLs, and we want all entries for a single host to end up in the same output file. To support situations like this, the user of the MapReduce library can provide a special partitioning function. For example, using "hash(Hostname(urlkey)) mod R" as the partitioning function causes all URLs from the same host to end up in the same output file.

4.2 Ordering Guarantees

We guarantee that within a given partition, the intermediate key/value pairs are processed in increasing key order. This ordering guarantee makes it easy to generate a sorted output file per partition, which is useful when the output file format needs to support efficient random access lookups by key, or users of the output find it convenient to have the data sorted.

4.3 Combiner Function

In some cases, there is significant repetition in the intermediate keys produced by each map task, and the user-specified Reduce function is commutative and associative. A good example of this is the word counting example in Section 2.1. Since word frequencies tend to follow a Zipf distribution, each map task will produce hundreds or thousands of records of the form <the, 1>. All of these counts will be sent over the network to a single reduce task and then added together by the Reduce function to produce one number. We allow the user to specify an optional Combiner function that does partial merging of this data before it is sent over the network.

The Combiner function is executed on each machine that performs a map task. Typically the same code is used to implement both the combiner and the reduce functions.
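To make the shared-code point concrete, here is a minimal in-memory sketch (not the MapReduce library's actual API; all function names are hypothetical) in which one summing function serves as both the combiner and the reduce function for word counting:

```python
from collections import defaultdict

def sum_values(key, values):
    # Usable as both combiner and reduce function:
    # addition is commutative and associative.
    return sum(values)

def map_word_count(document):
    # Emits <word, 1> pairs, e.g. many <the, 1> records.
    return [(word, 1) for word in document.split()]

def combine(pairs, combiner):
    # Partial merging on the map machine, before network transfer.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return [(key, combiner(key, values)) for key, values in grouped.items()]

pairs = map_word_count("the cat sat on the mat the end")
combined = combine(pairs, sum_values)
# <the, 1> appears three times in `pairs` but only once, as <the, 3>,
# in `combined` -- far fewer records cross the network.
```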
The only difference between a reduce function and a combiner function is how the MapReduce library handles the output of the function. The output of a reduce function is written to the final output file. The output of a combiner function is written to an intermediate file that will be sent to a reduce task.

Partial combining significantly speeds up certain classes of MapReduce operations. Appendix A contains an example that uses a combiner.

4.4 Input and Output Types

The MapReduce library provides support for reading input data in several different formats. For example, "text"

To appear in OSDI 2004
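The two partitioning functions quoted in Section 4.1 can be sketched as follows (`Hostname` here is a hypothetical helper standing in for whatever URL parsing the real code performs):

```python
from urllib.parse import urlparse

def default_partition(key, R):
    # Default partitioner: "hash(key) mod R" tends to give
    # fairly well-balanced partitions.
    return hash(key) % R

def Hostname(urlkey):
    # Hypothetical helper: extract the host part of a URL key.
    return urlparse(urlkey).netloc

def host_partition(urlkey, R):
    # "hash(Hostname(urlkey)) mod R": all URLs from the same host
    # land in the same partition, hence the same output file.
    return hash(Hostname(urlkey)) % R

a = host_partition("http://example.com/a", 5000)
b = host_partition("http://example.com/b", 5000)
# a == b: same host, so both URLs go to the same output file.
```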