正在加载图片...
3158 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS,VOL 28,NO.11,NOVEMBER 2017 expansion ratio of them.MERP senders insert some flow TABLE 1 states (e.g.,flow size,and expected sending rate)into each Motivating Example packet header,and such kind of states are updated by inter- mediate MERP switches.Each MERP switch maintains a Flow ID Arrival Time Size flow table that contains the state information (e.g.,virtual A 0 deadline)of current active flows that pass through the B 3 C 0 4 switch.Switches synchronize their decisions through the expected sending rate in each packet header.The flow with the earliest virtual deadline always monopolizes link As shown in Fig.la,a short flow (e.g.,A)will monopolize resources;the other flows periodically send detecting pack- the link resources until it is completed or preempted by ets to check whether they can start transmission,and the another shorter flow.When flow A finishes,although flow detecting rate is also effectively controlled by MERP.Exten- C arrives earlier than flow B,flow B will occupy the link sive NS2-based evaluations show that,compared with pFa- resources since it has a smaller remaining flow size than bric [21,MERP reduces the expansion ratio and tail FCT by flow C.The average FCT of SRPT is (3+3+10)/3=5 more than 15 and 12 percent,respectively,with only a negli- Although SRPT can achieve near-optimal performance in gible increase of average FCT. terms of average FCT [21,it may starve long flows.Consider The route control in MERP consists of a novel two-stage Fig.la,if there are a large number of flows with a size of 3 routing strategy,which first selects a subset of next-hop (i.e.,similar to flow A),with SRPT,flow C would be delayed ports that minimize the expansion ratio of the newly-until all these short flows are completed.Performance of coming flow and then chooses one of them to maximize the long flows may be hurt because they are scheduled later sending rate of the tail flow of a path.NS2-based evalua- when competing with short flows.Unfortunately,many tions show that,in multi-path data center topologies (e.g., online services follow the partition-aggregate pattern,in Fat-tree [111),compared with ECMP and CONGA,MERP which constructing the final response requires all intermedi- reduces the average FCT by up to 31 percent when there are ate results from smaller tasks.These intermediate results are link failures. flows with different sizes.Hence,the user-perceived latency We summarize our contributions here as follows. depends on tail FCT as well as average FCT [13].Therefore, SRPT may starve long flows and thus impair user experience. We propose the expansion ratio metric that enables PS.PS focuses on achieving fairness between competing fast flow completion and starvation freedom. flows.If we apply PS to the above example,the switch We design MERP,which controls the sending rates would divide the outgoing link bandwidth equally among of competing flows through synchronizing virtual three flows,as shown in Fig.1b. deadlines stored on intermediate switches,and PS can achieve fairness between competing flows with no routes flows in a downstream-aware manner that need to know their sizes in advance.However,the average reacts quickly to link failures. FCT of PS may be far from the minimum.For example,the We evaluate MERP using extensive NS2-based simu- average FCT of PS in Fig.1b is(7.5+9.5+7)/3=8,which lations that confirm our claims. is much larger than that of SRPT.Therefore,PS may hurt We now continue by motivating our work in Section 2. flow completion efficiency. Section 3 gives an overview of MERP.We present rate con- In summary,current flow scheduling policies can hardly trol in Section 4 and route control in Section 5.We evaluate MERP in Section 6.Before concluding the paper in Section 8, achieve flow completion efficiency (in terms of average FCT)and starvation freedom simultaneously,which are we present related work in Section 7. two primary goals in many applications such as web serv- ers [141.In this paper,we propose to schedule flows using 2 MOTIVATION the expansion ratio metric and design the MERP protocol. We start by presenting an example to demonstrate the drawbacks of SRPT and PS(Section 2.1).We then introduce 2.2 Expansion Ratio and Design Intuition the new metric and design intuitions of MERP(Section 2.2). MERP is intended to minimize average FCT without starv- ing long flows.To realize this,MERP should mimics SRPT 2.1 Drawbacks of SRPT and PS but meanwhile guaranteeing starvation freedom.Therefore, MERP should be able to: Consider the example shown in Table 1,where three flows (A,B,and C)share a common link. make short flows transmit data before long flows, SRPT.SRPT seeks to minimize average FCT [12]:the flow since SRPT prioritizes the flow with the smallest with the smallest remaining size is always transmitted first. remaining size and has near-optimal average FCT; link capacity link capacity link capacity B 4 6 R910 2 4 5.678910 (a)SRPT (b)PS (c)MERP Fig.1.Motivating example.(a)SRPT transmits the flow with the smallest remaining size first.(b)PS shares bandwidth between concurrent flows. (c)MERP schedules flows according to their dynamic expansion ratios.expansion ratio of them. MERP senders insert some flow states (e.g., flow size, and expected sending rate) into each packet header, and such kind of states are updated by inter￾mediate MERP switches. Each MERP switch maintains a flow table that contains the state information (e.g., virtual deadline) of current active flows that pass through the switch. Switches synchronize their decisions through the expected sending rate in each packet header. The flow with the earliest virtual deadline always monopolizes link resources; the other flows periodically send detecting pack￾ets to check whether they can start transmission, and the detecting rate is also effectively controlled by MERP. Exten￾sive NS2-based evaluations show that, compared with pFa￾bric [2], MERP reduces the expansion ratio and tail FCT by more than 15 and 12 percent, respectively, with only a negli￾gible increase of average FCT. The route control in MERP consists of a novel two-stage routing strategy, which first selects a subset of next-hop ports that minimize the expansion ratio of the newly￾coming flow and then chooses one of them to maximize the sending rate of the tail flow of a path. NS2-based evalua￾tions show that, in multi-path data center topologies (e.g., Fat-tree [11]), compared with ECMP and CONGA, MERP reduces the average FCT by up to 31 percent when there are link failures. We summarize our contributions here as follows.  We propose the expansion ratio metric that enables fast flow completion and starvation freedom.  We design MERP, which controls the sending rates of competing flows through synchronizing virtual deadlines stored on intermediate switches, and routes flows in a downstream-aware manner that reacts quickly to link failures.  We evaluate MERP using extensive NS2-based simu￾lations that confirm our claims. We now continue by motivating our work in Section 2. Section 3 gives an overview of MERP. We present rate con￾trol in Section 4 and route control in Section 5. We evaluate MERP in Section 6. Before concluding the paper in Section 8, we present related work in Section 7. 2 MOTIVATION We start by presenting an example to demonstrate the drawbacks of SRPT and PS (Section 2.1). We then introduce the new metric and design intuitions of MERP (Section 2.2). 2.1 Drawbacks of SRPT and PS Consider the example shown in Table 1, where three flows (A, B, and C) share a common link. SRPT. SRPT seeks to minimize average FCT [12]: the flow with the smallest remaining size is always transmitted first. As shown in Fig. 1a, a short flow (e.g., A) will monopolize the link resources until it is completed or preempted by another shorter flow. When flow A finishes, although flow C arrives earlier than flow B, flow B will occupy the link resources since it has a smaller remaining flow size than flow C. The average FCT of SRPT is ð3 þ 3 þ 10Þ=3 ¼ 5 1 3. Although SRPT can achieve near-optimal performance in terms of average FCT [2], it may starve long flows. Consider Fig. 1a, if there are a large number of flows with a size of 3 (i.e., similar to flow A), with SRPT, flow C would be delayed until all these short flows are completed. Performance of long flows may be hurt because they are scheduled later when competing with short flows. Unfortunately, many online services follow the partition-aggregate pattern, in which constructing the final response requires all intermedi￾ate results from smaller tasks. These intermediate results are flows with different sizes. Hence, the user-perceived latency depends on tail FCT as well as average FCT [13]. Therefore, SRPT may starve long flows and thus impair user experience. PS. PS focuses on achieving fairness between competing flows. If we apply PS to the above example, the switch would divide the outgoing link bandwidth equally among three flows, as shown in Fig. 1b. PS can achieve fairness between competing flows with no need to know their sizes in advance. However, the average FCT of PS may be far from the minimum. For example, the average FCT of PS in Fig. 1b is ð7:5 þ 9:5 þ 7Þ=3 ¼ 8, which is much larger than that of SRPT. Therefore, PS may hurt flow completion efficiency. In summary, current flow scheduling policies can hardly achieve flow completion efficiency (in terms of average FCT) and starvation freedom simultaneously, which are two primary goals in many applications such as web serv￾ers [14]. In this paper, we propose to schedule flows using the expansion ratio metric and design the MERP protocol. 2.2 Expansion Ratio and Design Intuition MERP is intended to minimize average FCT without starv￾ing long flows. To realize this, MERP should mimics SRPT but meanwhile guaranteeing starvation freedom. Therefore, MERP should be able to:  make short flows transmit data before long flows, since SRPT prioritizes the flow with the smallest remaining size and has near-optimal average FCT; TABLE 1 Motivating Example Flow ID Arrival Time Size A 03 B 33 C 04 Fig. 1. Motivating example. (a) SRPT transmits the flow with the smallest remaining size first. (b) PS shares bandwidth between concurrent flows. (c) MERP schedules flows according to their dynamic expansion ratios. 3158 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 28, NO. 11, NOVEMBER 2017
<<向上翻页向下翻页>>
©2008-现在 cucdc.com 高等教育资讯网 版权所有