Random early detection Gateways for Congestion Avoidance Sally floyd and van Jacobson Lawrence Berkeley Laboratory ity of Californ food@ee Ibl. gov van@ee. lbl. gov To appear in the august 1993 IEEE/ACM Transactions on Networking Abstract of a delay-bandwidth product)that were full much of the time; this would significantly increase the average delay This paper presents Random Early Detection(RED)gate- in the network. Therefore, with increasingly high-speed works. The gateway detects incipient congestion by com- that keep throughput high but average queue sizes lor in ways for congestion avoidance in packet-switched net- networks, it is increasingly important to have mechani puting the average queue size. The gateway could notif In the absence of explicit feedback from the gateway, connections of congestion either by dropping packets ar- there are a number of mechanisms that have been pro- riving at the gateway or by setting a bit in packet headers. posed for transport-layer protocols to maintain high through- When the average queue size exceeds a preset threshold, put and low delay in the network. Some of these proposed the gateway drops or marks each arriving packet with a mechanisms are designed to work with current gateways certain probability, where the exact probability is a func- [15, 23, 31, 33, 34 while other mechanisms are cou- REd gateways keep the average queue size low while pled with gateway scheduling algorithms that require per- tion of the average queue size connection state in the gateway [20, 22]. In the absence of allowing occasional bursts of packets in the queue. During explicit feedback from the gateway, transport-layer proto ongestion, the probability that the gateway notifies a par- cols could infer congestion from the estimated bottleneck icular connection to reduce its window is roughly propor- service time, from changes in throughput, from changes tional to that connections share of the bandwidth through in end-to-end delay, as well as from packet drops or other the gateway. RED gateways are designed to accompany a methods. Nevertheless, the view of an individual connec- transport-layer congestion control protocol such as TCP. tion is limited by the timescales of the connection,the The RED gateway has no bias against bursty traffic and traffic pattern of the connection, the lack of knowledge avoids the global synchronization of many connections of the number of congested gateways, the possibilities of decreasing their window at the same time. Simulations of routing changes, as well as by other difficulties in distin- a TCP/IP network are used to illustrate the performance guishing propagation delay from persistent queueing de of RED gateways The most effective detection of congestion can occur 1 Introduction in the gateway itself. The gateway can reliably distin- uish between propagation delay and persistent queueing n high-speed networks with connections with large delay delay. Only the gateway has a unified view of the queue- bandwidth products, gateways are likely to be designed ing behavior over time: the perspective of individual con- with correspondingly large maximum queues to accom- nections is limited by the packet arrival patterns for those modate transient congestion. In the current Internet, the connections. In addition, a gateway is shared by many ac- tive connections with a wide range of roundtrip times, tol- TCP transport protocol detects congestion only after a pa, erances of delay throughput requirements, etc; decisions has been dropped at the gateway. However, it would clearly about the duration and magnitude of transient congestion be undesirable to have large queues( on the order to be allowed at the gateway are best made by the gateway search, Scientific Computing Staff, of the U.S. Department or Energy self. his work was suppo orted by the Director, Office of Energy Re- The method of monitoring the average queue size at under Contract No DE.AC03-76SF00098
Random Early Detection Gateways for Congestion Avoidance Sally Floyd and Van Jacobson Lawrence Berkeley Laboratory University of California floyd@ee.lbl.gov van@ee.lbl.gov To appear in the August 1993 IEEE/ACM Transactions on Networking Abstract This paper presents Random Early Detection (RED) gateways for congestion avoidance in packet-switched networks. The gateway detects incipient congestion by computing the average queue size. The gateway could notify connections of congestion either by dropping packets arriving at the gateway or by setting a bit in packet headers. When the average queue size exceeds a preset threshold, the gateway drops or marks each arriving packet with a certain probability, where the exact probability is a function of the average queue size. RED gateways keep the average queue size low while allowing occasional bursts of packetsin the queue. During congestion, the probability that the gateway notifies a particular connection to reduce its window is roughly proportional to that connection’s share of the bandwidth through the gateway. RED gateways are designed to accompany a transport-layer congestion control protocol such as TCP. The RED gateway has no bias against bursty traffic and avoids the global synchronization of many connections decreasing their window at the same time. Simulations of a TCP/IP network are used to illustrate the performance of RED gateways. 1 Introduction In high-speed networks with connections with large delaybandwidth products, gateways are likely to be designed with correspondingly large maximum queues to accommodate transient congestion. In the current Internet, the TCPtransport protocol detects congestion only after a packet has been dropped atthe gateway. However,it would clearly be undesirable to have large queues (possibly on the order ✁ This work was supported by the Director, Office of Energy Research, Scientific Computing Staff, of the U.S. Department of Energy under Contract No. DE-AC03-76SF00098. of a delay-bandwidth product) that were full much of the time; this would significantly increase the average delay in the network. Therefore, with increasingly high-speed networks,it is increasingly importantto have mechanisms that keep throughput high but average queue sizes low. In the absence of explicit feedback from the gateway, there are a number of mechanisms that have been proposed for transport-layer protocolsto maintain high throughput and low delay in the network. Some of these proposed mechanisms are designed to work with current gateways [15, 23, 31, 33, 34], while other mechanisms are coupled with gateway scheduling algorithms that require perconnection state in the gateway [20, 22]. In the absence of explicit feedback from the gateway, transport-layer protocols could infer congestion from the estimated bottleneck service time, from changes in throughput, from changes in end-to-end delay, as well as from packet drops or other methods. Nevertheless, the view of an individual connection is limited by the timescales of the connection, the traffic pattern of the connection, the lack of knowledge of the number of congested gateways, the possibilities of routing changes, as well as by other difficulties in distinguishing propagation delay from persistent queueing delay. The most effective detection of congestion can occur in the gateway itself. The gateway can reliably distinguish between propagation delay and persistent queueing delay. Only the gateway has a unified view of the queueing behavior over time; the perspective of individual connections is limited by the packet arrival patterns for those connections. In addition, a gateway is shared by many active connections with a wide range of roundtrip times, tolerances of delay, throughput requirements, etc.; decisions about the duration and magnitude of transient congestion to be allowed at the gateway are best made by the gateway itself. The method of monitoring the average queue size at 1
the gateway, and of notifying connections of incipient con- critical to avoid this global synchronization gestion, is based of the assumption that it will continue RED gateways can be useful in gateways with a range to be useful to have queues at the gateway where traf- of packet-scheduling and packet-dropping algorithms. For fic from a number of connections is multiplexed together, example, RED congestion control mechanisms could be with FIFO scheduling. Not only is FIFO scheduling use- implemented in gateways with drop preference, where pack ful for sharing delay among connections, reducing delay ets are marked as either"essential"or"optional", and"op- for a particular connection during its periods of burstiness tional packets are dropped first when the queue exceeds a [4, but it scales well and is easy to implement efficiently. certain size. Similarly, for a gateway with separate queues In an alternate approach, some congestion control mech- for realtime and non-realtime traffic, for example, RED anisms that use variants of Fair Queueing [20] or hop-by- congestion control mechanisms could be applied to the hop flow control schemes [22] propose that the gateway queue for one of these traffic classes scheduling algorithm make use of per-connection state for The RED congestion control mechanisms monitor the every active connection. We would suggest instead that average queue size for each output queue, and, using ran- per-connection gateway mechanisms should be used only domization, choose connections to notify of that conges- in those circumstances where gateway scheduling mecha- tion. Transient congestion is accommodated by a tem- nisms without per-connection mechanisms are clearly in- porary increase in the queue. Longer-lived congestion is adequate reflected by an increase in the computed average queue The DECbit congestion avoidance scheme [181, de- size, and results in randomized feedback to some of the scribed later in this paper, is an early example of con- connections to decrease their windows. The probabili gestion detection at the gateway; DECbit gateways give that a connection is notified of congestion is proportional explicit feedback when the average queue size exceeds a to that connections share of the throughput through the certain threshold. This paper proposes a different con- gateway. gestion avoidance mechanism at the gateway, RED(Ran- Gateways that detect congestion before the queue over dom Early Detection) gateways, with somewhat different flows are not limited to packet drops as the method for methods for detecting congestion and for choosing which notifying connections of congestion. RED gateways can connections to notify of this congestion. mark a packet by dropping it at the gateway or by setting While the principles behind red gateways are fairly a bit in the packet header, depending on the transport pro general, and red gateways can be useful in controlling tocol. When the average queue size exceeds a maximum the average queue size even in a network where the trans- threshold, the red gateway marks every packet that ar- port protocol can not be trusted to be cooperative, rEd rives at the gateway. If RED gateways mark packets by gateways are intended for a network where the transport dropping them, rather than by setting a bit in the packet protocol responds to congestion indications from the net- header, when the average queue size exceeds the maxi- work. The gateway congestion control mechanism in RED mum threshold, then the red gateway controls the aver gateways simplifies the congestion control job required of age queue size even in the absence of a cooperating trans the transport protocol, and should be applicable to transport- port protocol layer congestion control mechanisms other than the cur- One advantage of a gateway congestion control mech- rent version of TCP, including protocols with rate-based anism that works with current transport protocols, and that rather than window-based flow control does not require that all gateways in the internet use the However, some aspects of REd gateways are specifi- same gateway congestion control mechanism, is that it cally targeted to TCP/IP networks. The rEd gateway is could be deployed gradually in the current Internet. RED designed for a network where a single marked or dropped gateways are a simple mechanism for congestion avoid- packet is sufficient to signal the presence of congestion ance that could be implemented gradually in current TCP/IP to the transport-layer protocol. This is different from the networks with no changes to transport protocols DECbit congestion control scheme, where the transport- Section 2 discusses previous research on Early Ran- ayer protocol computes the fraction of arriving packets dom Drop gateways and other congestion avoidance gate that have the congestion indication bit set ways. Section 3 outlines design guidelines for RED gate In addition, the emphasis on avoiding the global syn- ways. Section 4 presents the red gateway alge chronization that results from many connections reducing Section 5 describes simple simulations. Section 6 dis- heir windows at the same time is particularly relevant in a cusses in detail the parameters used in calculating the av- network with 4.3-Tahoe BSD TCP [141, where each con- erage queue size, and Section 7 discusses the algorithm nection goes through Slow-Start, reducing the window to used in calculating the packet-marking probability one, in response to a dropped packet. In the dECbit con- Section 8 examines the performance of RED gateways gestion control scheme, for example, where each connec- including the robustness of rEd gateways for a range tions response to congestion is less severe, it is also less of traffic and for a range of parameter values. Simula- 2
the gateway, and of notifying connections of incipient congestion, is based of the assumption that it will continue to be useful to have queues at the gateway where traf- fic from a number of connections is multiplexed together, with FIFO scheduling. Not only is FIFO scheduling useful for sharing delay among connections, reducing delay for a particular connection during its periods of burstiness [4], but it scales well and is easy to implement efficiently. In an alternate approach, some congestion control mechanisms that use variants of Fair Queueing [20] or hop-byhop flow control schemes [22] propose that the gateway scheduling algorithm make use of per-connection state for every active connection. We would suggest instead that per-connection gateway mechanisms should be used only in those circumstances where gateway scheduling mechanisms without per-connection mechanisms are clearly inadequate. The DECbit congestion avoidance scheme [18], described later in this paper, is an early example of congestion detection at the gateway; DECbit gateways give explicit feedback when the average queue size exceeds a certain threshold. This paper proposes a different congestion avoidance mechanism at the gateway, RED (Random Early Detection) gateways, with somewhat different methods for detecting congestion and for choosing which connections to notify of this congestion. While the principles behind RED gateways are fairly general, and RED gateways can be useful in controlling the average queue size even in a network where the transport protocol can not be trusted to be cooperative, RED gateways are intended for a network where the transport protocol responds to congestion indications from the network. The gateway congestion control mechanism in RED gateways simplifies the congestion control job required of the transport protocol, and should be applicable to transportlayer congestion control mechanisms other than the current version of TCP, including protocols with rate-based rather than window-based flow control. However, some aspects of RED gateways are specifi- cally targeted to TCP/IP networks. The RED gateway is designed for a network where a single marked or dropped packet is sufficient to signal the presence of congestion to the transport-layer protocol. This is different from the DECbit congestion control scheme, where the transportlayer protocol computes the fraction of arriving packets that have the congestion indication bit set. In addition, the emphasis on avoiding the global synchronization that results from many connections reducing their windows at the same time is particularly relevantin a network with 4.3-Tahoe BSD TCP [14], where each connection goes through Slow-Start, reducing the window to one, in response to a dropped packet. In the DECbit congestion control scheme, for example, where each connection’s response to congestion is less severe, it is also less critical to avoid this global synchronization. RED gateways can be useful in gateways with a range of packet-scheduling and packet-dropping algorithms. For example, RED congestion control mechanisms could be implemented in gateways with drop preference, where packets are marked as either “essential” or “optional”, and “optional” packets are droppedfirst when the queue exceeds a certain size. Similarly, for a gateway with separate queues for realtime and non-realtime traffic, for example, RED congestion control mechanisms could be applied to the queue for one of these traffic classes. The RED congestion control mechanisms monitor the average queue size for each output queue, and, using randomization, choose connections to notify of that congestion. Transient congestion is accommodated by a temporary increase in the queue. Longer-lived congestion is reflected by an increase in the computed average queue size, and results in randomized feedback to some of the connections to decrease their windows. The probability that a connection is notified of congestion is proportional to that connection’s share of the throughput through the gateway. Gatewaysthat detect congestion before the queue over- flows are not limited to packet drops as the method for notifying connections of congestion. RED gateways can mark a packet by dropping it at the gateway or by setting a bit in the packet header, depending on the transport protocol. When the average queue size exceeds a maximum threshold, the RED gateway marks every packet that arrives at the gateway. If RED gateways mark packets by dropping them, rather than by setting a bit in the packet header, when the average queue size exceeds the maximum threshold, then the RED gateway controls the average queue size even in the absence of a cooperating transport protocol. One advantage of a gateway congestion control mechanism that works with currenttransport protocols, and that does not require that all gateways in the internet use the same gateway congestion control mechanism, is that it could be deployed gradually in the current Internet. RED gateways are a simple mechanism for congestion avoidance that could be implemented gradually in current TCP/IP networks with no changes to transport protocols. Section 2 discusses previous research on Early Random Drop gateways and other congestion avoidance gateways. Section 3 outlines design guidelines for RED gateways. Section 4 presents the RED gateway algorithm, and Section 5 describes simple simulations. Section 6 discusses in detail the parameters used in calculating the average queue size, and Section 7 discusses the algorithm used in calculating the packet-marking probability. Section 8 examinesthe performance of RED gateways, including the robustness of RED gateways for a range of traffic and for a range of parameter values. Simula- 2
tions in Section 9 demonstrate, among other things, the full then the gateway drops each arriving packet with prob. RED gateways lack of bias against bursty traffic. Sec- ability 0.02. Zhang [36] shows that this version of Early tion 10 describes how RED gateways can be used to iden- Random Drop gateways was not successful in controlling ify those users that are using a large fraction of the band- misbehaving users. In these simulations, with both Ran- width through a congested gateway. Section 11 discusses dom Drop and Early Random Drop gateways, the mis methods for efficiently implementing RED gateways. Sec- behaving users received roughly 75% higher throughput tion 12 gives conclusions and describes areas for future than the users implementing standard 4.3 BSD TCP The Gateway Congestion Control Survey [21] consid- ers the versions of Early Random Drop described above he survey cites the results in which the Early Random 2 Previous work on congestion avoid-Drop gateway is unsuccessful in controlling misbehaving ance gateways users [36]. As mentioned in [32], Early Random Drop gateways are not expected to solve all of the problems 2.1 Early random Drop gateways of unequal throughput given connections with different roundtrip times and multiple congested gateways. In [21] Several researchers have studied Early Random Drop gate- the goals of Early Random Drop gateways for congestion ways as a method for providing congestion avoidance at avoidance are described as"uniform, dynamic treatment he gateway of users(streams/flows), of low overhead, and of good Hashem [11] discusses some of the shortcomings of scaling characteristics in large and loaded networks". It is Random Drop- and Drop Tail gateways, and briefly in- left as an open question whether or not these goals can be vestigates Early Random Drop gateways. In the imple- achieved mentation of Early Random Drop gateways in [1l], if the say drops each packet arriving at the gateway with a fixed 2.2 Other approaches to gateway mechanisms op probability. This is discussed as a rough initial im- for congestion avoidance plementation. Hashem [11] stresses that in future imple- Early descriptions of IP Source Quench messages sug mentations the drop level and the drop probability should gest that gateways could send Source Quench messages be adjusted dynamically, depending on network traffic. to source hosts before the buffer space at the gateway Hashem [ll] points out that with Drop Tail gateways reaches capacity [26], and before packets have to be dropped each congestion period introduces global synchronization at the gateway. One proposal [27] suggests that the gate in the network. When the queue overflows, packets are way send Source Quench messages when the queue size often dropped from several connections, and these con- exceeds a certain threshold, and outlines a possible method nections decrease their windows at the same time. This for flow control at the source hosts in response to these results in a loss of throughput at the gateway. The paper messages. The proposal also suggests that when the gate shows that Early Random Drop gateways have a broader way queue size approaches the maximum level the gate view of traffic distribution than do Drop Tail or Random way could discard arriving packets other than ICMP pack Drop gateways and reduce global synchronization. The ets paper suggests that because of this broader view of traf- The dECbit congestion avoidance scheme, a binary fic distribution, Early Random Drop gateways have a bet- feedback scheme for congestion avoidance is described ter chance than Drop Tail gateways of targeting aggres- in [29]. In the DECbit scheme the gateway uses a congestion ive users. The conclusions in 11] are that Early Random indication bit in packet headers to provide feedback about Drop gateways deserve further investigation congestion in the network. When a packet arrives at the For the version of Early Random Drop gateways used gateway, the gateway calculates the average queue length the simulations in [36], if the queue is more than half for the last(busy idle)period plus the current busy pe- Jacobson[14 riod. (The gateway is busy when it is transmitting packets, to detect incipient congestion, and to randomly drop packets when con- and idle otherwise. )When the average queue length ex- stion is detected. These proposed gateways are a precursor to the Early ceeds one, then the gateway sets the congestion-indication Random Drop gateways that have been studied by several authors bit in the packet header of arriving packets RED gateways. RED gateways differ from the ear The source uses window flow control, and updates its size is measured; window once every two roundtrip times. If at least half of is not limited to dropping packets, and the packet-marking the packets in the last window had the congestion indica- probability is a function of the average and the queue is full, the gateway randomly chooses a packet from the Otherwise, the window is increased lineary ponentially With Random Drop gateways, when a packet arrives at the gateway tion bit set, then the window is decreased There are several significant differences between deCbit
tions in Section 9 demonstrate, among other things, the RED gateway’s lack of bias against bursty traffic. Section 10 describes how RED gateways can be used to identify those users that are using a large fraction of the bandwidth through a congested gateway. Section 11 discusses methodsfor efficiently implementing RED gateways. Section 12 gives conclusions and describes areas for future work. 2 Previous work on congestion avoidance gateways 2.1 Early Random Drop gateways Several researchers have studied Early Random Drop gateways as a method for providing congestion avoidance at the gateway. 1 Hashem [11] discusses some of the shortcomings of Random Drop2 and Drop Tail gateways, and briefly investigates Early Random Drop gateways. In the implementation of Early Random Drop gateways in [11], if the queue length exceeds a certain drop level, then the gateway drops each packet arriving atthe gateway with a fixed drop probability. This is discussed as a rough initial implementation. Hashem [11] stresses that in future implementations the drop level and the drop probability should be adjusted dynamically, depending on network traffic. Hashem [11] points out that with Drop Tail gateways each congestion period introduces global synchronization in the network. When the queue overflows, packets are often dropped from several connections, and these connections decrease their windows at the same time. This results in a loss of throughput at the gateway. The paper shows that Early Random Drop gateways have a broader view of traffic distribution than do Drop Tail or Random Drop gateways and reduce global synchronization. The paper suggests that because of this broader view of traf- fic distribution, Early Random Drop gateways have a better chance than Drop Tail gateways of targeting aggressive users. The conclusions in [11] are that Early Random Drop gateways deserve further investigation. For the version of Early Random Drop gateways used in the simulations in [36], if the queue is more than half 1 Jacobson [14] proposed gateways to monitor the average queue size to detect incipient congestion, and to randomly drop packets when congestion is detected. These proposed gateways are a precursor to the Early Random Drop gateways that have been studied by several authors [11] [36]. We refer to the gateways in this paper as Random Early Detection or RED gateways. RED gateways differ from the earlier Early Random Drop gateways in several respects: the average queue size is measured; the gateway is not limited to dropping packets; and the packet-marking probability is a function of the average queue size. 2With Random Drop gateways, when a packet arrives at the gateway and the queue is full, the gateway randomly chooses a packet from the gateway queue to drop. fullthen the gateway drops each arriving packet with probability 0.02. Zhang [36] shows that this version of Early Random Drop gateways was not successful in controlling misbehaving users. In these simulations, with both Random Drop and Early Random Drop gateways, the misbehaving users received roughly 75% higher throughput than the users implementing standard 4.3 BSD TCP. The Gateway Congestion Control Survey [21] considers the versions of Early Random Drop described above. The survey cites the results in which the Early Random Drop gateway is unsuccessful in controlling misbehaving users [36]. As mentioned in [32], Early Random Drop gateways are not expected to solve all of the problems of unequal throughput given connections with different roundtrip times and multiple congested gateways. In [21], the goals of Early Random Drop gateways for congestion avoidance are described as “uniform, dynamic treatment of users (streams/flows), of low overhead, and of good scaling characteristics in large and loaded networks”. It is left as an open question whether or not these goals can be achieved. 2.2 Other approachesto gateway mechanisms for congestion avoidance Early descriptions of IP Source Quench messages suggest that gateways could send Source Quench messages to source hosts before the buffer space at the gateway reaches capacity [26], and before packets have to be dropped at the gateway. One proposal [27] suggests that the gateway send Source Quench messages when the queue size exceeds a certain threshold, and outlines a possible method for flow control at the source hosts in response to these messages. The proposal also suggests that when the gateway queue size approaches the maximum level the gateway could discard arriving packets other than ICMPpackets. The DECbit congestion avoidance scheme, a binary feedback scheme for congestion avoidance, is described in [29]. In the DECbitscheme the gateway uses a congestionindication bit in packet headers to provide feedback about congestion in the network. When a packet arrives at the gateway, the gateway calculates the average queue length for the last (busy + idle) period plus the current busy period. (The gateway is busy when it is transmitting packets, and idle otherwise.) When the average queue length exceeds one,then the gateway sets the congestion-indication bit in the packet header of arriving packets. The source uses window flow control, and updates its window once every two roundtrip times. If at least half of the packets in the last window had the congestion indication bit set, then the window is decreased exponentially. Otherwise, the window is increased linearly. There are severalsignificant differences between DECbit 3
ateways and the red gateways described in this paper. work responds to the instantaneous queue lengths, not to The first difference concerns the method of computing the the average queue lengths. We believe that this scheme average queue size. Because the dECbit scheme chooses would be vulnerable to traffic phase effects and to biases he last(busy idle)cycle plus the current busy period against bursty traffic, and would not accommodate tran- for averaging the queue size, the queue size can some- sient increases in the queue size times be averaged over a fairly short period of time. In would be desirable to explicitly control the time constant 3 Design guidelines for the computed average queue size; this is done in RED gateways using time-based exponential decay. In(29) the This section summarizes some of the design goals and exponential running average of the queue lengthbeshted guidelines for RED gateways. The main goal is to provide authors report that they rejected the idea of cause congestion avoidance by controlling the average queue when the time interval was far from the roundtrip time, size. Additional goals include the avoidance of global there was bias in the network. This problem of bias does synchronization and of a bias against bursty trafic and the not arise with red gateways because red gateways use ability to maintain an upper bound on the average queue a randomized algorithm for marking packets, and assume size even in the absence of cooperation from transport that the sources use a different algorithm for responding layer protocol to marked packets In a DECbit network, the source looks The first job of a congestion avoidance mechanism at at the fraction of packets that have been marked in the last the gateway is to detect incipient congestion. As defined roundtrip time. For a network with RE source should reduce its window even if there is only one work in a region of low delay and high throughput.The marked packet average queue size should be kept low, while fluctuations A second difference between DECbit gateways and in the actual queue size should be allowed to accommo- RED gateways concerns the method for choosing connec- date bursty traffic and transient congestion. Because the tions to notify of congestion. In the DECbit scheme there gateway can monitor the size of the queue over time, the is no conceptual separation between the algorithm to de- gateway is the appropriate agent to detect incipient con tect congestion and the algorithm to set the congestion in- gestion. Because the gateway has a unified view of the dication bit. When a packet arrives at the gateway and the various sources contributing to this congestion, the gate- computed average queue size is too high, the congestion way is also the appropriate agent to decide which sources indication bit is set in the header of that packet. Because to notify of this congestion of this method for marking packets, DECbit networks can. n anetwork with connections with arange of roundtrip exhibit a bias against bursty traffic ( see Section 9] this is times, throughput requirements, and delay sensitivities, avoided in reD gateways by using randomization in the the gateway is the most appropriate agent to determine the method for marking packets. For congestion avoidance sIze and duration of short-lived bursts in queue size to be gateways designed to work with TCP, an additional moti- accommodated by the gateway. The gateway can do this vation for using randomization in the method for marking by controlling the time constants used by the low-pass fil packets is to avoid the global synchronization that results ter for computing the average queue size. The goal of the from many TCP connections reducing their window at the gateway is to detect incipient congestion that has persisted same time. This is less of a concern in networks with the for a"long time"(several roundtrip times) DECbit congestion avoidance scheme, where each source The second job of a congestion avoidance gateway is decreases its window fairly moderately in response to con- to decide which connections to notify of congestion at Another proposal for adaptive window schemes where way buffer is full, it is not necessary for the gateau q gestion the gateway. If congestion is detected before the ga the source nodes increase or decrease their windows ac- drop packets to notify sources of congestion. In this pa- cording to feedback concerning the queue lengths at the per, we say that the gateway marks a packet, and notifies gateways is presented in[25]. Each gateway has an upper the source to reduce the window for that connection.This LT indicating light load conditions. Information about the setting a bit in a packet header sist of dropping a packet, threshold UT indicating congestion, and a lower threshold marking and notification can c queue sizes at the gateways is added to each packet. a derstood by the transport protocol. The current feedback ource node increases its window only if all the gateway mechanism in TCP/IP networks is for the gateway to drop queue lengths in the path are below the lower thresholds. packets, and the simulations of RED gateways in this pa- If the queue length is above the upper threshold for any per use this approach queue along the path, then the source node decreases its One goal is to avoid a bias against bursty traffic. Net window. One disadvantage of this proposal is that the net- works contain connections with a range of burstiness, and
gateways and the RED gateways described in this paper. The first difference concerns the method of computing the average queue size. Because the DECbit scheme chooses the last (busy + idle) cycle plus the current busy period for averaging the queue size, the queue size can sometimes be averaged over a fairly short period of time. In high-speed networks with large buffers at the gateway, it would be desirable to explicitly control the time constant for the computed average queue size; this is done in RED gateways using time-based exponential decay. In [29] the authors report that they rejected the idea of a weighted exponential running average of the queue length because when the time interval was far from the roundtrip time, there was bias in the network. This problem of bias does not arise with RED gateways because RED gateways use a randomized algorithm for marking packets, and assume that the sources use a different algorithm for responding to marked packets. In a DECbit network, the source looks at the fraction of packets that have been marked in the last roundtrip time. For a network with RED gateways, the source should reduce its window even if there is only one marked packet. A second difference between DECbit gateways and RED gateways concerns the method for choosing connections to notify of congestion. In the DECbit scheme there is no conceptual separation between the algorithm to detect congestion and the algorithm to set the congestion indication bit. When a packet arrives at the gateway and the computed average queue size is too high, the congestion indication bit is set in the header of that packet. Because of this method for marking packets, DECbit networks can exhibit a bias against bursty traffic [see Section 9]; this is avoided in RED gateways by using randomization in the method for marking packets. For congestion avoidance gateways designed to work with TCP, an additional motivation for using randomization in the method for marking packets is to avoid the global synchronization that results from many TCPconnections reducing their window at the same time. This is less of a concern in networks with the DECbit congestion avoidance scheme, where each source decreasesits window fairly moderately in response to congestion. Another proposal for adaptive window schemes where the source nodes increase or decrease their windows according to feedback concerning the queue lengths at the gateways is presented in [25]. Each gateway has an upper threshold UT indicating congestion, and a lower threshold LT indicating light load conditions. Information about the queue sizes at the gateways is added to each packet. A source node increases its window only if all the gateway queue lengths in the path are below the lower thresholds. If the queue length is above the upper threshold for any queue along the path, then the source node decreases its window. One disadvantage of this proposal is thatthe network responds to the instantaneous queue lengths, not to the average queue lengths. We believe that this scheme would be vulnerable to traffic phase effects and to biases against bursty traffic, and would not accommodate transient increases in the queue size. 3 Design guidelines This section summarizes some of the design goals and guidelines for RED gateways. The main goal is to provide congestion avoidance by controlling the average queue size. Additional goals include the avoidance of global synchronization and of a bias against bursty traffic and the ability to maintain an upper bound on the average queue size even in the absence of cooperation from transportlayer protocols. The first job of a congestion avoidance mechanism at the gateway is to detect incipient congestion. As defined in [18], a congestion avoidance scheme maintains the network in a region of low delay and high throughput. The average queue size should be kept low, while fluctuations in the actual queue size should be allowed to accommodate bursty traffic and transient congestion. Because the gateway can monitor the size of the queue over time, the gateway is the appropriate agent to detect incipient congestion. Because the gateway has a unified view of the various sources contributing to this congestion, the gateway is also the appropriate agent to decide which sources to notify of this congestion. In a network with connections with a range of roundtrip times, throughput requirements, and delay sensitivities, the gateway is the most appropriate agent to determine the size and duration of short-lived bursts in queue size to be accommodated by the gateway. The gateway can do this by controlling the time constants used by the low-pass filter for computing the average queue size. The goal of the gateway is to detectincipient congestion that has persisted for a “long time” (several roundtrip times). The second job of a congestion avoidance gateway is to decide which connections to notify of congestion at the gateway. If congestion is detected before the gateway buffer is full, it is not necessary for the gateway to drop packets to notify sources of congestion. In this paper, we say that the gateway marks a packet, and notifies the source to reduce the window for that connection. This marking and notification can consist of dropping a packet, setting a bit in a packet header, or some other method understood by the transport protocol. The current feedback mechanism in TCP/IP networks is for the gateway to drop packets, and the simulations of RED gateways in this paper use this approach. One goal is to avoid a bias against bursty traffic. Networks contain connections with a range of burstiness, and 4
ateways such as Drop Tail and Random Drop gateways mum and the maximum threshold, each arriving packet have a bias against bursty traffic. With Drop Tail gate- is marked with probability Pa, where Pa is a function of ways, the more bursty the traffic from a particular con- the average queue size aug. Each time that a packet is ection, the more likely it is that the gateway queue will marked, the probability that a packet is marked from a overfow when packets from that connection arrive at the particular connection is roughly proportional to that con gateway [71 nection's share of the bandwidth at the gateway. The gen- Another goal in deciding which connections to notify eral rED gateway algorithm is given in Figure 1 of congestion is to avoid the global synchronization that esults from notifying all connections to reduce their win- for each packet arrival dows at the same time. Global synchronization has been calculate the average queue size aug studied in networks with Drop Tail gateways[371, and re- if minth <avg ma. sults in loss of throughput in the network. Synchroniza alculate probability pa tion as a general network phenomena has been explored in[81 mark the arriving packet In order to avoid problems such as biases against bursty else if matth <avg traffic and global synchronization, congestion avoidance mark the arriving packet gateways can use distinct algorithms for congestion de tection and for deciding which connections to notify of this congestion. The red gateway uses randomization in Figure 1: General algorithm for RED gateways hoosing which arriving packets to mark; with this method, the probability of marking a packet from a particular con- Thus the red gateway has two separate algorithms nection is roughly proportional to that connection's share he algorithm for computing the average queue size deter- of the bandwidth through the gateway. This method can mines the degree of burstiness that will be allowed in the be efficiently implemented without maintaining per-connectigateway queue. The algorithm for calculating the packet- state at the gateway rking probability determines how frequently the gate One goal for a congestion avoidance gateway is the way marks packets, given the current level of congestion ability to control the average queue size even in the ab- The goal is for the gateway to mark packets at fairly evenly sence of cooperating sources. This can be done if the Spaced intervals, in order to avoid biases and to avoid ateway drops arriving packets when the average queue frequenty to control the average queue size sufficienty size exceeds some maximum threshold (rather than set- The detailed algorithm for the RED gateway is given ting a bit in the packet header). This method could be used to control the average queue size even if most con- in Figure 2. Section 11 discusses efficient implementa- nections last less than a roundtrip time(as could occur tions of these algorithms The gateways calculations of the average queue size with modified transport protocols in increasingly high- take into account the period when the queue is empty(the speed networks), and even if connections fail to reduce idle period) by estimating the number m of small packets their throughput in response to marked or dropped pack: that could have been transmitted by the gateway during et the idle period. After the idle period the gateway com putes the average queue size as if m packets had arrived 4 The red algorithm to an empty queue during that period As aug varies from minth to maTth, the packet-marking This section describes the algorithm for RED gateways. probability po varies linearly from o to maap The red gateway calculates the average queue size,us- ing a low-pass filter with an exponential weighted mov P+matp(aug-minth/(marth -minth ing average. The average queue size is compared to two The final packet-marking probability pa increases slowly thresholds, a minimum threshold and a maximum thresh- as the count increases since the last marked packe old. When the average queue size is less than the min- imum threshold, no packets are marked. When the av /( erage queue size is greater than the maximum threshold every arriving packet is marked If marked packets are in As discussed in Section 7, this ensures that the gateway fact dropped, or if all source nodes are cooperative, this does not wait too long before marking a packet. ensures that the average queue size does not significantly The gateway marks each packet that arrives at the gate exceed the maximum threshold way when the average queue size aug exceeds matth When the average queue size is between the mini One option for the red gateway is to measure the queue in bytes rather than in packets. With this option
gateways such as Drop Tail and Random Drop gateways have a bias against bursty traffic. With Drop Tail gateways, the more bursty the traffic from a particular connection, the more likely it is that the gateway queue will overflow when packets from that connection arrive at the gateway [7]. Another goal in deciding which connections to notify of congestion is to avoid the global synchronization that results from notifying all connections to reduce their windows at the same time. Global synchronization has been studied in networks with Drop Tail gateways [37], and results in loss of throughput in the network. Synchronization as a general network phenomena has been explored in [8]. In order to avoid problemssuch as biases against bursty traffic and global synchronization, congestion avoidance gateways can use distinct algorithms for congestion detection and for deciding which connections to notify of this congestion. The RED gateway uses randomization in choosing which arriving packetsto mark; with this method, the probability of marking a packet from a particular connection is roughly proportional to that connection’s share of the bandwidth through the gateway. This method can be efficiently implemented without maintaining per-connection state at the gateway. One goal for a congestion avoidance gateway is the ability to control the average queue size even in the absence of cooperating sources. This can be done if the gateway drops arriving packets when the average queue size exceeds some maximum threshold (rather than setting a bit in the packet header). This method could be used to control the average queue size even if most connections last less than a roundtrip time (as could occur with modified transport protocols in increasingly highspeed networks), and even if connections fail to reduce their throughput in response to marked or dropped packets. 4 The RED algorithm This section describes the algorithm for RED gateways. The RED gateway calculates the average queue size, using a low-pass filter with an exponential weighted moving average. The average queue size is compared to two thresholds, a minimum threshold and a maximum threshold. When the average queue size is less than the minimum threshold, no packets are marked. When the average queue size is greater than the maximum threshold, every arriving packet is marked. If marked packets are in fact dropped, or if all source nodes are cooperative, this ensures that the average queue size does not significantly exceed the maximum threshold. When the average queue size is between the minimum and the maximum threshold, each arriving packet is marked with probability ✂☎✄ , where ✂☎✄ is a function of the average queue size ✆✞✝✠✟. Each time that a packet is marked, the probability that a packet is marked from a particular connection is roughly proportional to that connection’s share of the bandwidth at the gateway. The general RED gateway algorithm is given in Figure 1. for each packet arrival calculate the average queue size ✆✞✝✡✟ if ☛✌☞✎✍✑✏✓✒✕✔✖✆✞✝✠✟✘✗✙☛✚✆✞✛✜✏✓✒ calculate probability ✂✢✄ with probability ✂✢✄ : mark the arriving packet else if ☛✌✆✞✛✣✏✓✒✕✔✙✆✞✝✠✟ mark the arriving packet Figure 1: General algorithm for RED gateways. Thus the RED gateway has two separate algorithms. The algorithm for computing the average queue size determines the degree of burstiness that will be allowed in the gateway queue. The algorithm for calculating the packetmarking probability determines how frequently the gateway marks packets, given the current level of congestion. The goalisfor the gateway to mark packets at fairly evenlyspaced intervals, in order to avoid biases and to avoid global synchronization, and to mark packets sufficiently frequently to control the average queue size. The detailed algorithm for the RED gateway is given in Figure 2. Section 11 discusses efficient implementations of these algorithms. The gateway’s calculations of the average queue size take into account the period when the queue is empty (the idle period) by estimating the number ☛ of small packets that could have been transmitted by the gateway during the idle period. After the idle period the gateway computes the average queue size as if ☛ packets had arrived to an empty queue during that period. As ✆✞✝✡✟ variesfrom ☛✤☞✥✍✏✓✒ to ☛✌✆✞✛✏✓✒ ,the packet-marking probability ✂✢✦ varies linearly from 0 to ☛✌✆✞✛★✧: ✂✦✪✩ ☛✌✆✞✛✧★✫ ✆✞✝✠✟✭✬✮☛✤☞✥✍✑✏✓✒✠✯✱✰ ✫☛✌✆✞✛✣✏✓✒✲✬✮☛✤☞✥✍✑✏✓✒✞✯✴✳ The final packet-marking probability ✂✢✄ increases slowly as the count increases since the last marked packet: ✂✄ ✩ ✂☎✦ ✰ ✫✶✵ ✬✮✷✹✸✻✺✣✍✑✼✾✽ ✂☎✦ ✯ As discussed in Section 7, this ensures that the gateway does not wait too long before marking a packet. The gateway marks each packetthat arrives atthe gateway when the average queue size ✆✞✝✠✟ exceeds ☛✚✆✠✛✣✏✓✒ . One option for the RED gateway is to measure the queue in bytes rather than in packets. With this option, 5
Initialization pa←pb/(1- count·p) count←-1 In this case a large FTP packet is more likely to be marked for each packet arrival than is a small TELNET pac calculat avg. queue size a2g Sections 6 and 7 discuss in detail the setting of the var- if the queue is nonempty ious parameters for RED gateways. Section 6 discusses aug f(1-wg)aug+w the calculation of the average queue size. The queue weight wg is determined by the size and duration of bursts in m←- f(time- g-time) queue size that are allowed at the gateway. The mini- mum and maximum thresholds minth and matth are de- if minti≤aug<mart termined by the desired average queue size. The average rement count queue size which makes the desired tradeoffs( such as the calculate probability pa tradeoff between maximizing throughput and minimizing P delay depends on network characteristics, and is left as map(aug-minth/(matth -minth)a question for further research. Section 7 discusses the pb with probability pa: In this paper our primary interest is in th mark the arriving packet operation of the RED gateways. Specific questions about count←0 the most efficient implementation of the RED algorithm Ise if ma th < aug ng packet else count t-1 5 A simple simulation when queue becomes empty a-time←time This section describes our simulator and presents a simple simulation with RED gateways Our simulator is a version of the REAL simulator [ 19]built on Columbia's ulation package [1 with extensive modifications and bug g-time: start of the queue idle time fixes made by Steven McCanne at LBL. In the simula ackets since last marked pkt. tor, FTP sources always have a packet to send and always send a maximal-sized(1000-byte) packet as soon as the Fixed parameters: congestion control window allows them to do so. A sink wg: queue weight immediately sends an aCk packet when it receives a data minth: minimum threshold for queue packet. The gateways use FIFO queueing nath: maximum threshold for queue Source and sink nodes implement a congestion con maximum value for pb trol algorithm equivalent to that in 4.3-Tahoe BSD TCP Briefly, there are two phases to the window-adjustment al Other: gorithm. A threshold is set initially to half the receiver's Pa: current pkt-marking probability advertised window. In the slow-start phase, the current g: current queue size window is doubled each roundtrip time until the window current time reaches the threshold. Then the congestion-avoidance phase a linear function of the time t is entered, and the current window is increased by roughly one packet each roundtrip time. The window is never al- Figure 2: Detailed algorithm for RED gateways lowed to increase to more than the receiver 's advertised window, which this paper refers to as the maximum win dow size". In 4.3-Tahoe BSD TCP, packet loss(a dropped the average queue size accurately reflects the average de- packet)is treated as a "congestion experienced signal lay at the gateway. When this option is used, the algo- The source reacts to a packet loss by setting the threshold rithm would be modified to ensure that the probability that to half the current window, decreasing the current window packet is marked is proportional to the packet size in to one packet, and entering the slow-start phase inth/( p, Packet Size/MaximumPacket Size or does not use the 4.3-Tahoe TCP code directly but we 6
Initialization: ✆✞✝✡✟ ✩❀✿ ✷❁✸✻✺☎✍✑✼ ✩ ✬ ✵ for each packet arrival calculate new avg. queue size ✆✞✝✡✟: if the queue is nonempty ✆✠✝✠✟ ✩ ✫✶✵ ✬❃❂❅❄❆✯❇✆✞✝✠✟❉❈❊❂❅❄●❋ else ☛ ✩❀❍ ✫ ✼■☞✥☛✚❏❅✬❑❋ ✼■☞✥☛✌❏▲✯ ✆✠✝✠✟ ✩ ✫✶✵ ✬❃❂❅❄❆✯❇▼◆✆✞✝✡✟ if ☛✤☞✥✍✑✏✓✒❖✔P✆✞✝✡✟◗✗✙☛✌✆✞✛✣✏✓✒ increment ✷✹✸✻✺✣✍✑✼ calculate probability ✂✄ : ✂☎✦ ✩ ☛✌✆✞✛❘✧ ✫ ✆✞✝✠✟✭✬✮☛✤☞✥✍✏✓✒ ✯✱✰ ✫☛✌✆✞✛✏✓✒ ✬✮☛✤☞✥✍✏✓✒ ✯ ✂✄ ✩ ✂☎✦ ✰ ✫✶✵ ✬✮✷❁✸✻✺☎✍✑✼✾✽ ✂☎✦ ✯ with probability ✂✄ : mark the arriving packet ✷❁✸✻✺✣✍✑✼ ✩❙✿ else if ☛✌✆✞✛✜✏✓✒✕✔✖✆✞✝✠✟ mark the arriving packet ✷❁✸✻✺☎✍✑✼ ✩❀✿ else ✷❁✸✻✺☎✍✑✼ ✩ ✬ ✵ when queue becomes empty ❋ ✼■☞✥☛✌❏ ✩ ✼■☞✥☛✌❏ Saved Variables: ✆✞✝✡✟: average queue size ❋ ✼■☞✥☛✌❏: start of the queue idle time ✷❁✸✻✺☎✍✑✼: packets since last marked pkt. Fixed parameters: ❂❄ : queue weight ☛✤☞✥✍✏✓✒ : minimum threshold for queue ☛✌✆✞✛✏✓✒ : maximum threshold for queue ☛✌✆✞✛★✧ : maximum value for ✂☎✦ Other: ✂☎✄ : current pkt-marking probability ❋: current queue size ✼■☞✥☛✌❏: current time ❍ ✫ ✼✶✯ : a linear function of the time ✼ Figure 2: Detailed algorithm for RED gateways. the average queue size accurately reflects the average delay at the gateway. When this option is used, the algorithm would be modified to ensure thatthe probability that a packet is marked is proportional to the packet size in bytes: ✂✢✦ ✩ ☛✚✆✞✛❘✧ ✫ ✆✞✝✠✟❚✬❃☛✌☞✥✍✏✓✒ ✯❯✰ ✫☛✌✆✞✛✏✓✒ ✬❃☛✌☞✥✍✏✓✒ ✯ ✂✦❱✩ ✂✦❳❲●❨✡❩❭❬✠❪❁❫✴❴❛❵❝❜❞❪✰▲❡❨❣❢❘❵❝❤✭✐★❤✕❲●❨✡❩❭❬✡❪✹❫❭❴❛❵❥❜✹❪ ✂✢✄ ✩ ✂✦ ✰ ✫❇✵ ✬❑✷❁✸✻✺☎✍✑✼❦✽ ✂✦ ✯ In this case a largeFTPpacketis more likely to be marked than is a small TELNET packet. Sections 6 and 7 discuss in detailthe setting of the various parameters for RED gateways. Section 6 discusses the calculation of the average queue size. The queue weight ❂❄ is determined by the size and duration of bursts in queue size that are allowed at the gateway. The minimum and maximum thresholds ☛✌☞✥✍✑✏✓✒ and ☛✚✆✠✛✣✏✓✒ are determined by the desired average queue size. The average queue size which makes the desired tradeoffs (such as the tradeoff between maximizing throughput and minimizing delay) depends on network characteristics, and is left as a question for further research. Section 7 discusses the calculation of the packet-marking probability. In this paper our primary interest is in the functional operation of the RED gateways. Specific questions about the most efficient implementation of the RED algorithm are discussed in Section 11. 5 A simple simulation This section describes our simulator and presents a simple simulation with RED gateways. Oursimulator is a version of the REAL simulator [19] built on Columbia’s Nest simulation package [1], with extensive modifications and bug fixes made by Steven McCanne at LBL. In the simulator, FTP sources always have a packet to send and always send a maximal-sized (1000-byte) packet as soon as the congestion control window allows them to do so. A sink immediately sends an ACK packet when it receives a data packet. The gateways use FIFO queueing. Source and sink nodes implement a congestion control algorithm equivalent to that in 4.3-Tahoe BSD TCP. 3 Briefly,there are two phases to the window-adjustment algorithm. A threshold is set initially to half the receiver’s advertised window. In the slow-start phase, the current window is doubled each roundtrip time until the window reachesthe threshold. Then the congestion-avoidance phase is entered, and the current window is increased by roughly one packet each roundtrip time. The window is never allowed to increase to more than the receiver’s advertised window, which this paper refers to as the “maximum window size”. In 4.3-Tahoe BSD TCP, packetloss (a dropped packet) is treated as a “congestion experienced” signal. The source reacts to a packet loss by setting the threshold to half the current window, decreasing the current window to one packet, and entering the slow-start phase. 3Our simulator does not use the 4.3-Tahoe TCP code directly but we believe it is functionally identical. 6
:业1二M 0.0 0.2 04 0.6 1.0 Queue size(solid line) and average queue size( dashed line) // 小wM加MN ※ X 0.0 0.2 04 0.6 0.8 Time Figure 3: A simulation with four FtP connections with staggered start times
Queue size (solid line) and average queue size (dashed line). ❧ Time ♠ Queue♥ 0.0 0.2 0.4 0.6 0.8 1.0 0♦10 30 max-th ♣ min-th q ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. .. .. . . . . . . . . . . . . . . . . .. . . . . . . . . .. . . . . . . . . .. . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . .. . . . ... . .. . . . . . . . . . . . . . . .. . . ..... . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . .. . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . .. . . . . . .. . . . . . . . . . . . . . . . . . . . .. .. . . . . . . .. . . . . . . . . . . . . . . . . . .. . . . . . . .. . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . .. . . . . . . .. . .. . . ... . .. . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . .. . .. . . . .. . . . . . . . . . . . . . . . . . . . . .. . . . .. . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . ... . . . . . . . . . . . .. . . . . .. . . .. ... . .. . . . . . . . . . . . . . . .. . . . . . . . . .. . . . . . . . . . . . . . . . . .. .. . . . . . . . . .. .. . . . .. . . . . . . . . . . . . . . . . . . . .. . . .. . . . .... . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. .. . . . . . . .. . . . . . . .. . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . .. . . . . . . . . .. . . . . . . . . . . . . .. . . . . . . . . . . .. .. . . . . . . . . . . . . . . . . . . .. . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . .. . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . .. . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . .. . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . .. . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . .. . .. . . . . . . . . . . . . . .. . . . .. . . . . . . . . . . . . . . . . . . . . . .. .. . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . .. . . . . . . . . . . . . . . . . . . . .. . . . . . . . . .. . . . . .. . . . . . . . ... . .. . . . . . . . . . .. . . . .. . .. . .. . . .. . . . . . . .. . . . . . .. . . . . ... . . . . . . . . . . . . . . . . . .. . .. . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . .. . . .. . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . .. . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . .. . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . .. . . . .. . . . . . . . .. . . . . . . . . . . . . .. . . . . .. . .. . . . . .. .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . .. .. . . . . . . . . . . . . . .. . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . .. . . . . . .. . . . . . . . . . . .. . . . . . . . . . . . . . . . . .. . . . . . . . .. . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . .. . . . . . .. . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . .. .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . ... . .. . . . . . . . . . . . . . . . . . . . . . .. . .. . Time r Packet Number (Mod 90) for Four Connections s 0.0 0.2 0.4 0.6 0.8 1.0 0s100s200s300s400s Figure 3: A simulation with four FTP connections with staggered start times. 7
Figure 3 shows a simple simulation with RED gate- in response to a dynamically changing load. As the num- ways. The network is shown in Figure 4. The simula- ber of connections increases, the frequency with which the tion contains four FTPconnections, each with a maximum gateway drops packets also increases. There is no global window roughly equal to the delay-bandwidth product, synchronization. The higher throughput for the connec which ranges from 33 to 112 packets. The RED gateway tions with shorter roundtrip times is due to the bias of parameters are set as follows: wg =0.002, minth= TCP's window increase algorithm in favor of connections 5 packets, matth =15 packets, and maTp 1/50. with shorter roundtrip times(as discussed in[6, 71). For The buffer size is sufficiently large that packets are never the simulation in Figure 3 the average link utilization is dropped at the gateway due to buffer overflow, in this sim- 76%. For the following second of the simulation, when ulation the red gateway controls the average queue size, all four sources are active, the average link utilization is and the actual queue size never exceeds forty packets. 82%.(This is not shown in Figure 3.) FTP SOURCES GATEWAY (tnangle' for RED, square for Drop Tan Figure 5: Comparing Drop Tail and RED gateways SINK FTP SOURCES Figure 4: Simulation network For the charts in Figure 3 the x-axis shows the time in seconds. The bottom chart shows the packets from nodes GATEWAY -4. Each of the four main rows shows the packets from one of the four connections the bottom row shows node 1 20ms packets, and the top row shows node 4 packets. There is a mark for each data packet as it arrives at the gateway and as it departs from the gateway; at this time scale, the two marks are often indistinguishable. The y-axis is a function of the packet sequence number; for packet number n from Figure 6: Simulation network node i, the y-axis shows n mod 90+(i-1)100. Thus, each vertical" line represents 90 consecutively-numbered Because RED gateways can control the average queue packets from one connection arriving at the gateway. Each size while accommodating transient congestion, RED gate- X shows a packet dropped by the gateway, and each'X ways are well-suited to provide high throughput and low is followed by a mark showing the retransmitted packet. average delay in high-speed networks with TCP connec- Node I starts sending packets at time 0, node 2 starts af- tions that have large windows. The RED gateway can ac- ter 0.2 seconds, node 3 starts after 0. 4 seconds, and node 4 commodate the short burst in the queue required by TCP's slow-start phase, thus RED gateways control the aver- size q and the calculated average queue size aug. The dot- smoothly open their windows. Figure 5 shows the results ted lines show minth and mat th, the minimum and max- of simulations of the network in Figure 6 with two TCP imum thresholds for the average queue size. Note that the connections, each with a maximum window of 240 pack- calculated average queue size aug changes fairly slowly ets, roughly equal to the delay-bandwidth product. The compared to g. The bottom row of X's on the bottom two connections are started at slightly different times. The chart shows again the time of each dropped packet. simulations compare the performance of Drop Tail and of This simulation shows the success of the REd gate- RED gateways way in controlling the average queue size at the gateway In Figure 5 the x-axis shows the total throughput as a
Figure 3 shows a simple simulation with RED gateways. The network is shown in Figure 4. The simulation containsfourFTPconnections, each with a maximum window roughly equal to the delay-bandwidth product, which ranges from 33 to 112 packets. The RED gateway parameters are set as follows: ❂❅❄✖t ✿ ✳ ✿✠✿✠✉, ☛✤☞✥✍✑✏✓✒✈t ✇ packets, ☛✌✆✞✛✜✏✓✒①t ✵ ✇ packets, and ☛✚✆✞✛✧ t ✵ ✰ ✇ ✿ . The buffer size is sufficiently large that packets are never dropped atthe gateway due to buffer overflow;in this simulation the RED gateway controls the average queue size, and the actual queue size never exceeds forty packets. 3② 1 2③ 4④ SINK ⑤ GATEWAY ⑥ 1 5⑦ 6⑧ 4④ 45Mbps ⑨ 100Mbps ⑩ 2ms ❶ FTP SOURCES 1ms ⑩ 4ms ⑨ 8ms ❷ 5ms ❸ Figure 4: Simulation network. For the charts in Figure 3, the x-axis shows the time in seconds. The bottom chart shows the packets from nodes 1-4. Each of the four main rows shows the packets from one of the four connections; the bottom row shows node 1 packets, and the top row shows node 4 packets. There is a mark for each data packet as it arrives at the gateway and as it departs from the gateway; at this time scale, the two marks are often indistinguishable. The y-axis is a function of the packet sequence number; for packet number ✍ from node ☞ , the y-axis shows ✍ ❤❺❹❘❻✭❼✠✿ ❈ ✫ ☞❽✬ ✵ ✯ ✵ ✿✡✿. Thus, each vertical ‘line’ represents 90 consecutively-numbered packetsfrom one connection arriving atthe gateway. Each ‘X’ shows a packet dropped by the gateway, and each ‘X’ is followed by a mark showing the retransmitted packet. Node 1 starts sending packets at time 0, node 2 starts after 0.2 seconds, node 3 starts after 0.4 seconds, and node 4 starts after 0.6 seconds. The top chart ofFigure 3 showsthe instantaneous queue size ❋ and the calculated average queue size ✆✠✝✠✟. The dotted lines show ☛✤☞✥✍✑✏✓✒ and ☛✚✆✠✛✣✏✓✒ , the minimum and maximum thresholds for the average queue size. Note that the calculated average queue size ✆✞✝✡✟ changes fairly slowly compared to ❋. The bottom row of X’s on the bottom chart shows again the time of each dropped packet. This simulation shows the success of the RED gateway in controlling the average queue size at the gateway in response to a dynamically changing load. As the number of connectionsincreases,the frequency with which the gateway drops packets also increases. There is no global synchronization. The higher throughput for the connections with shorter roundtrip times is due to the bias of TCP’s window increase algorithm in favor of connections with shorter roundtrip times (as discussed in [6, 7]). For the simulation in Figure 3 the average link utilization is 76%. For the following second of the simulation, when all four sources are active, the average link utilization is 82%. (This is not shown in Figure 3.) (‘triangle’ for RED, ‘square’ for Drop Tail) ❾ Throughput (%) ❿ Average Queue ➀ 0.4 0.6 0.8 1.0 0➁20 40 60 80 100➁ Figure 5: Comparing Drop Tail and RED gateways. 3➂ 4➃ SINK ➄ GATEWAY ➅ 5➆ 6➇ 4➃ 45Mbps ➈ 100Mbps FTP SOURCES 20ms ➉ 1ms Figure 6: Simulation network. Because RED gateways can control the average queue size while accommodating transient congestion, RED gateways are well-suited to provide high throughput and low average delay in high-speed networks with TCP connections that have large windows. The RED gateway can accommodate the short burstin the queue required by TCP’s slow-start phase; thus RED gateways control the average queue size while still allowing TCP connections to smoothly open their windows. Figure 5 shows the results of simulations of the network in Figure 6 with two TCP connections, each with a maximum window of 240 packets, roughly equal to the delay-bandwidth product. The two connections are started at slightly differenttimes. The simulations compare the performance of Drop Tail and of RED gateways. In Figure 5 the x-axis shows the total throughput as a 8
fraction of the maximum possible throughput on the con- The weight wg determines the time constant of the gested link. The y-axis shows the average queue size in low-pass filter. The following sections discuss upper and packets(as seen by arriving packets). Five 5-second sim- lower bounds for setting wg. The calculation of the aver ulations were run for each of ll sets of parameters for age queue size can be implemented particularly efficiently Drop Tail gateways, and for 1 I sets of parameters for red when wg is a(negative)power of two, as shown in Section of these five-second simulations. The simulations with Drop Tail gateways were run with the buffer size ranging 6.1 An upper bound for wq from 15 to 140 packets; as the buffer size is increased, the throughput and the average queue size increase corre If wg is too large, then the averaging procedure will not spondingly. In order to avoid phase effects in the simu- filter out transient congestion at the gateway lations with Drop Tail gateways, the source node takes a Assume that the queue is initially empty, with an aver- random time drawn from the uniform distribution on[0, t] age queue size of zero, and then the queue increases from seconds to prepare an FTP packet for transmission, where 0 to L packets over L packet arrivals. After the Lth packet t is the bottleneck service time of 0. 17 ms [71 rives at the gateway, the average queue size aug, is The simulations with RED gateways were all run with a buffer size of 100 packets, with minth ranging from 3 to 50 packets. For the REd gateways, matth is set agl ∑i(1-mn)- to 3 minth, with wg =0.002 and maTp=1/ 50. The dashed lines show the average delay(as a function of through- out)approximated by 1.73/(1-a) for the simulations n1-2)2(-m with RED gateways, and approximated by 0. 1/(1-a) for the simulations with Drop Tail gateways. For this L+1+ simple network with TCP connections with large win- dows, the network power( the ratio of throughput to de lay) is higher with red gateways than with Drop Tail This derivation uses the following identity [9, p. 65 gateways. There are several reasons for this difference +(Lx-L-1)al+ with Drop Tail gateways with a small maximum queue, he queue drops packets while the TCP connection is in he slow-start phase of rapidly increasing its window, re- ducing throughput. On the other hand, with Drop Tail gateways with a large maximum queue the average delay is unacceptably large. In addition, Drop Tail gateways are more likely to drop packets from both connections at the ne, resulting in global synchronization and a fu her loss of throughput Later in the paper, we discuss simulation results from networks with a more diverse range of connections. The RED gateway is not specifically designed for a network dominated by bulk data transfer; this is simply an easy way to simulate increasingly-heavy congestion at a gate 6 Calculating the average queue length Figure 7: aug as a function of wg and The RED gateway uses a low-pass filter to calculate the Figure 7 shows the average queue size aug for a average queue size. Thus, the short-term increases in the range of values for wa and L. the z-axis shows wo from queue size that result from bursty traffic or from transient 0.001 to 0.005, and the y-axis shows L from 10 to 100 congestion do not result in a significant increase in the For example, for wg =0.001, after a queue increase from average queue sIze 0 to 100 packets, the average queue size aug100 iS 4.88 The low-pass filter is an exponential weighted moving packets average(EWMA) Given a minimum threshold minth, and given that at the (1-g) ish to allow bursts of L packets
fraction of the maximum possible throughput on the congested link. The y-axis shows the average queue size in packets (as seen by arriving packets). Five 5-second simulations were run for each of 11 sets of parameters for Drop Tail gateways, and for 11 sets of parametersfor RED gateways; each mark in Figure 5 shows the results of one of these five-second simulations. The simulations with Drop Tail gateways were run with the buffer size ranging from 15 to 140 packets; as the buffer size is increased, the throughput and the average queue size increase correspondingly. In order to avoid phase effects in the simulations with Drop Tail gateways, the source node takes a random time drawn from the uniform distribution on [0, t] seconds to prepare an FTP packet for transmission, where ✼ is the bottleneck service time of 0.17 ms. [7]. The simulations with RED gateways were all run with a buffer size of 100 packets, with ☛✤☞✥✍✑✏✓✒ ranging from 3 to 50 packets. For the RED gateways, ☛✌✆✞✛✜✏✓✒ is set to ➊●☛✤☞✥✍✑✏✓✒ , with ❂❅❄➋t ✿ ✳ ✿✡✿✠✉ and ☛✚✆✠✛✧ t ✵ ✰ ✇ ✿ . The dashed linesshow the average delay (as a function of throughput) approximated by ✵ ✳➍➌▲➊❛✰ ✫✶✵ ✬➎✛✢✯ for the simulations with RED gateways, and approximated by ✿ ✳ ✵ ✰ ✫❇✵ ✬✙✛✢✯❇➏ for the simulations with Drop Tail gateways. For this simple network with TCP connections with large windows, the network power (the ratio of throughput to delay) is higher with RED gateways than with Drop Tail gateways. There are several reasons for this difference. With Drop Tail gateways with a small maximum queue, the queue drops packets while the TCP connection is in the slow-start phase of rapidly increasing its window, reducing throughput. On the other hand, with Drop Tail gateways with a large maximum queue the average delay is unacceptably large. In addition, Drop Tail gateways are more likely to drop packets from both connections at the same time, resulting in global synchronization and a further loss of throughput. Later in the paper, we discuss simulation results from networks with a more diverse range of connections. The RED gateway is not specifically designed for a network dominated by bulk data transfer; this is simply an easy way to simulate increasingly-heavy congestion at a gateway. 6 Calculating the average queue length The RED gateway uses a low-pass filter to calculate the average queue size. Thus, the short-term increases in the queue size that result from bursty traffic or from transient congestion do not result in a significant increase in the average queue size. The low-passfilter is an exponential weighted moving average (EWMA): ✆✞✝✡✟ ✩ ✫✶✵ ✬✮❂❄ ✯❇✆✞✝✠✟❉❈❊❂❄ ❋❛✳ (1) The weight ❂❅❄ determines the time constant of the low-pass filter. The following sections discuss upper and lower bounds for setting ❂❅❄ . The calculation of the average queue size can be implemented particularly efficiently when ❂❅❄ is a (negative) power of two, as shown inSection 11. 6.1 An upper bound for ➐✕➑ If ❂❄ is too large, then the averaging procedure will not filter out transient congestion at the gateway. Assume thatthe queue is initially empty, with an average queue size of zero, and then the queue increases from 0 to ➒ packets over ➒ packet arrivals. After the ➒th packet arrives at the gateway, the average queue size ✆✞✝✡✟✡➓ is ✆✞✝✡✟➓ t ➔ ➓ →➣❦↔ ☞✞❂❄ ✫❇✵ ✬✮❂❄ ✯ ➓✑↕ → t ❂❄ ✫✶✵ ✬❃❂❄ ✯ ➓ ➔ ➓ →➣❦↔ ☞ ✫ ✵ ✵ ✬❃❂❅❄ ✯ → t ➒➙❈ ✵ ❈ ✫✶✵ ✬❃❂❄ ✯ ➓★➛ ↔ ✬ ✵ ❂❅❄ ✳ (2) This derivation uses the following identity [9, p. 65]: ➔ ➓ →➣❦↔ ☞✥✛ → t ✛❖❈ ✫✥➜❳➝ ✬✮➒✮✬ ✵ ✯❛✛➓★➛ ↔ ✫✶✵ ✬❃✛✢✯✶➞ ✳ 0.001 0.002 0.003 0.004 0.005 w_q 0 25 50 75 100 L 0 5 10 15 20 ave_L 0.001 0.002 0.003 0.004 0.005 w_q 0 25 50 75 100 L 0 5 10 15 20 ave_L Figure 7: ✆✠✝✠✟✡➓ as a function of ❂❅❄ and ➒. Figure 7 shows the average queue size ✆✞✝✠✟➓ for a range of values for ❂❄ and ➒. The ✛-axis shows ❂❄ from 0.001 to 0.005, and the ➟-axis shows ➒ from 10 to 100. For example, for ❂❄ t ✿ ✳ ✿✡✿ ✵ , after a queue increase from 0 to 100 packets, the average queue size ✆✞✝✠✟↔■➠❯➠ is 4.88 packets. Given a minimum threshold ☛✌☞✥✍✑✏✓✒ , and given that we wish to allow bursts of ➒ packets arriving at the gateway, 9
then wg should be chosen to satisfy the following equation we compare two methods for calculating the final packet for aug minth marking probability, and demonstrate the advantages of the second method. In the first method when the aver L+1+ (1-n)2+1 1/pb 7 Calculating the packet-marking prob-Figure shows an experiment comparing the two meth ability ods for marking packets. The top line shows Method I, where each packet is marked with probability p, for The initial packet-marking probability p is calculated as p=0.02. The bottom line shows Method 2, where each a linear function of the average queue size. In this section packet is marked with probability p/(1+ip), for p=0.01
then ❂❅❄ should be chosen to satisfy the following equation for ✆✠✝✠✟✡➓➋✗✖☛✌☞✎✍✑✏✓✒ : ➒➙❈ ✵ ❈ ✫❇✵ ✬✮❂➡❄❞✯ ➓★➛ ↔ ✬ ✵ ❂❄ ✗✖☛✤☞✥✍✑✏✓✒★✳ (3) Given ☛✤☞✥✍✑✏✓✒◗t ✇ , and ➒➢t ✇ ✿ , for example, it is necessary to choose ❂❅❄◆✔ ✿ ✳ ✿✡✿❣➤❛✉. 6.2 A lower bound for ➐✕➑ RED gateways are designed to keep the calculated average queue size ✆✞✝✡✟ below a certain threshold. However, this serves little purpose if the calculated average ✆✞✝✠✟ is not a reasonable reflection of the current average queue size. If ❂❅❄ is set too low,then ✆✠✝✠✟ responds too slowly to changes in the actual queue size. In this case,the gateway is unable to detect the initial stages of congestion. Assume that the queue changes from empty to one packet, and that, as packets arrive and depart at the same rate, the queue remains at one packet. Further assume that initially the average queue size was zero. In this case it takes ✬ ✵ ✰✻➥❥➦ ✫❇✵ ✬✮❂❄ ✯ packet arrivals (with the queue size remaining at one) until the average queue size ✆✞✝✠✟ reachs ✿ ✳ ➧✡➊✮t ✵ ✬ ✵ ✰❣❏ [35]. For ❂❄ t ✿ ✳ ✿✡✿ ✵ , this takes 1000 packet arrivals; for ❂❄ t ✿ ✳ ✿✡✿✞✉, this takes 500 packet arrivals; for ❂❅❄❺t ✿ ✳ ✿✠✿➊, this takes 333 packet arrivals. In most of our simulations we use ❂❅❄❅t ✿ ✳ ✿✡✿✠✉. 6.3 Setting ➨➫➩✱➭➲➯➵➳ and ➨➎➸✢➺●➯➵➳ The optimal values for ☛✤☞✥✍✏✓✒ and ☛✌✆✞✛✏✓✒ depend on the desired average queue size. If the typical traffic is fairly bursty, then ☛✤☞✥✍✑✏✓✒ must be correspondingly large to allow the link utilization to be maintained at an acceptably high level. For the typical traffic in our simulations, for connections with reasonably large delay-bandwidth products, a minimum threshold of one packet would result in unacceptably low link utilization. The discussion of the optimal average queue size for a particular traffic mix is left as a question for future research. The optimal value for ☛✚✆✠✛✏✓✒ depends in part on the maximum average delay that can be allowed by the gateway. The RED gateway functions most effectively when ☛✚✆✠✛✏✓✒ ✬➻☛✤☞✥✍✏✓✒ is larger than the typical increase in the calculated average queue size in one roundtrip time. A useful rule-of-thumb isto set ☛✚✆✠✛✣✏✓✒ to atleasttwice ☛✌☞✥✍✑✏✓✒ . 7 Calculating the packet-marking probability The initial packet-marking probability ✂✦ is calculated as a linear function of the average queue size. In this section we compare two methods for calculating the final packetmarking probability, and demonstrate the advantages of the second method. In the first method, when the average queue size is constant the number of arriving packets between marked packets is a geometric random variable; in the second method the number of arriving packets between marked packets is a uniform random variable. The initial packet-marking probability is computed as follows: ✂✦✪✩ ☛✌✆✞✛✧★✫ ✆✞✝✠✟✭✬✮☛✤☞✥✍✑✏✓✒✠✯✱✰ ✫☛✌✆✞✛✣✏✓✒✲✬✮☛✤☞✥✍✑✏✓✒✞✯✴✳ The parameter ☛✚✆✞✛✧ gives the maximum value for the packet-marking probability ✂☎✦ , achieved when the average queue size reaches the maximum threshold. Method1: Geometric random variables. In Method 1, let each packet be marked with probability ✂☎✦ . Let the intermarking time ➼ be the number of packets that arrive, after a marked packet, until the next packet is marked. Because each packet is marked with probability ✂✦ , ➽❉➾ ✸❣➚▲➪➼➶t➫✍✢➹✢t ✫✶✵ ✬ ✂✦ ✯✶➘ ↕ ↔ ✂✦ ✳ Thus with Method 1, ➼ is a geometric random variable with parameter ✂✦ , and ➴✘➪➼➷➹☎t ✵ ✰✂✦ . With a constant average queue size, the goalis to mark packets at fairly regular intervals. It is undesirable to have too many marked packets close together, and it is also undesirable to have too long an interval between marked packets. Both of these events can result in global synchronization, with several connectionsreducing their windows at the same time, and both of these events can occur when ➼ is a geometric random variable. ➬ Method 2: Uniform random variables. A more desirable alternative is for ➼ to be a uniform random variable from ➮ 1, 2, ..., ✵ ✰✂☎✦✴➱ (assuming for simplicity that ✵ ✰✂✦ is an integer). This is achieved if the marking probability for each arriving packet is ✂✦ ✰ ✫❇✵ ✬✃✷❁✸✻✺☎✍✑✼❅✽ ✂✦ ✯ , where ✷❁✸✻✺✣✍✑✼ is the number of unmarked packets that have arrived since the last marked packet. Call this Method 2. In this case, ➽❉➾✸❣➚❣➪➼➶tP✍✢➹❐t ✂✦ ✵ ✬ ✫✍✌✬ ✵ ✯✂✦ ➘ ↕❒➞ →➣❳➠ ❮ ✵ ✬ ✂✦ ✵ ✬❃☞ ✂✦✡❰ t ✂✦✭Ï➵❹✡Ð ✵ ✔✖✍❃✔ ✵ ✰✂✦✴Ñ and ➽❉➾ ✸❣➚▲➪➼Òt➎✍✢➹✢t ✿➙Ï➵❹✠Ð ✍➙Ó ✵ ✰✂✦ ✳ For Method 2, ➴✘➪➼➋➹✢t ✵ ✰ ✫ ✉✂✦ ✯❳❈ ✵ ✰ ✉ . ➬ Figure 8 shows an experiment comparing the two methods for marking packets. The top line shows Method 1, where each packet is marked with probability ✂ , for ✂ t ✿ ✳ ✿✞✉. The bottom line shows Method 2, where each packetis marked with probability ✂ ✰ ✫✶✵ ❈❉☞✂ ✯ ,for ✂ t ✿ ✳ ✿ ✵ , 10