Achieving 100% Throughput in TCP/AQM Under Aggressive Packet Marking With Small Buffer

Do Young Eun, Member, IEEE, and Xinbing Wang, Member, IEEE

Abstract—We consider a TCP/AQM system with large link capacity (NC) shared by many flows. The traditional rule-of-thumb suggests that the buffer size be chosen in proportion to the number of flows (N) for full link utilization, while recent research outcomes show that O(√N) buffer sizing is sufficient for high utilization and O(1) buffer sizing makes the system stable at the cost of reduced link utilization. In this paper, we consider a system where the Active Queue Management (AQM) is scaled as O(N^α) with a buffer of size O(N^β) (0 < α ≤ β ≤ 0.5). By capturing randomness both in packet arrivals and in packet markings, we develop a doubly-stochastic model for a TCP/AQM system with many flows. We prove that, under such a scale, the system always performs well in the sense that the link utilization goes to 100% and the loss ratio decreases to zero as the system size N increases. Our results assert that the system enjoys the benefit of largeness with no tradeoff between full link utilization, zero packet loss, and small buffer size, at least asymptotically. This is in stark contrast to existing results showing that there always exists a tradeoff between full link utilization and the required buffer size. Extensive ns-2 simulation results under various configurations also confirm our theoretical findings. Our study illustrates that blind application of fluid modeling may produce strange results and exemplifies the importance of choosing the right modeling approach for different scaling regimes.

Index Terms—Router buffer sizing, small buffer, stochastic modeling, transmission control protocol.

I. INTRODUCTION

IN THE current Internet, TCP congestion control is responsible for carrying about 90% of the bytes of total traffic generated in the network. Such a congestion control algorithm consists of a network and a source algorithm that are tightly coupled with each other. The network algorithm, or the Active Queue Management (AQM) usually acting on a router, detects an onset of congestion and governs how to generate and update the congestion signal based on the information collected at the router, e.g., input traffic rate, queue size, delay, etc., and then notifies the sender, hoping that the sender will react appropriately to help alleviate the congestion and to attain the best network performance. On the other hand, the source algorithm, which operates at the edge of the network (end-user), dictates how each sender should change its transmission rate or window size in response to the congestion signal (packet drop, delay, or Explicit Congestion Notification (ECN) marks [1]) from the network.

Manuscript received November 22, 2005; revised August 28, 2006, and April 3, 2007; approved by IEEE/ACM TRANSACTIONS ON NETWORKING Editor R. Srikant. This work was supported in part by the National Science Foundation through CAREER Award CNS-0545893.
D. Y. Eun is with the Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC 27695-7911 USA (e-mail: dyeun@ncsu.edu).
X. Wang is with the Department of Electrical Engineering, Shanghai Jiaotong University, Shanghai 200240, China (e-mail: xwang8@sjtu.edu.cn).
Digital Object Identifier 10.1109/TNET.2007.904000
As the number of flows and the size of link capacity in the network continue to grow, the analysis and design of such a large TCP/AQM system are becoming increasingly difficult and involved. Among other issues, the question of buffer sizing has recently received much attention in the literature [2]–[8]. The question is: given the number of TCP flows (N) and the size of link capacity (NC), how do we choose the buffer size B(N)? Under the linear buffer sizing B(N) = O(N),1 it is well known that a "stable" system ensures the convergence of a normalized queue-length to a nonzero value (nonzero queueing delay) and thus achieves 100% link utilization [9], [3], [10], [11], while the square-root buffer sizing (B(N) = O(√N)) turns out to be sufficient to achieve reasonably high link utilization when N is large [2]. More recently, it has been suggested that a very small buffer of size B(N) = O(1) would be enough and would in fact make the system stable at the cost of reduced link utilization [11], [5], [6], [8]. Clearly, different objectives give rise to different guidelines on buffer sizing, and there exists a tradeoff between full link utilization and the amount of required buffer space.

1Throughout this paper, we write f(x) = O(g(x)) when 0 < lim inf_{x→∞} f(x)/g(x) ≤ lim sup_{x→∞} f(x)/g(x) < ∞.

On the other hand, for the design of TCP/AQM schemes associated with the chosen buffer size, the "fluid modeling" of TCP/AQM congestion control and its stability analysis have proven extremely powerful and versatile. Still, different scaling regimes for AQM schemes and buffer sizes often lead to different types of fluid models. For example, when there are N flows with capacity NC and linear scaling (O(N)) is employed for the buffer size and AQM (e.g., all the buffer thresholds for packet marking are O(N)), it has been shown that the system dynamics can be described by the normalized version of the system [12], [13] via law-of-large-numbers type arguments. In this case, several stability criteria have been obtained in terms of parameters of the normalized system [14], [15], [10], where the packet marking is based on the normalized queue-length and the stability of the system refers to the convergence of this normalized queue-length in the steady-state (thus resulting in 100% link utilization). In contrast, when the buffer size and the scaling for AQM are both chosen as O(1), it turns out that the system is better modeled by explicitly capturing the random packet arrivals within each RTT (rate-based AQMs) [16], [11], [5], [17], where the packet marking is based on the normalized rate into the queue and the stability of the system now refers to the convergence of this normalized rate (which is less than
the normalized capacity, thus leading to a reduced link utilization). An "intermediate" scaling of O(N^α) with 0 < α < 1/2 for AQMs and the buffer size was also considered in [5]. However, the attention there was paid mainly to the stability of the system, as opposed to the performance in terms of the link utilization and the queue-length distribution in the steady-state.

In this paper, we focus on TCP/AQM systems with ECN marking under a "sub-square-root" scaling and investigate the system performance in terms of the link utilization and the queue-length distribution in the steady-state. Specifically, we scale the AQM and the buffer size as O(N^α) with 0 < α < 1/2 (more "aggressive" than O(√N)) and analyze the system behavior as N becomes large. By explicitly taking into account the randomness both in packet arrival times and in the number of packets itself, we develop a "doubly-stochastic" Markovian model in which the aggregate arrival to the queue is modeled by a conditional Poisson process—a Poisson process with stochastic rate, where the stochastic rate is determined by the dynamics of random packet markings in the previous RTT. We then analyze this doubly-stochastic model under the proposed scale in the steady-state and show that, as N increases: 1) the link utilization converges to 100% and 2) the queue-length distribution is concentrated right on "target," leading to zero loss probability. Thus our results show that the system enjoys the "benefit of largeness" under the O(N^α) scale in that there is no tradeoff between 100% link utilization, zero loss probability at the queue, and smaller buffer size, at least asymptotically. This is in sharp contrast to other existing results in the literature, which all predict some tradeoff between the buffer size and the link utilization. We also provide extensive simulation results under various settings to confirm all our theoretical findings.

The rest of the paper is organized as follows. In Section II, we first provide background on the issue of buffer sizing at routers and propose our scales for AQM and the buffer sizing. We also illustrate the difficulty of finding a "right" fluid model for the system under the aggressive scale. In Section III, we develop a doubly-stochastic framework to capture the randomness both in packet marking and in times of arrival and describe our modeling details. In Section IV, we prove our main results showing that none of the performance metrics is impaired under the proposed scale asymptotically. In Section V, we provide extensive ns-2 simulations to verify our findings. In Section VI, we illustrate the importance of capturing the randomness in packet arrivals under the aggressive scale and discuss existing results for the proposed scale. Finally, we conclude in Section VII.

II. PRELIMINARIES

A. Scaling Buffers and AQMs Inside a Network

In TCP congestion control, one of the long-held rules-of-thumb in network design is that the buffer size at bottleneck links should be proportional to the bandwidth-delay product [9], [3]. In a large network setting where there are N flows with link capacity NC, this rule-of-thumb suggests O(N) for buffer sizing.
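For a sense of scale, the following short Python sketch (not from the paper; the per-flow bandwidth-delay product and the exponent β are assumed, illustrative values) compares how the three buffer-sizing rules discussed in the Introduction grow with the number of flows N.

def buffer_sizes(N, bdp_per_flow=50, beta=0.3):
    """Compare buffer sizes (in packets) under the three sizing rules."""
    linear = N * bdp_per_flow                  # rule-of-thumb: O(N)
    sqrt_rule = int(bdp_per_flow * N ** 0.5)   # O(sqrt(N)) sizing of [2]
    sub_sqrt = int(12 * N ** beta)             # O(N^beta), beta < 0.5, studied in this paper
    return linear, sqrt_rule, sub_sqrt

for N in (100, 1000, 10000):
    print(N, buffer_sizes(N))

Even for moderate N, the sub-square-root rule calls for orders of magnitude less buffering than the rule-of-thumb.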
Associated with this buffer sizing is the linear scaling for AQM, in which the marking function p^N(x) ∈ [0, 1], or the probability of packet marking when the queue-length is x, is scaled linearly, i.e., p^N(Nx) = p(x) for all N and x, for some function p, where the superscript N in the marking function means that there are N flows with capacity NC [18], [10], [13], [19], [15]. Note that the linear scaling means that packets are marked based on the normalized queue-length.

Under the drop-tail AQM policy with many flows, it is shown in [2] that one can do much better than this traditional rule-of-thumb for buffer sizing. Empirical observation reveals that when N is large, the window sizes of different users become independent. By appealing to the Central Limit Theorem, their sum will behave like a Gaussian random variable with mean equal to the size of the "pipe" and with standard deviation on the order of √N. Hence, by choosing the buffer size on the order of √N to absorb typical fluctuations, one can still achieve high utilization and thus save huge cost for buffers implemented in high-speed core routers. Further, quite recently, far smaller buffer sizing of O(1) has been proposed in [5], [6], [8], all of which however result in a reduced link utilization (bounded away from 1). This observation leads us to pose the following question: can we do better than O(√N), i.e., can we put buffers of size O(N^β) with β < 0.5 instead, while maintaining full link utilization and low packet loss? The only available result in the literature using this range of scale is in [5], where it was suggested that the system under O(N^α) scale would promote instability for very large N (thus undesirable). Still, it lacks any performance investigation in terms of a possible tradeoff between the link utilization and the queue-length behavior under such a scale.

B. Aggressive Packet Marking

In this paper, we explore the possibility of using buffer sizes of O(N^β) and AQM schemes with O(N^α) scaling, where 0 < α ≤ β ≤ 0.5. By this scaling of TCP/AQM, we mean that the buffer size is on the order of N^β and, for AQM schemes, there exists a function p(x) such that p(x) = p^N(N^α x) for all N and x ≥ 0 (see Fig. 1). Note that this can be interpreted as a more aggressive marking than the case of α = β = 1 (linear scale) since the marking probability is much higher for the same queue-length.2 We note that the difference in marking scales simply means a different parameter setup at the routers, which can be easily configured as desired. Moreover, the scaling of marking functions, or AQM schemes in general, is directly related to the issues of network design over a long time scale. For example, if the number of connections (or customers) were to increase by ten times, then a natural question to ask is: what percentage of additional capacity or buffer size is needed at routers to provide the same level of performance to each user? Under the linear scale (α = β = 1), we would have to increase the buffer size, as well as all the parameters associated with the AQM, by 10 times. If our aggressive scale is to be used, the buffer size and the AQM parameters need to be increased very little (e.g., 10^0.3 ≈ 2 and 10^0.2 ≈ 1.6 when α = 0.2 and β = 0.3).

Fig. 1 shows some examples of marking functions with the scale of N^α. Note that all the thresholds for the marking are on the order of N^α.

2Our scale is even more aggressive than the case of α = 0.5.
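To make the scaling relation p(x) = p^N(N^α x) concrete, the short Python sketch below (with assumed RED-style thresholds, not the exact parameters of Section V) evaluates the marking probability assigned to the same absolute queue-length under the linear scale (α = 1) and under an aggressive scale (α = 0.2); the aggressive scale marks far earlier.

def base_marking(x, q_min=2.0, q_max=10.0, p_max=0.2):
    """RED-type base function p(x): zero below q_min, linear up to p_max at q_max, then one."""
    if x <= q_min:
        return 0.0
    if x <= q_max:
        return p_max * (x - q_min) / (q_max - q_min)
    return 1.0

def marking_prob(queue_len, N, alpha):
    # p^N(q) = p(q / N^alpha); alpha = 1 is the linear scale, alpha < 0.5 the aggressive one
    return base_marking(queue_len / (N ** alpha))

N, q = 1000, 20   # 20 packets queued at a link shared by N flows
print(marking_prob(q, N, alpha=1.0))   # linear scale: no marking at all
print(marking_prob(q, N, alpha=0.2))   # aggressive scale: noticeable marking probability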
Fig. 1. Examples of marking functions with aggressive marking scale of N^α. (a) Linear marking: RED type. (b) Exponential marking: REM type.

Suppose that the system performs as desired in the steady-state,3 i.e., the queue-length will be O(N^α) with high probability. Then, we expect that the system will achieve almost 100% utilization while the packet loss probability with a buffer of size O(N^β) (β > α) will be very small for large N. Our ns-2 simulation results show that this is indeed the case (see Section V for details). Since 0 < α ≤ β ≤ 0.5, this implies that we can in fact do much better than O(√N) buffer sizing [2] while maintaining all the good performance measures (full link utilization, negligible delay, and low loss).

3Here, the steady-state should be interpreted in a distributional sense, not as a constant queue-length.

C. Deriving Fluid Models Under Aggressive Scales

Suppose now that one wants to apply the existing stability criteria in the literature to the system under the aggressive scale. For example, according to [10], [15], [18], the criterion for linear stability of the system with queue-based marking (obtained from the Nyquist criterion) bounds a certain quantity of the linearized system by a system-independent constant K. On the other hand, the stability of all rate-based marking systems assuming a Poisson type of packet arrivals requires that the slope of the marking function at the equilibrium be bounded, which then translates into the condition that the buffer size be bounded, i.e., B(N) = O(1), or at least that O(1) scaling should be employed for the AQM [16], [11]. Note that, for any α ∈ (0, 1/2) and large N, none of these criteria is satisfied, and this leads to the conclusion that the system under the aggressive scale is always unstable, albeit with all the good performance measures as numerically observed. This is an embarrassing situation since an 'unstable' system still possesses all the good performance measures. So, one may ask: "What went wrong?," "What causes this self-conflicting conclusion?"

The above illustrates that one has to be very careful in choosing a "right" fluid model (or approximation) of the system under consideration. In general, depending on the scale of the system or the type of limiting regime employed, one can derive different fluid models (or approximations) that effectively capture the dynamics of the original stochastic system and then discuss their stability. For example, under the linear scale for AQM and the buffer size, one can readily obtain fluid models for the normalized arrival rate and queue-length, where the marking is based on the normalized queue-length and the stability here refers to the convergence of the normalized queue-length [10], [15], [20], [21]. Similarly, in [13], [19] the authors derived a limiting system dynamics by considering random packet markings under the O(N) scale (but ignoring random packet arrivals within each RTT). On the other hand, under the O(1) scale with constant buffer size, the resulting fluid models are usually of rate-based type, where the marking is based on the arrival rate (e.g., an incoming packet is marked with probability (λ/C)^B where packets arrive randomly with an average rate λ and the buffer size is B) and the stability now refers to the convergence of the arrival rate, from which the steady-state queue-length distribution can be inferred [16], [11], [5].
Further, in [17], [22], [23], where all the system parameters are kept constant (not dependent on N), some appropriate fluid models were obtained by sending the 'weight' of the exponential averaging for the queue-length to zero (or making it very small) and by making the exponentially averaged queue-length almost constant (separation of time-scales).

In this regard, it is now clear what went wrong in the aforementioned arguments. Specifically, the stability criterion in [10] is not meant to be used for systems with O(N^α) scale, as it was originally derived for a system with O(N) scale. Similarly, those in [17], [16], [11] are valid only under O(1) scale (or under a rate-based marking). Certainly, blind application of the fluid model could lead to very strange results. For the system with O(N^α) scale, however, it is still unclear how to derive a "right" fluid model. This scale regime was discussed in [5], where the authors considered "underload" and "overload" situations separately and claimed that the system would be stable for moderate values of N, but tends to be unstable when N is very large (say, 5000 or more). However, this seems to conflict with the observation made in [2] and also with our theoretical and simulation results later in this paper showing that the system performance gets better for larger N. (See more discussion on this in Section VI-B.) As an attempt to address this "grey area", in this paper, we instead construct a fully stochastic model by taking the packet-level dynamics into account and analyze its performance in the steady-state, thus bypassing the difficulty of deriving a "right" fluid model for the system under the proposed aggressive scale.

III. SYSTEM MODEL

A. A Doubly Stochastic Approach

In this section, we develop our doubly-stochastic approach to capture the randomness both in packet arrival instants and in random packet markings. Consider a single link shared by N persistent flows, whose window sizes evolve according to the AIMD rule. Let T > 0 be the round-trip-time delay (RTT) for all flows. We define W_i^N(k) to be the window size of flow i at the start of the k-th RTT, where the superscript N means that there are N flows with capacity NC. Let w_max (> CT + 1) be the maximum window size for all flows, i.e., 1 ≤ W_i^N(k) ≤ w_max for all i and N. Then, each window size W_i^N(k) evolves as follows:

W_i^N(k+1) = (W_i^N(k) + 1) ∧ w_max, if flow i receives no marking during the k-th RTT; W_i^N(k)/2 ∨ 1, otherwise,
where x ∧ y := min{x, y} and x ∨ y := max{x, y}. We define the set of window sizes for all N flows by W^N(k) := (W_1^N(k), W_2^N(k), ..., W_N^N(k)).

Fig. 2. Overview on our doubly stochastic approach.

Fig. 2 illustrates our doubly-stochastic approach. Suppose first that, at the start of the k-th RTT, all the window sizes for the N flows are given, i.e., W^N(k) = w^N(k), where w^N(k) := (w_1^N(k), ..., w_N^N(k)). (Note here that we use capital letters for random variables and lower case letters for their realizations.) Then each source i simply transmits w_i^N(k) packets over each RTT onto the network at the source side. These packets traverse a number of links (upstream queues) being multiplexed with other flows, and suffer different amounts of delay jitter. Thus, by the time they arrive at the queue of interest deep inside the network, their arrival times have become random. In other words, given W^N(k) = w^N(k), the arrival to the queue of interest becomes a random process with a set of parameters depending on w^N(k), making the queue-length fluctuation also random.

Now, for a given realization of the queue-length fluctuation, we can find the distribution of whether incoming packets are marked or not from a specific marking function p^N(x). Thus, by taking the expectation over all possible queue-length realizations, we can calculate the distribution of whether there is any marked packet for flow i, which in turn determines the distribution of the window size for the (k+1)-th RTT duration, i.e., W_i^N(k+1). Therefore, given W^N(k) = w^N(k), it is possible to find the distribution of W^N(k+1) by capturing all the dynamics mentioned above, and as a consequence, there exists a Markovian structure between W^N(k) and W^N(k+1).

B. Model Description

In this section, we describe our model in detail and construct a Markov chain for W^N(k). As before, at the start of the k-th RTT, suppose that all the window sizes are known. Then, in order to capture the random nature of packet arrivals, given W_i^N(k) = w_i^N(k) for i = 1, 2, ..., N, we assume that the actual aggregate packet arrivals to the queue within one RTT can be modeled by a Poisson process with mean rate λ^N(k) given by

λ^N(k) = (w_1^N(k) + w_2^N(k) + ... + w_N^N(k)) / T.    (1)

See Fig. 3 for an illustration. Thus, the arrival process to the queue of interest is modeled by a conditional Poisson process (or doubly stochastic Poisson process) [24], [25].

Fig. 3. Model for the arrival process to the queue within each RTT.
Fig. 4. Heterogeneous RTTs: Performance metrics with increasing number of flows (N) for fixed α = 0.2.
Fig. 5. Heterogeneous RTTs: Performance comparison of RED1 and RED2 with increasing number of flows (N) for different α (α = 0.2 and 0.3).

We note that, under W^N(k) = w^N(k), the Poisson assumption for the aggregate arrivals to the queue within one RTT is reasonable as long as packet arrivals from each flow are jittered independently at different upstream queues, due to the fact that the superposition of independent point processes under a suitable scaling converges weakly to a Poisson process [26], [27]. Such Poisson packet arrivals over a small time scale (sub-second scale or RTT-level) have also been empirically observed in [28]–[31].

Remark: Note that a conditional Poisson process is not a Poisson process, since the rate over one RTT is given by the sum of the random window sizes. In other words, the rate itself is a random process, making the arrival process doubly-stochastic. Similar doubly-stochastic approaches for TCP/AQM systems have been used in [17], [11], [8].
However, results in [17], [11] are not applicable to systems under the aggressive scale, and in [8], the conditional Poisson process is used only to show that the process is dominated by a standard Poisson process with some
larger constant rate (in terms of the maximum window size of each flow), and all the analysis there was conducted via this standard Poisson process for the system under O(1) scale with constant buffer size. Instead, we focus on different scalings and attempt to solve the constructed Markov chain for the doubly-stochastic model (random arrivals and random packet markings) in the steady-state with its stationary distribution, and derive asymptotic performance results.

Fig. 6. (a) h(N) = Var{Σ_i W_i^N} / Σ_i Var{W_i^N} as a function of N for different AQM schemes under α = 0.2; (b) Histogram for aggregate window sizes from N flows under RED1, N = 200, and α = 0.2. (a) h(N). (b) PDF of aggregate window sizes.

As before, p^N(x) here is the probability that a packet is marked when the queue-length is x. In particular, our marking function p^N(x) of scale α ∈ (0, 1/2) will satisfy the following: there exists a non-decreasing function p: R+ → [0, 1] such that, for all N and x,

p^N(N^α x) = p(x)    (2)

where

p(0) = 0 and lim_{x→∞} p(x) = 1.    (3)

Note that this assumption is very general and the base marking function p(x) can be any non-decreasing function with p(0) = 0 and p(∞) = 1. See Fig. 1 for examples satisfying the above assumptions.

Within one RTT and given W^N(k) = w^N(k), since we assume that the arrival is a Poisson process with mean rate λ^N(k), we can find the distribution of the queue-length during the k-th RTT by using any standard queueing model such as M/D/1 or M/G/1 as long as λ^N(k) < NC. Further, we assume that if λ^N(k) ≥ NC, all the incoming packets are marked.4 Now, given W^N(k) = w^N(k) during the k-th RTT, let Q_{w^N(k)} be the random variable denoting the queue-length fluctuation due to the random arrival instants. From our formulation above, we see that Q_{w^N(k)} is distributed according to the steady-state queue-length distribution of the M/D/1 or M/G/1 queue if λ^N(k) < NC, where λ^N(k) is from (1). When λ^N(k) ≥ NC, we can conveniently set Q_{w^N(k)} = ∞ since p^N(∞) = 1 from (3) and all the incoming packets will be marked. Also, this implies that W_1^N(k) + ... + W_N^N(k) < 2NCT at all times, since all the flows will drop their rates by half once the sum of the window sizes goes beyond NCT.

Now, given W^N(k) = w^N(k) and given a realization of the queue-length, each packet will be marked with probability p^N(Q_{w^N(k)}). Since flow i transmits w_i^N(k) packets, we see that flow i receives no mark during the k-th RTT with probability

β_{i,N,k} := E_Q{ (1 − p^N(Q_{w^N(k)}))^{w_i^N(k)} }    (4)

where E_Q is the expectation over all possible realizations of the queue-length under W^N(k) = w^N(k).

Given W^N(k) = w^N(k), suppose first λ^N(k) < NC. Then the queue remains stable within that RTT (i.e., utilization < 1), and each packet will be marked independently of any other. We thus expect that the events of flow i receiving at least one marked packet are likely to be independent for different i, which implies independence of window sizes among different flows at the next RTT. In the case of λ^N(k) ≥ NC, since all the flows will back off in the next RTT, the window sizes at the next RTT will not be independent. However, we will show that this event becomes rare for large N in the sense that P{W_1^N + ... + W_N^N ≥ NCT} → 0 as N increases (see Lemma 1). Hence, in either case, we can assume the following:

Assumption 1: Given W^N(k) = w^N(k), the random variables W_i^N(k+1) (i = 1, 2, ..., N) are independent.

Indeed, our simulation results in Section V-C will also show that the window sizes of different flows are mostly independent even under our aggressive marking scale (see Fig. 6). Further, Assumption 1 will be used only to construct a Markov chain for the system and will never be used again once the system enters a steady-state.

4This corresponds to an unstable queue with utilization ρ > 1. Thus, for large N, the queue-length will explode for any marking scale N^α (0 < α < 1/2) within that RTT and almost all packets will be marked.
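The construction above can be summarized in a few lines of Python. The sketch below is a simplification with assumed parameters (RED-style thresholds, per-flow pipe size CT, and the M/M/1 approximation that is introduced only later in Section IV), and it draws a single shared queue-length sample per RTT rather than a full packet-level trace; it performs one transition of the chain by computing the conditional Poisson rate (1), sampling a queue-length, marking each flow independently, and applying the AIMD update.

import math, random

def sample_mm1_queue(rho):
    # steady-state M/M/1 queue-length: P{Q >= n} = rho^n (geometric)
    u = 1.0 - random.random()
    return int(math.log(u) / math.log(rho)) if 0.0 < rho < 1.0 else 0

def marking_prob(q, N, alpha=0.2, q_min=2.0, q_max=10.0, p_max=0.2):
    # RED-type aggressive marking with assumed thresholds: p^N(q) = p(q / N^alpha)
    x = q / (N ** alpha)
    if x <= q_min:
        return 0.0
    if x <= q_max:
        return p_max * (x - q_min) / (q_max - q_min)
    return 1.0

def one_rtt_transition(w, N, C=10.0, T=1.0, w_max=64, alpha=0.2):
    """One step W^N(k) -> W^N(k+1) of the doubly-stochastic model (illustrative only)."""
    lam = sum(w) / T                  # conditional Poisson rate, cf. (1)
    rho = lam / (N * C)               # within-RTT load
    p = 1.0 if rho >= 1.0 else marking_prob(sample_mm1_queue(rho), N, alpha)
    new_w = []
    for wi in w:
        # flow i escapes marking only if none of its wi packets is marked, cf. (4)
        marked = random.random() > (1.0 - p) ** wi
        new_w.append(max(wi // 2, 1) if marked else min(wi + 1, w_max))
    return new_w

# Example: N flows starting at window 5, iterated for a few hundred RTTs.
N, w = 200, [5] * 200
for _ in range(300):
    w = one_rtt_transition(w, N)
print(sum(w) / (200 * 10.0))          # rough utilization estimate in steady state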
Now, for any given N, {W^N(k)}_{k≥0} forms an N-dimensional homogeneous Markov chain with its state space given by

E := {1, 2, ..., w_max}^N ∩ S(N)    (5)
where S(N), given in (6), accounts for the constraint noted above that the window sizes of the N flows always sum to less than 2NCT. From Assumption 1, the transition probability is given by

P{W^N(k+1) = w^N(k+1) | W^N(k) = w^N(k)} = Π_{i=1}^{N} P{W_i^N(k+1) = w_i^N(k+1) | W^N(k) = w^N(k)}    (7)

where

P{W_i^N(k+1) = w_i^N(k+1) | W^N(k) = w^N(k)} = β_{i,N,k}, if w_i^N(k+1) = (w_i^N(k) + 1) ∧ w_max; 1 − β_{i,N,k}, if w_i^N(k+1) = w_i^N(k)/2 ∨ 1; 0, otherwise,    (8)

and β_{i,N,k} is from (4). In the next section, we will investigate the limiting behavior of this Markov chain in the steady-state as N increases and provide key performance results under the aggressive scale.

IV. THEORETICAL RESULTS

From the Markov chain defined in (7) and (8) with its state space given by (5) and (6), it is not difficult to see that the chain is irreducible and aperiodic. Further, for any given N, the state space E is finite. Thus, from Theorem 3.3 in [32, p. 105], the chain is positive recurrent and hence ergodic. So, there exists a unique stationary distribution π, and the process {W^N(k)}_{k≥0} converges in variation to π [32], [33]. In other words, we have

lim_{k→∞} sup_{w ∈ E} | P{W^N(k) = w} − π{w} | = 0.    (9)

This proves that, regardless of the initial distributions of the window sizes of the N flows, the chain converges to a steady-state in which the process W^N(k) has a stationary distribution π. In the steady-state, we basically have a steady-state queueing system, where the input process to the queue is a conditional Poisson process (which has stationary increments, but not independent increments) and its underlying process W^N(k) is a homogeneous, ergodic Markov chain with a stationary distribution π. As the distribution of W^N(k) does not depend on k in the steady-state, we will use W^N to denote the window-size random variables in the steady-state and π to denote the distribution of W^N whenever there is no ambiguity.

In principle, since we know the transition matrix from (7) and (8), it is theoretically possible to find the stationary distribution π by solving a set of balance equations. However, due to the complicated structure of the transition matrix (especially for large N), direct calculation of the steady-state distribution is computationally prohibitive. Instead, our approach here is to derive properties of key performance metrics of the system as N increases, without explicitly solving for the stationary distribution.

From (9), since (1 − p^N(Q_{W^N(k)}))^{W_i^N(k)} is bounded, we know that the (unconditional) probability of flow i not being marked during the k-th RTT also converges to the following steady-state probability of flow i not being marked:

β_i(N) := E{ (1 − p^N(Q_{W^N}))^{W_i^N} }    (10)

where the expectation is taken both over W^N and over the random queue-length fluctuation. Specifically, we can calculate (10) as follows:

β_i(N) = Σ_{w ∈ E} π{w} E_Q{ (1 − p^N(Q_w))^{w_i} }.

The following result will be used in proving our main results. Due to space limitations, we only provide an outline of the proof in this paper. Please see our technical report [34] for complete proofs.

Proposition 1: For any given N > 0 and i (1 ≤ i ≤ N), we have

0 < β_i(N) < 1.    (11)

Further, there exists a constant m > 0 (independent of N) such that

m ≤ E{ p^N(Q_{W^N}) W_i^N }.    (12)

Sketch of the Proof: We basically want to show that the expression in (10) is neither too close to 0 nor too close to 1. To see this, suppose that β_i(N) ≈ 0 for some large N. Then, for most sample paths (realizations), each flow i will have a very high probability of being marked, and so most of the N flows will decrease their window sizes by half. Thus, the distribution of the window sizes for the next RTT duration will be different from the current one, and this contradicts the fact that the system is in steady-state with a stationary distribution. Similar arguments hold for the case where it is too close to 1.
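As a quick numerical illustration of the quantity in (10) (again with assumed parameters, a fixed per-flow window, and M/M/1 queue-length samples standing in for the true stationary window distribution), the sketch below estimates the no-mark probability E{(1 − p^N(Q))^w} by Monte Carlo; for the assumed load close to 1, the estimate stays bounded away from both 0 and 1 as N grows, in the spirit of Proposition 1.

import math, random

def sample_mm1_queue(rho):
    # steady-state M/M/1 queue-length: P{Q >= n} = rho^n
    u = 1.0 - random.random()
    return int(math.log(u) / math.log(rho))

def marking_prob(q, N, alpha=0.2, q_min=2.0, q_max=10.0, p_max=0.2):
    # RED-type aggressive marking with assumed thresholds
    x = q / (N ** alpha)
    if x <= q_min:
        return 0.0
    if x <= q_max:
        return p_max * (x - q_min) / (q_max - q_min)
    return 1.0

def no_mark_probability(N, w=8, rho=0.95, samples=20000):
    """Monte Carlo estimate of E{(1 - p^N(Q))^w}, cf. (4) and (10)."""
    total = 0.0
    for _ in range(samples):
        p = marking_prob(sample_mm1_queue(rho), N)
        total += (1.0 - p) ** w
    return total / samples

for N in (100, 400, 1600):
    print(N, round(no_mark_probability(N), 3))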
We define the steady-state utilization of the system by

ρ(N) := (1/(NCT)) E{ W_1^N + W_2^N + ... + W_N^N }.    (13)

As the system is in the steady-state with a stationary arrival process to the queue (a conditional Poisson process), we must have ρ(N) < 1 for all N, since otherwise the queue-length would increase without bound almost surely and eventually all the flows would drop their rates by half, contradicting the fact that the distribution of window sizes does not depend on time.

In the steady-state, we already observed that the stationary arrival process to the queue of interest becomes the conditional (or doubly stochastic) Poisson process whose underlying process is a stationary, ergodic Markov chain. Hence, given W^N(k) = w^N(k), within that RTT duration, the arrival process to the queue becomes a Poisson process with rate λ^N(k) as in (1). In general, under W^N(k) = w^N(k), this Poisson characterization within that RTT enables us to describe the queueing
dynamics in terms of λ^N(k) and any general service time distribution G. However, in order to keep the analysis tractable, we will assume throughout the rest of the paper that the service times of the packets are independent and identically distributed (i.i.d.) and exponentially distributed. In this way, given W^N(k) = w^N(k), the queue dynamics can be described by that of an M/M/1 queue within each RTT.5

5Given W^N(k) = w^N(k), the queue dynamics can be similarly analyzed via M/M/1/K or M/D/1/K (with a buffer of size K = O(N^β)). In order to avoid unnecessarily complicated derivations, however, we use M/M/1 as an approximation. Note also that the loss probability of a finite buffer of size K is bounded by the overflow probability of an infinite buffer with level K.

Let us define X_i^N := W_i^N / (ρ(N)CT) so that E{X_i^N} = 1, and also the scaled sum Y_N by

Y_N := (X_1^N + X_2^N + ... + X_N^N − N) / √N.    (14)

We will then assume the following:

Assumption 2: The distribution of Y_N for large N is well-behaved in the sense that, for any positive sequence {θ_N} with θ_N → 0 as N → ∞, we have

lim sup_{N→∞} E{ e^{θ_N Y_N} } < ∞.    (15)

Remark 2: Assumption 2 is quite technical and requires only that the moment generating function of Y_N (with vanishing exponent θ_N) be finite. It basically states that the scaled sum of the X_i^N does not diverge quickly as N grows. Indeed, if the X_i^N (i = 1, ..., N) are i.i.d. or weakly dependent, then by the CLT, for large N, the distribution of Y_N will not be very different from a Gaussian random variable, for which (15) trivially holds. Moreover, this assumption is readily satisfied in practice, as we will show in Section V-C that the window sizes of different flows behave as if they were more or less independent under our aggressive marking scale (see Fig. 6).

Our first main result shows that the utilization of the steady-state system converges to 1 as N increases.

Theorem 1: Under Assumption 2, we have

lim_{N→∞} ρ(N) = 1.

Sketch of the Proof: Since ρ(N) < 1 for each N, we only have to show that lim inf_{N→∞} ρ(N) ≥ 1. From (12) and the convexity of (1 − p)^w in p, we can show by using Jensen's inequality that the packet marking probability is lower-bounded; specifically, there exists a constant c > 0 such that E{p^N(Q_{W^N})} ≥ c. Since p^N(x) is nondecreasing in x, we have p^N(x) ≤ p^N(η) + 1{x ≥ η} for all positive η. Then, by choosing η = qN^α and taking expectations, we can similarly show that there exists a constant B such that 0 < B ≤ P{Q_{W^N} ≥ qN^α} for some q ∈ (0, ∞). Since the queue dynamics is modeled by M/M/1 under W^N = w^N, we can show by using the conditional expectation that

P{Q_{W^N} ≥ η} ≤ E{ ((W_1^N + W_2^N + ... + W_N^N)/(NCT))^η }    (16)

for any η > 0. Set η = qN^α and note that, using Σ_i W_i^N/(NCT) = ρ(N)(1 + Y_N/√N) and 1 + u ≤ e^u, the above is bounded by

ρ(N)^{qN^α} E{ exp(qN^{α−1/2} Y_N) }.    (17)

Now, observe that (17) can be rewritten as

exp( qN^α log ρ(N) ) · E{ exp(qN^{α−1/2} Y_N) }.    (18)

Since α < 1/2, the exponent qN^{α−1/2} vanishes, and Assumption 2 guarantees that the second term in (18) is finite; in other words, there exists a constant D such that the second term in (18) is at most D for all sufficiently large N. Hence, since 0 < B ≤ (18), by taking logs we must have

lim inf_{N→∞} N^α log ρ(N) > −∞    (19)

which readily implies ρ(N) → 1. This completes the proof.

Theorem 1 guarantees that ρ(N) is arbitrarily close to 1 for all large N. Combined with (11) and (12) from Proposition 1, it is then straightforward to see that there exist constants 0 < c_1 ≤ c_2 < 1 such that

c_1 ≤ E{ p^N(Q_{W^N}) } ≤ c_2 for all large N    (20)

(also, see the proofs in [34]). In other words, the expected marking probability of each packet is bounded away from 0 and 1 uniformly over N. Before we proceed to our second main result, we present the following lemma.

Lemma 1: Suppose CT > 3. Then, under Assumption 2, we have

lim_{N→∞} P{ W_1^N + W_2^N + ... + W_N^N ≥ NCT } = 0.    (21)

Sketch of the Proof: First, since the Markov chain is ergodic, for any bounded function f: E → R, we have from Theorem 4.1 in [32, p. 111] that

lim_{K→∞} (1/K) Σ_{k=1}^{K} f(W^N(k)) = E_π{ f(W^N) } almost surely    (22)

where E_π is the expectation with respect to the stationary distribution π of W^N.
Suppose that (21) is not true. Then, with strictly positive probability, the sum of the window sizes becomes at least NCT. Whenever this happens, we know that all the flows receive marks, so they all reduce their window sizes by half in the next RTT. Now, observe that once all of them are halved, it will take a certain number of RTTs for the sum of the window sizes to cross NCT again, as the sum of the window sizes can increase by at most N per RTT. During that period (i.e., an interval between two consecutive "crossing the NCT" events), the sum of the window sizes remains smaller than NCT, and this type of behavior will repeat forever. So, by taking the time average and using (22), we arrive at the conclusion that the steady-state utilization is strictly bounded away from 1, which contradicts the result of Theorem 1 for large N. Therefore, Lemma 1 follows. See [34] for the complete proof.
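The argument above is easy to visualize with a small Monte Carlo sketch of the doubly-stochastic model (this code is ours, not the authors'): within each RTT the queue is treated as an M/M/1 queue whose load is determined by the current windows, each flow is marked with the resulting overflow probability at a single threshold of order N^α, and marked flows halve their windows while unmarked flows grow by one packet per RTT. The per-flow rate C = 25 packets/s and mean RTT T = 0.2 s mirror the setup of Section V, but the single-threshold profile, the flow-level (rather than packet-level) marking, and the specific constants are simplifying assumptions.

import numpy as np

# Minimal sketch of the doubly-stochastic TCP/AQM model (illustrative assumptions only):
# N AIMD flows share a link of capacity N*C; within each RTT the queue behaves as an
# M/M/1 queue with load rho = sum(W)/(N*C*T), and a flow is marked with probability
# P{Q >= q_th} = rho**q_th (the M/M/1 tail), where q_th = O(N**alpha).
rng = np.random.default_rng(0)

def run(N, C=25.0, T=0.2, alpha=0.2, rtts=2000):
    q_th = 10.0 * N**alpha              # single marking threshold, scaled as O(N^alpha)
    W = np.full(N, C * T)               # start each window at the per-flow delay-bandwidth product
    util = []
    for _ in range(rtts):
        rho = W.sum() / (N * C * T)     # normalized offered load within this RTT
        util.append(min(rho, 1.0))      # link utilization cannot exceed 1
        p_mark = 1.0 if rho >= 1.0 else rho ** q_th   # crude: certain marking when overloaded
        marked = rng.random(N) < p_mark
        W = np.where(marked, W / 2.0, W + 1.0)        # AIMD: halve on mark, else +1 packet/RTT
    return np.mean(util[rtts // 2:])    # discard the first half as warm-up

for N in (50, 200, 800):
    print(N, round(run(N), 4))

The printed utilizations are only indicative, but they drift upward with N, in line with Theorem 1 (slowly for small α), and the halving-and-rebuilding cycles of the aggregate window are exactly the "crossing the NCT" episodes used in the proof sketch of Lemma 1.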
We now present our second main result, which states that the fluctuation of the steady-state queue-length is no more than O(N^{α+ε}).

Theorem 2: Let Q_{W^N} be the steady-state queue-length random variable. Suppose that CT > 3 and Assumption 2 is satisfied. Then, for any ε > 0, we have

    Q_{W^N} / N^{α+ε} → 0 in probability as N → ∞.    (23)

Sketch of the Proof: As in the proof of Theorem 1, we can show from Proposition 1 that there exists a constant η < 1 such that P{Q_{W^N} ≥ dN^α} ≤ η for some d ∈ (0, ∞). Then, from (16) with the level dN^α and from Lemma 1, it is not difficult to see that

    E{[(W_1^N + W_2^N + ... + W_N^N)/(NCT)]^{dN^α}} ≤ η̃

for some constant η̃ < 1 and for all sufficiently large N. Using the same steps as in the last part of the proof of Theorem 1, we obtain (24). Further, since ε > 0 is arbitrary, without loss of generality we only have to prove (23) for ε with α + ε < 1/2. Note that, from (24), there exists a constant κ < ∞ such that N^α log ρ(N) ≥ −κ for all sufficiently large N. Thus, (25) holds for all sufficiently large N. Now, as before, from (16) with the choice of the level N^{α+ε}, it follows that P{Q_{W^N} ≥ N^{α+ε}} is bounded by the two-term expression in (26). From Assumption 2 and (25), and by the argument used in (18), the first term in (26) can be bounded by M · exp(−N^ε) for some constant M < ∞. Further, Lemma 1 asserts that the second term in (26) also decreases to zero, and we are done. See [34] for the complete proof.

Remarks 3: Relations (19) and (24) tell us how fast ρ(N) converges to 1. Specifically, we can rewrite (19) and (24) as log ρ(N) = O(1/N^α), or ρ(N) ≈ exp(−κN^{−α}) for some constant 0 < κ < ∞. Note that the speed of convergence depends on the scaling parameter α; in particular, it can be quite slow for very small α.

In Theorem 2, the positive constant ε plays a critical role in ensuring that Q_{W^N} is no larger than O(N^{α+ε}). From Proposition 1 and Theorem 1, we can see that E{p^N(Q_{W^N})} is always bounded away from 0 and 1 (uniformly over N). Since the marking function is scaled on the order of N^α (recall p^N(N^α x) = p(x)), this in turn implies that P{Q_{W^N} ≥ O(N^α)} is also bounded away from 0 and 1 uniformly over N. Thus, if we set the buffer size to O(N^α), the loss probability will not go to zero. Instead, the added "margin" of length O(N^ε) is necessary to make the loss probability go to zero under the buffer of size O(N^{α+ε}) and to ensure that all the "congestion signals" come from packet marking, not from packet loss.

Remark: From the practical point of view, however, the added margin O(N^ε) may grow very slowly. For example, if ε = 0.1, then even for N = 1000 flows we have N^ε ≈ 2, and this margin (a factor of 2) may not be large enough to ensure almost zero packet loss. In other words, as in Theorem 1, the speed of convergence can be quite slow. But we emphasize that our results provide a theoretical basis: the loss probability can indeed be made arbitrarily small for any given constant ε > 0 and for all "sufficiently large" N.

TABLE I
AQM PARAMETERS FOR NS-2 SIMULATIONS

    AQM    Parameters
    RED1   q_min N^α = 2N^α, q_max N^α = 10N^α, p_max = 0.2, buffer size B(N) = 12N^{α+ε}
    EXP    γ = −10 / ln(1 − p_max), buffer size B(N) = 12N^{α+ε}
    RED2   q_min N^α = 4N^α, q_max N^α = 11N^α, p_max = 0.05, buffer size B(N) = 12N^{α+ε}

V. NUMERICAL RESULTS

In this section, we provide numerical results using ns-2 [35] to validate our results in Section IV-C. First, we show that even under our aggressive scale, the window sizes of the N flows tend to be independent or weakly correlated, which is required for our theoretical analysis. Second, we show that as the number of flows and the size of the link capacity increase,^6 the queueing delay becomes smaller, the link utilization increases, and the packet loss probability decreases, although the convergence of the loss probability is a bit slow, as predicted.
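As a quick numeric companion to Remarks 3 and the remark above (this snippet is ours; the constant κ = 1 below is an arbitrary placeholder, since the paper only asserts 0 < κ < ∞), the following lines evaluate the predicted utilization scaling ρ(N) ≈ exp(−κ N^{−α}) together with the buffer margin N^ε:

import math

alpha, eps, kappa = 0.2, 0.1, 1.0   # kappa is illustrative; alpha and eps match Section V

for N in (100, 1000, 10000, 100000):
    rho = math.exp(-kappa * N ** (-alpha))   # utilization scaling suggested by Remarks 3
    margin = N ** eps                        # extra buffer margin O(N^eps) from Theorem 2
    print(f"N={N:6d}  rho(N) ~ {rho:.3f}  N^eps ~ {margin:.2f}")

Over three decades of N, the margin N^0.1 grows only from about 1.6 to about 3.2, and the utilization estimate creeps toward 1, which is consistent with the observation that both convergence speeds depend strongly on α.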
Throughout the simulations, we consider the following four performance metrics: 1) average queueing delay; 2) delay jitter (we calculate the standard deviation of all the measured queueing delays); 3) link utilization; and 4) packet loss ratio.

A. AQM Configurations

We consider three different queue-based AQM schemes with O(N^α) scales for our simulations: two RED schemes under different configurations (referred to as RED1 and RED2) and EXP as in Fig. 1(b). We set all the thresholds in the queue-length to be proportional to N^α (i.e., p^N(N^α x) = p(x)). Table I summarizes the parameters of each AQM scheme used in our simulations.

For RED1, the minimum and the maximum thresholds (q_min N^α and q_max N^α in Fig. 1(a)) are 2 × N^α and 10 × N^α, respectively. The maximum dropping probability, p_max, is set to 0.2, and the queue averaging factor, q_weight_, is set to 1. (We have also run simulations with other queue averaging factors, e.g., q_weight_ = 0.002, and obtained similar results.) As to EXP, we use a queue-based AQM with an exponential marking profile as in Fig. 1(b). In order to make a fair comparison with RED, we choose γ (as in Fig. 1(b)) such that p(max_th) = p^N(10N^α) = p_max. RED2 is similar to RED1, but with different thresholds and p_max.

^6 Note that each flow will get approximately the same bandwidth share regardless of the size of the system N.
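The scaling rule p^N(N^α x) = p(x) above is easy to make concrete in code. The sketch below (ours, not the paper's implementation) uses the Table I parameters; the piecewise-linear RED ramp and the exponential form p(x) = 1 − e^{−x/γ} for EXP are our reading of Fig. 1(a) and (b), so the exact functional forms should be treated as assumptions:

import math

def red_profile(x, qmin, qmax, pmax):
    # Piecewise-linear RED-style marking profile (our reading of Fig. 1(a)):
    # no marking below qmin, linear ramp up to pmax at qmax, and 1 above qmax.
    if x < qmin:
        return 0.0
    if x <= qmax:
        return pmax * (x - qmin) / (qmax - qmin)
    return 1.0

def exp_profile(x, pmax, qmax=10.0):
    # Exponential marking profile (our reading of Fig. 1(b)); gamma is chosen as in
    # Table I so that p(qmax) = pmax, i.e., gamma = -qmax / ln(1 - pmax).
    gamma = -qmax / math.log(1.0 - pmax)
    return 1.0 - math.exp(-x / gamma)

def scaled(profile, q, N, alpha=0.2, **kw):
    # Aggressive O(N^alpha) scaling: p^N(q) = p(q / N^alpha), so all thresholds
    # grow like N^alpha rather than like N.
    return profile(q / N**alpha, **kw)

# Example: marking probability seen at a queue length of 8 packets.
for N in (100, 400, 1600):
    p_red1 = scaled(red_profile, 8.0, N, qmin=2.0, qmax=10.0, pmax=0.2)
    p_exp = scaled(exp_profile, 8.0, N, pmax=0.2)
    print(N, round(p_red1, 3), round(p_exp, 3))

With this scaling, a fixed queue length is marked less and less aggressively as N grows, because all thresholds move up like N^α; what keeps the actual marking probability from vanishing is the queue fluctuation created by random intra-RTT arrivals, as discussed in Section VI-A.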
In any case, the buffer size in our simulations is set to B(N) = 12 × N^{α+ε} with ε = 0.1, as the margin O(N^ε) is required in Theorem 1.

B. Simple Dumbbell Topology

We first consider a simple dumbbell topology with a single bottleneck link shared by N long-lived TCP flows. We set the sum of two-way propagation delays for the N flows to be i.i.d. and uniformly distributed over [120, 280] ms (with mean 200 ms, as in [2]). Each packet size is fixed to 500 bytes and the link capacity is NC = N × 100 kbps. Thus, the target throughput for each flow is C = 100 kbps, or 25 packets/s.

Fig. 4 shows the four performance metrics as the number of long-lived flows (N) increases under our aggressive scales for the AQM schemes with α = 0.2 and ε = 0.1.^7 Note that for all the schemes considered, the queueing delay and delay jitter decrease sharply with the increase of N, and the utilization and the packet loss ratio show a clear trend with the number of flows N, as predicted by Theorems 1 and 2. In addition, RED1 and EXP yield almost identical performance in all metrics, while we observe some tradeoffs between RED1 (or EXP) and RED2: RED1 and EXP have a lower packet-loss ratio with smaller link utilization, while RED2 produces higher link utilization with a slightly larger packet-loss ratio.

Fig. 5 shows the link utilization and the loss ratio for RED1 and RED2 under α = 0.2 and α = 0.3. We see that for larger α, the system yields higher link utilization and a smaller loss ratio. This is reasonable because larger α implies that the corresponding buffer size and the margin for packet marking, both scaling with N^α, are larger. In general, for any given N, it is possible to improve the link utilization at the expense of a higher loss ratio (or vice versa) by using different AQM configurations. But, under the O(N^α) scale, all these differences will disappear when N becomes larger, since the link utilization approaches 1 and the loss ratio approaches zero, as shown in Section IV.

C. Independence of Window Sizes Among Different Flows

In Sections III and IV, we have assumed independence (or weak dependence) of the window-size random variables among different flows. Those assumptions, in conjunction with the random packet arrivals within each RTT, play a crucial role in establishing our main results in Section IV. In this section, we show that the window sizes are indeed almost independent under all AQM schemes with our aggressive scaling.

To investigate whether there exist any strong correlations among W_i^N, i = 1, 2, ..., N, we consider a function h(N) defined by

    h(N) = Var(W_1^N + W_2^N + ... + W_N^N) / Σ_{i=1}^{N} Var(W_i^N).

If the window sizes are independent for all N, we immediately have h(N) = 1 for all N. If they are all "perfectly" correlated (synchronized), we would have h(N) = N.

From the ns-2 simulation results in Section V-B, we measured the values of h(N) as N varies for all the AQM schemes considered, and plot them in Fig. 6(a) on a semi-log scale.^8 Fig. 6(b) shows the corresponding histogram of the aggregate window sizes from all N flows under RED1 and α = 0.2. As we see, the function h(N) mostly stays around 1 for all AQM schemes considered over the whole range of N. This implies that there is virtually no positive correlation among the window sizes of different flows, even under the aggressive marking (α = 0.2). From Fig. 6, the pdf of the total window size is also seen to be very close to a Gaussian distribution, with mean 1419 and standard deviation 48.52 packets. We also measured the function h(N) and the histogram under various configurations with N up to 1000 flows and different α ∈ (0, 0.5). In all these cases, we observe that the function h(N) is always around 1 and the histogram closely matches a Gaussian marginal distribution, suggesting that the window sizes of all flows are more or less independent at all times.

^7 In our numerical results, we were unable to simulate the case of N larger than 1300 flows, mainly because of limitations in ns-2.
^8 We sampled the window size of each flow i every 20 ms, except for some initial period. Since RTTs are between 120 and 280 ms in our simulation setup, we get 6 to 14 samples of the window size of each flow per RTT.
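The following short sketch (ours, not the measurement code used for Fig. 6) shows how h(N) can be estimated from sampled window sizes arranged with one row per sampling instant and one column per flow, mirroring the 20-ms sampling of footnote 8; the gamma-distributed samples are purely synthetic and only illustrate the two extreme cases h(N) ≈ 1 (independent windows) and h(N) ≈ N (fully synchronized windows):

import numpy as np

def h_metric(samples):
    # samples: array of shape (num_samples, N); samples[t, i] is the window size
    # of flow i at sampling instant t.  h(N) = Var(sum_i W_i) / sum_i Var(W_i).
    total = samples.sum(axis=1)
    return total.var(ddof=1) / samples.var(axis=0, ddof=1).sum()

rng = np.random.default_rng(1)
N, T = 500, 4000
indep = rng.gamma(shape=4.0, scale=1.25, size=(T, N))   # independent windows -> h ~ 1
common = rng.gamma(shape=4.0, scale=1.25, size=(T, 1))
synced = np.repeat(common, N, axis=1)                   # fully synchronized -> h ~ N
print(round(h_metric(indep), 2), round(h_metric(synced), 2))

An estimator of this form, applied to the ns-2 window traces, is all that is needed to reproduce a plot like Fig. 6(a).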
D. Multiple Links With Heterogeneous RTTs

Up until now, we have simulated various network scenarios with a single link. In this section, we present simulation results using multiple links in the network.

Fig. 7. Simulation topology: multiple bottleneck links with heterogeneous RTTs.

Fig. 7 depicts the network topology with multiple links under our consideration. In this scenario, we have three routers and two groups of flows, where group 1 (S1-1 to S1-N) traverses routers 1-3 and group 2 (S2-1 to S2-N) goes through routers 2 and 3. We use drop-tail for routers 1 and 3 with O(N) buffer sizes, and for router 2, we use the different AQM schemes as before with the scale of N^α. The capacities of routers 1 and 2 are N × 100 and 2N × 100 kbps, respectively. The two-way propagation delays of group-1 flows are uniformly distributed over [110, 190] ms with mean 150 ms, and over [60, 140] ms with mean 100 ms for group 2.

As before, Fig. 8 shows similar trends for all the performance metrics measured at router 2, which demonstrates that our results hold for more general network configurations. Regarding the link utilization and packet loss ratio, we see some zigzags when the number of flows is small (up to around 200). For such small values of N, the system is still far away from the domain of convergence of Theorems 1 and 2, and the performance is mainly governed by other factors. For example, the buffer size in Table I is 12 × N^{α+ε}, and if N is not large enough, the resulting queue size will be more affected by the preceding factor, 12, than by N^{α+ε}.
Fig. 8. Multi-hop topology: performance metrics with increasing number of flows (N) for fixed α = 0.2 and ε = 0.1.

VI. DISCUSSION

A. Importance of Random Arrivals for Aggressive Scale

In Section II we already noticed that different ways of scaling lead to different models of the system. In particular, random packet arrivals over sub-RTT time-scales play a crucial role in capturing the dynamics of systems with O(1) scale [16], [11], [5], while coarser time-scale models capturing only RTT-level dynamics suffice for systems with O(N) scale [10], [15], [13], [19]. In this section, we further illustrate that the random packet arrivals, in conjunction with the random packet marking, are indispensable components of our model for systems under the aggressive scale.

Suppose that we were to ignore the randomness in packet arrivals within each RTT and consider only the effect of random packet marking in our modeling. As observed in Section V-C, under the aggressive scale, the window-size random variables for the N flows are mostly independent and the aggregate window size is best matched by a Gaussian distribution with standard deviation of O(√N). While the buffer size and the region for marking are smaller than O(√N), one may think that, by appropriately 'shifting' the Gaussian marginal distribution, it might still be possible to explain the high link utilization and low packet loss via a CLT type of argument in the steady-state. In this case, the average marking probability of a packet can be calculated by integrating the Gaussian marginal distribution over the region where the marking takes place, i.e., an interval I of length O(N^α) (see Fig. 9). Thus, as N → ∞, the average marking probability will become

    P{marking} = ∫_I (1/(√(2π) · O(√N))) e^{−x²/(2·O(N))} dx ≤ O(N^α) / (√(2π) · O(√N)) → 0    (27)

since α < 1/2.

Fig. 9. If we consider only the Gaussian marginal distribution of the aggregate window sizes, the packet marking probability goes to zero, as the region for packet marking O(N^α) is too small compared to O(√N). However, under random packet arrivals, packets 'P' outside of the marking region may also be marked, as the random arrivals may create temporary queue buildup.

This means that the probability of packet marking goes to zero as N becomes larger and, therefore, the system will never be in the "steady-state," as (27) clearly contradicts (20). Thus, independence among flows alone is not sufficient to explain the good performance of the TCP/AQM system under the aggressive marking scale. This is because the CLT-based argument lacks the dynamics of random packet arrivals within each RTT. For instance, in Fig. 9, consider the packets "P" located on the left side of the distribution. While these packets are outside of the marking region and thus would never be marked under the CLT-based approach, they may be marked under random packet arrivals, as these arrivals may create temporary queue buildup. In fact, due to the random-arrival effects, marking can take place anywhere on the x-axis in Fig. 9. So, in some sense, the integration in (27) should have been taken over a much larger interval (rather than over the interval I of length O(N^α)) so that the marking probability does not vanish, as shown in (20).
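To put numbers to (27) (this calculation is ours, with illustrative parameter choices: mean aggregate window N·CT with CT = 5 packets, standard deviation √N packets, and a marking interval of width 10·N^α placed just below the mean), the sketch below integrates a Gaussian marginal over the marking region only, reproducing the CLT-only estimate that vanishes as N grows:

import math

def clt_only_marking_prob(N, alpha=0.2, ct=5.0, width_coeff=10.0):
    # Aggregate window ~ Gaussian(mean=N*ct, std=sqrt(N)) under the CLT-only view
    # (illustrative standard deviation; the paper only states that it is O(sqrt(N))).
    mean, std = N * ct, math.sqrt(N)
    # Marking region: an interval I of width O(N^alpha) ending at the mean.
    lo, hi = mean - width_coeff * N**alpha, mean
    Phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return Phi((hi - mean) / std) - Phi((lo - mean) / std)

for N in (100, 1000, 10000, 100000):
    print(N, round(clt_only_marking_prob(N), 4))

The computed probability decays roughly like N^{α−1/2}, in line with (27), which is exactly the contradiction with (20) that the random intra-RTT arrivals resolve.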
B. Related Work on Hybrid Approach

In the literature, there have been several attempts to integrate stochastic components into the fluid modeling of TCP/AQM systems under various scales [11], [5]. In [11], the authors considered the usual fluid recursion driven by a Poisson input and obtained limiting deterministic equations of the system dynamics for the window-size evolutions as the system size increases. They considered O(N) and O(1) scalings for packet marking and showed that the system behaves as if it were a rate-based one under the O(1) scale.

On the other hand, in [5], the authors considered all possible ways of scaling the buffer size, i.e., B(N) = O(N^γ) with γ = 0, γ = 1, and 0 < γ < 1. The approach is still based on the fluid model, but with different reasoning obtained from stochastic queueing theory for different γ. Specifically, for 0 < γ < 1, the authors applied different open-loop queueing theories with