Data Networks, Second Edition
DIMITRI BERTSEKAS, Massachusetts Institute of Technology
ROBERT GALLAGER, Massachusetts Institute of Technology
PRENTICE HALL, Englewood Cliffs, New Jersey 07632
6
Flow Control

[Chapter-opening figure: timing diagram of a window flow control scheme with window size = 3, showing time at the transmitter, time at the receiver, and the returning permits.]

6.1 INTRODUCTION

In most networks, there are circumstances in which the externally offered load is larger than can be handled even with optimal routing. Then, if no measures are taken to restrict the entrance of traffic into the network, queue sizes at bottleneck links will grow and packet delays will increase, possibly violating maximum delay specifications. Furthermore, as queue sizes grow indefinitely, the buffer space at some nodes may be exhausted. When this happens, some of the packets arriving at these nodes will have to be discarded and later retransmitted, thereby wasting communication resources. As a result, a phenomenon similar to a highway traffic jam may occur whereby, as the offered load increases, the actual network throughput decreases while packet delay becomes excessive. It is thus necessary at times to prevent some of the offered traffic from entering the network to avoid this type of congestion. This is one of the main functions of flow control.

Flow control is also sometimes necessary between two users for speed matching, that is, for ensuring that a fast transmitter does not overwhelm a slow receiver with more packets than the latter can handle. Some authors reserve the term "flow control" for this
type of speed matching and use the term "congestion control" for regulating the packet population within the subnetwork. We will not make this distinction in terminology; the type and objective of flow control being discussed will be clear from the context.

In this chapter we describe schemes currently used for flow control, explain their advantages and limitations, and discuss potential approaches for their improvement. In the remainder of this section we identify the principal means and objectives of flow control. In Sections 6.2 and 6.3 we describe the currently most popular flow control methods: window strategies and rate control schemes. In Section 6.4 we describe flow control in some representative networks. Section 6.5 is devoted to various algorithmic aspects of rate control schemes.

6.1.1 Means of Flow Control

Generally, a need for flow control arises whenever there is a constraint on the communication rate between two points due to limited capacity of the communication lines or the processing hardware. Thus, a flow control scheme may be required between two users at the transport layer, between a user and an entry point of the subnet (network layer), between two nodes of the subnet (network layer), or between two gateways of an interconnected network (internet layer). We will emphasize flow control issues within the subnet, since flow control in other contexts is in most respects similar.

The term "session" is used somewhat loosely in this chapter to mean any communication process to which flow control is applied. Thus a session could be a virtual circuit, a group of virtual circuits (such as all virtual circuits using the same path), or the entire packet flow originating at one node and destined for another node. Often, flow control is applied independently to individual sessions, but there is a strong interaction between its effects on different sessions because the sessions share the network's resources.
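As a concrete preview of the window strategies of Section 6.2 (and of the chapter-opening figure, where the window size is 3), the following sketch models permit-based flow control: the transmitter may hold at most W unacknowledged packets, and each returned permit frees one window slot. The class and method names are illustrative, not from the text.

```python
from collections import deque

class WindowedSender:
    """Sketch of permit-based (window) flow control with window size W."""

    def __init__(self, window_size=3):
        self.window_size = window_size
        self.in_flight = 0      # packets sent whose permits have not yet returned
        self.backlog = deque()  # packets held outside the subnet

    def offer(self, packet):
        """Transmit if a window slot is free; otherwise queue outside the network."""
        if self.in_flight < self.window_size:
            self.in_flight += 1
            return True         # packet enters the network
        self.backlog.append(packet)
        return False            # throttled by the window

    def permit_returned(self):
        """A permit came back from the receiver: reuse the freed slot at once."""
        self.in_flight -= 1
        if self.backlog:
            self.backlog.popleft()
            self.in_flight += 1

sender = WindowedSender(window_size=3)
admitted = [sender.offer(p) for p in range(5)]
# the first 3 packets enter; the last 2 wait outside the subnet
```

Slowing the return of permits thus automatically slows the source, which is the essence of window flow control.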
Note that different sessions may have radically different service requirements. For example, sessions transferring files may tolerate considerable delay but may require strict error control, while voice and video sessions may have strict minimum data rate and maximum end-to-end delay requirements, but may tolerate occasional packet losses.

There are many approaches to flow control, including the following:

1. Call blocking. Here a session is simply blocked from entering the network (its access request is denied). Such control is needed, for example, when the session requires a minimum guaranteed data rate that the network cannot provide due to limited uncommitted transmission capacity. A typical situation is that of the voice telephone network and, more generally, circuit switched networks, all of which use flow control of this type. However, a call blocking option is also necessary in integrated voice, video, and data networks, at least with respect to those sessions requiring guaranteed rate. In a more general view of call blocking one may admit a session only after a negotiation of some "service contract," for example, an agreement on some service parameters for the session's input traffic (maximum rate, minimum rate, maximum burst size, priority level, etc.).
2. Packet discarding. When a node with no available buffer space receives a packet, it has no alternative but to discard the packet. More generally, however, packets may be discarded while buffer space is still available if they belong to sessions that are using more than their fair share of some resource, are likely to cause congestion for higher-priority sessions, are likely to be discarded eventually along their path, and so on. (Note that if a packet has to be discarded anyway, it might as well be discarded as early as possible to avoid wasting additional network resources unnecessarily.) When some of a session's packets are discarded, the session may need to take some corrective action, depending on its service requirements. For sessions where all packets carry essential information (e.g., file transfer sessions), discarded packets must be retransmitted by the source after a suitable timeout; such sessions require an acknowledgment mechanism to keep track of the packets that failed to reach their destination. On the other hand, for sessions such as voice or video, where delayed information is useless, there is nothing to be done about discarded packets. In such cases, packets may be assigned different levels of priority, and the network may undertake the obligation never to discard the highest-priority packets; these are the packets that are sufficient to support the minimum acceptable quality of service for the session. The data rate of the highest-priority packets (the minimum guaranteed rate) may then be negotiated between the network and the source when the session is established. This rate may also be adjusted in real time, depending on the congestion level in the network.

3. Packet blocking. When a packet is discarded at some node, the network resources that were used to get the packet to that node are wasted.
It is thus preferable to restrict a session's packets from entering the network if after entering they are to be discarded. If the packets carry essential information, they must wait in a queue outside the network; otherwise, they are discarded at the source. In the latter case, however, the flow control scheme must honor any agreement on a minimum guaranteed rate that the session may have negotiated when it was first established.

4. Packet scheduling. In addition to discarding packets, a subnetwork node can exercise flow control by selectively expediting or delaying the transmission of the packets of various sessions. For example, a node may enforce a priority service discipline for transmitting packets of several different priorities on a given outgoing link. As another example, a node may use a (possibly weighted) round-robin scheduling strategy to ensure that various sessions can access transmission lines in a way that is consistent both with some fairness criterion and also with the minimum data rate required by the sessions. Finally, a node may receive information regarding congestion farther along the paths used by some sessions, in which case it may appropriately delay the transmission of the packets of those sessions.

In subsequent sections we discuss specific strategies for throttling sources and for restricting traffic access to the network.
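As an illustration of the fourth approach, a (weighted) round-robin discipline over the session queues at an outgoing link can be sketched as follows. This is a simplified model using assumed names, not an implementation from the text: each session may transmit up to its weight in packets per round.

```python
from collections import deque

def weighted_round_robin(queues, weights, rounds):
    """Return the order in which packets are sent on the outgoing link.

    queues:  dict mapping session id -> deque of queued packets
    weights: dict mapping session id -> packets allowed per round
    """
    sent = []
    for _ in range(rounds):
        for session, queue in queues.items():
            for _ in range(weights[session]):
                if queue:
                    sent.append(queue.popleft())
        if not any(queues.values()):   # all queues drained
            break
    return sent

queues = {"A": deque(["A1", "A2", "A3", "A4"]),
          "B": deque(["B1", "B2"])}
order = weighted_round_robin(queues, {"A": 2, "B": 1}, rounds=10)
# round 1 sends A1, A2, B1; round 2 sends A3, A4, B2
```

With weights chosen in proportion to each session's minimum required data rate, session B is guaranteed roughly one-third of the link even while A remains backlogged.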
6.1.2 Main Objectives of Flow Control

We look now at the main principles that guide the design of flow control algorithms. Our focus is on two objectives. First, strike a good compromise between throttling sessions (subject to minimum data rate requirements) and keeping average delay and buffer overflow at a reasonable level. Second, maintain fairness between sessions in providing the requisite quality of service.

Limiting delay and buffer overflow. We mentioned earlier that for important classes of sessions, such as voice and video, packets that are excessively delayed are useless. For such sessions, a limited delay is essential and should be one of the chief concerns of the flow control algorithm; for example, such sessions may be given high priority.

For other sessions, a small average delay per packet is desirable but it may not be crucial. For these sessions, network level flow control does not necessarily reduce delay; it simply shifts delay from the network layer to higher layers. That is, by restricting entrance to the subnet, flow control keeps packets waiting outside the subnet rather than in queues inside it. In this way, however, flow control avoids wasting subnet resources in packet retransmissions and helps prevent a disastrous traffic jam inside the subnet. Retransmissions can occur in two ways: first, the buildup of queues causes buffer overflow to occur and packets to be discarded; second, slow acknowledgments can cause the source node to retransmit some packets because it thinks mistakenly that they have been lost. Retransmissions waste network resources, effectively reduce network throughput, and cause congestion to spread. The following example (from [GeK80]) illustrates how buffer overflow and attendant retransmissions can cause congestion.

Example 6.1

Consider the five-node network shown in Fig. 6.1(a). There are two sessions, one from top to bottom with a Poisson input rate of 0.8,
and the other from left to right with a Poisson input rate f. Assume that the central node has a large but finite buffer pool that is shared on a first-come first-serve basis by the two sessions. If the buffer pool is full, an incoming packet is rejected and then retransmitted by the sending node. For small f, the buffer rarely fills up and the total throughput of the system is 0.8 + f. When f is close to unity (which is the capacity of the rightmost link), the buffer of the central node is almost always full, while the top and left nodes are busy most of the time retransmitting packets. Since the left node is transmitting 10 times faster than the top node, it has a 10-fold greater chance of capturing a buffer space at the central node, so the left-to-right throughput will be roughly 10 times larger than the top-to-bottom throughput. The left-to-right throughput will be roughly unity (the capacity of the rightmost link), so that the total throughput will be roughly 1.1. This is illustrated in more detail in Fig. 6.1(b), where it can be seen that the total throughput decreases toward 1.1 as the offered load f increases.

This example also illustrates how, with buffer overflow, some sessions can capture almost all the buffers and nearly prevent other sessions from using the network. To avoid this, it is sometimes helpful to implement a buffer management scheme. In such a scheme, packets are divided into different classes based, for example, on origin, destination,
or number of hops traveled so far; at each node, separate buffer space is reserved for different classes, while some buffer space is shared by all classes.

[Figure 6.1(a): a five-node network; the A-to-B session (input rate f) enters on a high-capacity link (C = 10) and the C-to-D session (input rate 0.8) on a low-capacity link (C = 1); the two sessions share the limited buffer space of the central node and exit on low-capacity links (C = 1). Figure 6.1(b): plot of total throughput versus the input rate f, rising toward the infinite-buffer value of 1.8 and then decreasing toward 1.1 once retransmissions due to the limited buffer space at the central node begin.]

Figure 6.1 Example demonstrating throughput degradation due to retransmissions caused by buffer overflow. (a) For f approaching unity, the central node buffer is almost always full, thereby causing retransmissions. Because the A-to-B session uses a line 10 times faster than the C-to-D session, it has a 10-fold greater chance of capturing a free buffer and getting a packet accepted at the central node. As a result, the throughput of the A-to-B session approaches unity, while the throughput of the C-to-D session approaches 0.1. (b) Total throughput as a function of the input rate of the A-to-B session.

Proper buffer management can also help avoid deadlocks due to buffer overflow. Such a deadlock can occur when two or more nodes are unable to move packets due to unavailable space at all potential receivers. The simplest example of this is two
nodes A and B routing packets directly to each other as shown in Fig. 6.2(a). If all the buffers of both A and B are full of packets destined for B and A, respectively, then the nodes are deadlocked into continuously retransmitting the same packets with no success, as there is no space to store the packets at the receiver. This problem can also occur in a more complex manner whereby more than two nodes arranged in a cycle are deadlocked because their buffers are full of packets destined for other nodes in the cycle [see Fig. 6.2(b)]. There are simple buffer management schemes that preclude this type of deadlock by organizing packets in priority classes and allocating extra buffers for packets of higher priority ([RaH76] and [Gop85]). A typical choice is to assign a level of priority to a packet equal to the number of links it has traversed in the network, as shown in Fig. 6.3. If packets are not allowed to loop, it is then possible to show that a deadlock of the type just described cannot occur.

We finally note that when offered load is large, limited delay and buffer overflow can be achieved only by lowering the input to the network. Thus, there is a natural trade-off between giving sessions free access to the network and keeping delay at a level low enough so that retransmissions or other inefficiencies do not degrade network performance. A somewhat oversimplified guideline is that, ideally, flow control should not be exercised at all when network delay is below some critical level, and, under heavy load conditions, should reject as much offered traffic as necessary to keep delay at the critical level. Unfortunately, this is easier said than done, since neither delay nor throughput can be represented meaningfully by single numbers in a flow control context.

Fairness. When offered traffic must be cut back, it is important to do so fairly.
Figure 6.2 Deadlock due to buffer overflow. In (a), all buffers of A and B are full of packets destined for B and A, respectively. As a result, no packet can be accepted at either node. In (b), all buffers of A, B, C, and D are full of packets destined for C, D, A, and B, respectively.

The notion of fairness is complicated, however, by the presence of different session priorities and service requirements. For example, some sessions need a minimum guaranteed rate and a strict upper bound on network delay. Thus, while it is appropriate to consider simple notions of fairness within a class of "similar" sessions, the notion of
fairness between classes is complex and involves the requirements of those classes. Even within a class, there are several notions of fairness that one may wish to adopt. The example of Fig. 6.4 illustrates some of the contradictions inherent in choosing a fairness criterion. There are n + 1 sessions, each offering 1 unit/sec of traffic, along a sequence of n links with capacity of 1 unit/sec. One session's traffic goes over all n links, while the rest of the traffic goes over only one link. A maximum throughput of n units/sec can be achieved by accepting all the traffic of the single-link sessions while shutting off the n-link session. However, if our objective is to give equal rate to all sessions, the resulting throughput is only (n + 1)/2 units/sec. Alternatively, if our objective is to give equal resources to all sessions, the single-link sessions should get a rate of n/(n + 1) units/sec, while the n-link session should get a rate of 1/(n + 1) units/sec.

Generally, a compromise is required between equal access to a network resource and throttling those sessions that are most responsible for using up that resource. Achieving the proper balance, however, is a design decision that may be hard to quantify; often such decisions must be reached by trial and error.

[Figure 6.3: two adjacent nodes, each with memory partitioned into buffer classes 0, 1, ..., N - 1; one node's buffer space holds packets that have traveled k links, the next node's holds packets that have traveled k + 1 links.]

Figure 6.3 Organization of node memory in buffer classes to avoid deadlock due to buffer overflow. A packet that has traveled k links is accepted at a node only if there is an available buffer of class k or lower, where k ranges from 0 to N - 1 (where N is the number of nodes). Assuming that packets that travel more than N - 1 links are discarded as having traveled in a loop, no deadlock occurs. The proof consists of showing by induction (starting with k = N - 1) that at each node the buffers of class k cannot fill up permanently.
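The acceptance rule of the buffer-class scheme of Fig. 6.3 above can be sketched as a small function. The greedy lowest-class-first allocation below is an added assumption for illustration; the text requires only that a packet that has traveled k links find a free buffer of class k or lower.

```python
def accept(hops_traveled, free_buffers):
    """Fig. 6.3 rule (sketch): grant a buffer of class <= hops_traveled, if any.

    free_buffers: dict mapping buffer class -> number of free buffers.
    Returns the class of the buffer granted, or None if the packet must wait.
    """
    for cls in range(hops_traveled + 1):
        if free_buffers.get(cls, 0) > 0:
            free_buffers[cls] -= 1   # occupy one buffer of this class
            return cls
    return None

free = {0: 1, 1: 1, 2: 1}
granted = accept(0, free)   # a fresh packet (0 hops) gets the class-0 buffer
blocked = accept(0, free)   # the next fresh packet must wait: class 0 is full
advanced = accept(2, free)  # a 2-hop packet still advances, using class 1
```

Because a packet that has traveled k links can never take a buffer reserved for class k + 1 or higher, packets deep in the network always find room to advance; this is the intuition behind the induction argument in the caption of Fig. 6.3.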
n-link user 1 unit/sec n links each with capacity 1 unit/sec Single link user Single link user Single link user Single link user 1 unit/sec 1 unit/sec 1 unit/sec 1 unit/sec Figure 6.4 Example showing that maximizing total throughput may be incompatible with giving equal throughput to all sessions.A maximum throughput of n units/sec can be achieved if the n-link session is shut off completely.Giving equal rates of 1/2 unit/sec to all sessions achieves a throughput of only (n+1)/2 units/sec
Sec. 6.2 Window Flow Control 499 Bu ffer space for packets that have travelled k + , links (~":~-~ f-C_I_as_.s_.N_._-_'--t '\ -- Class k + , Class k L-~-----I-ff-------i [ \ Class' ) \ Class' J ~Clas----ls a ~Clas----ls a L~~L~~-t/7-----""",,,~~ / ----- Buffer space for packets that have travelled k links Figure 6.3 Organization of node memory in buffer classes to avoid dcadlock due to buffer overflow. A packet that has traveled k links is accepted at a node only if thcre is an available buffer of class k or lower. wherc k ranges from 0 to IV - I (where IV is the numbcr of nodes). Assuming that packets that travel more than IV - I links are discarded as having traveled in a loop. no deadlock occurs. Thc proof consists of showing by induction (starting with k = IV - I) that at each node the buffers of class k cannot fill up permanently. fairness between classes is complex and involves the requirements of those classes. Even within a class, there are several notions of fairness that one may wish to adopt. The example of Fig. 6.4 illustrates some of the contradictions inherent in choosing a fairness criterion. There are n + 1 sessions each offering I unit/sec of traffic along a sequence of n links with capacity of 1 unit/sec. One session's traffic goes over all n links, while the rest of the traffic goes over only one link. A maximum throughput of n units/sec can be achieved by accepting all the traffic of the single-link sessions while shutting off the n-link session. However, if our objective is to give equal rate to all sessions, the resulting throughput is only (n + 1)12 units/sec. Alternatively, if our objective is to give equal resources to all sessions, the single link sessions should get a rate of nl(n + 1) units/sec, while the fl.-link session should get a rate of I/(n + 1) units/sec. Generally, a compromise is required between equal access to a network resource and throttling those sessions that are most responsible for using up that resource. 
Achieving the proper balance, however, is a design decision that may be hard to quantify; often such decisions must be reached by trial and error. n-link user , unit/sec \ n links each with capacity' unit/sec p: (~I • 1C: Single link user ~ingle link user Single link user , unit/sec , unit/sec , unit/sec Figure 6.4 Example showing that maximizing total throughput may be incompatible with giving equal throughput to all sessions. A maximum throughput of n units/sec can be achieved if the It-link scssion is shut off completely. Giving equal rates of 1/2 unit/sec to all sessions achieves a throughput of only (It T 1)/2 units/scc-
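The arithmetic of the Fig. 6.4 example can be checked with a short sketch. The function below (its name and structure are illustrative, not from the text) computes the total throughput of the n-link/single-link topology under the three allocation policies just discussed:

```python
def fairness_example(n):
    """Total throughput (units/sec) of the Fig. 6.4 topology:
    one session crossing all n unit-capacity links, plus n
    single-link sessions, one per link."""
    # Maximum throughput: shut off the n-link session and give
    # each single-link session its full link capacity of 1.
    max_throughput = n * 1.0
    # Equal rate: every one of the n + 1 sessions gets 1/2, since
    # each unit-capacity link is shared by exactly two sessions.
    equal_rate_throughput = (n + 1) * 0.5
    # Equal resources: the n-link session consumes n times the
    # link resources per unit of rate, so it gets 1/(n + 1) while
    # each single-link session gets n/(n + 1); every link then
    # carries exactly 1 unit/sec.
    equal_resource_throughput = n * n / (n + 1) + 1 / (n + 1)
    return max_throughput, equal_rate_throughput, equal_resource_throughput
```

For n = 3, the three policies yield throughputs of 3, 2, and 2.5 units/sec, respectively, illustrating the gap between maximizing throughput and the two fairness criteria.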
6.2 WINDOW FLOW CONTROL

In this section we describe the most frequently used class of flow control methods. In Sections 6.2.1 to 6.2.3, the main emphasis is on flow control within the communication subnet. Flow control outside the subnet, at the transport layer, is discussed briefly in Section 6.2.4.

A session between a transmitter A and a receiver B is said to be window flow controlled if there is an upper bound on the number of data units that have been transmitted by A and are not yet known by A to have been received by B (see Fig. 6.5). The upper bound (a positive integer) is called the window size or, simply, the window. The transmitter and receiver can be, for example, two nodes of the communication subnet, a user's machine and the entry node of the communication subnet, or the users' machines at the opposite ends of a session. Finally, the data units in a window can be messages, packets, or bytes, for example.

The receiver B notifies the transmitter A that it has disposed of a data unit by sending a special message to A, which is called a permit (other names in the literature are acknowledgment, allocate message, etc.). Upon receiving a permit, A is free to send one more data unit to B. Thus, a permit may be viewed as a form of passport that a data unit must obtain before entering the logical communication channel between A and B. The number of permits in use should not exceed the window size.

Permits are either contained in special control packets, or are piggybacked on regular data packets. They can be implemented in a number of ways; see the practical examples of Section 6.4 and the following discussion. Note also that a window flow control scheme for a given session may be combined with an error control scheme for the session, where the permits also play the role of acknowledgments; see Section 2.8.2 and the descriptions of the ARPANET and the Codex network in Section 6.4.

The general idea in the window strategy is that the input rate of the transmitter is reduced when permits return slowly. Therefore, if there is congestion along the communication path of the session, the attendant large delays of the permits cause a natural slowdown of the transmitter's data rate. However, the window strategy has an additional dimension, whereby the receiver may intentionally delay permits to restrict

Figure 6.5 Window flow control between a transmitter A and a receiver B consists of an upper bound W_AB on the number of data units and permits in transit inside the network.
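The permit mechanism just described amounts to simple bookkeeping at the transmitter. The following minimal sketch (class and method names are my own, not from the text) tracks outstanding data units against the window size W:

```python
class WindowFlowControlledSender:
    """Sketch of the transmitter side of window flow control:
    at most `window` data units may be outstanding, i.e., sent
    but not yet covered by a returned permit."""

    def __init__(self, window):
        self.window = window       # the window size W
        self.outstanding = 0       # data units sent, permit pending

    def can_send(self):
        # Sending is allowed only while the window is not exhausted.
        return self.outstanding < self.window

    def send(self):
        if not self.can_send():
            raise RuntimeError("window closed: wait for a permit")
        self.outstanding += 1

    def receive_permit(self):
        # The receiver has disposed of one data unit;
        # the transmitter may now send one more.
        if self.outstanding > 0:
            self.outstanding -= 1
```

With a window of 3, for example, the sender may transmit three data units back to back, after which it must block until a permit returns; slow permit returns thus throttle the input rate exactly as described above.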
the transmission rate of the session. For example, the receiver may do so to avoid buffer overflow.

In the subsequent discussion, we consider two strategies, end-to-end and node-by-node windowing. The first strategy refers to flow control between the entry and exit subnet nodes of a session, while the second strategy refers to flow control between every pair of successive nodes along a virtual circuit's path.

6.2.1 End-to-End Windows

In the most common version of end-to-end window flow control, the window size is αW, where α and W are some positive numbers. Each time a new batch of α data units is received at the destination node, a permit is sent back to the source allocating a new batch of α data units. In a variation of this scheme, the destination node will send a new α-data unit permit upon reception of just the first of an α-data unit batch. (See the SNA pacing scheme description in Section 6.3.) To simplify the following exposition, we henceforth assume that α = 1, but our conclusions are valid regardless of the value of α. Also, for concreteness, we talk in terms of packets, but the window maintained may consist of other data units such as bytes.

Usually, a numbering scheme for packets and permits is used so that permits can be associated with packets previously transmitted and loss of permits can be detected. One possibility is to use a sliding window protocol similar to those used for data link control, whereby a packet contains a sequence number and a request number. The latter number can serve as one or more permits for flow control purposes (see also the discussion in Section 2.8.2). For example, suppose that node A receives a packet from node B with request number k. Then A knows that B has disposed of all packets sent by A and numbered less than k, and therefore A is free to send those packets up to number k + W - 1 that it has not sent yet, where W is the window size. In such a scheme, both the sequence number and the request number are represented modulo m, where m ≥ W + 1. One can show that if packet ordering is preserved between transmitter and receiver, this representation of numbers is adequate; the proof is similar to the corresponding proof for the go back n ARQ system. In some networks the end-to-end window scheme is combined with an end-to-end retransmission protocol, and a packet is retransmitted if, following a suitable timeout, the corresponding permit has not returned to the source.

To simplify the subsequent presentation, the particular manner in which permits are implemented will be ignored. It will be assumed that the source node simply counts the number x of packets it has already transmitted but for which it has not yet received back a permit, and transmits new packets only as long as x < W.

Figure 6.6 shows the flow of packets for the case where the round-trip delay d, including round-trip propagation delay, packet transmission time, and permit delay, is smaller than the time required to transmit the full window of W packets, that is,

d ≤ WX

where X is the transmission time of a single packet. Then the source is capable of transmitting at the full speed of 1/X packets/sec, and flow control is not active. (To
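The condition d ≤ WX determines whether the window constrains the source. A small sketch of this reasoning (the function name and the W/d rate for the congested case are my reading of the argument around Fig. 6.6, not a formula quoted from this passage):

```python
def max_source_rate(W, X, d):
    """Maximum sustainable packet rate of a window-flow-controlled
    source.
    W: window size in packets
    X: transmission time of one packet (sec)
    d: round-trip delay, including propagation, packet
       transmission, and permit return (sec)"""
    if d <= W * X:
        # Permits return before the window is exhausted:
        # flow control is inactive and the source transmits
        # at full speed.
        return 1.0 / X
    # Otherwise the source can send at most W packets per
    # round-trip time d.
    return W / d
```

For instance, with W = 3 and X = 1 sec, a round-trip delay of 2 sec leaves the source at its full rate of 1 packet/sec, while a delay of 6 sec throttles it to 3/6 = 0.5 packet/sec.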