5 5 units 5 units Origin Origin All tinks have a capacity of 10 units Destination Routing in Data Networks 5.1 INTRODUCTION We have frequently referred to the routing algorithm as the network layer protocol that guides packets through the communication subnet to their correct destination.The times at which routing decisions are made depend on whether the network uses datagrams or virtual circuits.In a datagram network,two successive packets of the same user pair may travel along different routes,and a routing decision is necessary for each individual packet (see Fig.5.1).In a virtual circuit network,a routing decision is made when each virtual circuit is set up.The routing algorithm is used to choose the communication path for the virtual circuit.All packets of the virtual circuit subsequently use this path up to the time that the virtual circuit is either terminated or rerouted for some reason(see Fig.5.2). Routing in a network typically involves a rather complex collection of algorithms that work more or less independently and yet support each other by exchanging services or information.The complexity is due to a number of reasons.First,routing requires coordination between all the nodes of the subnet rather than just a pair of modules as, for example,in data link and transport layer protocols.Second,the routing system must 363
5 3 5 units 5 units All links have a capacity of 10 units Routing in Data Networks 5.1 INTRODUCTION We have frequently referred to the routing algorithm as the network layer protocol that guides packets through the communication subnet to their correct destination. The times at which routing decisions are made depend on whether the network uses datagrams or virtual circuits. In a datagram network, two successive packets of the same user pair may travel along different routes, and a routing decision is necessary for each individual packet (see Fig. 5.1). In a virtual circuit network, a routing decision is made when each virtual circuit is set up. The routing algorithm is used to choose the communication path for the virtual circuit. All packets of the virtual circuit subsequently use this path up to the time that the virtual circuit is either terminated or rerouted for some reason (see Fig. 5.2). Routing in a network typically involves a rather complex collection of algorithms that work more or less independently and yet support each other by exchanging services or information. The complexity is due to a number of reasons. First, routing requires coordination between all the nodes of the subnet rather than just a pair of modules as, for example, in data link and transport layer protocols. Second, the routing system must 363
364 Routing in Data Networks Chap.5 cope with link and node failures,requiring redirection of traffic and an update of the databases maintained by the system.Third,to achieve high performance,the routing algorithm may need to modify its routes when some areas within the network become congested. The main emphasis will be on two aspects of the routing problem.The first has to do with selecting routes to achieve high performance.In Sections 5.2.3 to 5.2.5, we discuss algorithms based on shortest paths that are commonly used in practice.In Sections 5.5,5.6,and 5.7 we describe sophisticated routing algorithms that try to achieve near optimal performance.The second aspect of routing that we will emphasize is broadcasting routing-related information (including link and node failures and repairs) to all network nodes.This issue and the subtleties associated with it are examined in Section 5.3. The introductory sections set the stage for the main development.The remainder of this section explains in nonmathematical terms the main objectives in the routing problem and provides an overview of current routing practice.Sections 5.2.1 to 5.2.3 present some of the main notions and results of graph theory,principally in connection with shortest paths and minimum weight spanning trees.Section 5.4 uses the material on graph theory to describe methods for topological design of networks.Finally,Section 5.8 reviews the routing system of the Codex network and its relation to the optimal routing algorithm of Section 5.7. 回+@回回回回 Packet from user n 回②+回+ 2 3 3 Figure 5.1 Routing in a datagram network.Two packets of the same user pair can travel along different routes.A routing decision is required for each individual packet
364 Routing in Data Networks Chap. 5 cope with link and node failures, requiring redirection of traffic and an update of the databases maintained by the system. Third, to achieve high performance, the routing algorithm may need to modify its routes when some areas within the network become congested. The main emphasis will be on two aspects of the routing problem. The first has to do with selecting routes to achieve high performance. In Sections 5.2.3 to 5.2.5, we discuss algorithms based on shortest paths that are commonly used in practice. In Sections 5.5, 5.6, and 5.7 we describe sophisticated routing algorithms that try to achieve near optimal performance. The second aspect of routing that we will emphasize is broadcasting routing-related information (including link and node failures and repairs) to all network nodes. This issue and the subtleties associated with it are examined in Section 5.3. The introductory sections set the stage for the main development. The remainder of this section explains in nonmathematical terms the main objectives in the routing problem and provides an overview of current routing practice. Sections 5.2.1 to 5.2.3 present some of the main notions and results of graph theory, principally in connection with shortest paths and minimum weight spanning trees. Section 5.4 uses the material on graph theory to describe methods for topological design of networks. Finally, Section 5.8 reviews the routing system of the Codex network and its relation to the optimal routing algorithm of Section 5.7. B-- Packet from user n Figure 5.1 Routing in a datagram network. Two packets of the same user pair can travel along different routes. A routing decision is required for each individual packet
Sec.5.1 Introduction 365 Packet from user n 2 ④→包+回+ 回 Figure 5.2 Routing in a virtual circuit network.All packets of each virtual circuit use the same path.A routing decision is required only when a virtual circuit is set up. 5.1.1 Main Issues in Routing The two main functions performed by a routing algorithm are the selection of routes for various origin-destination pairs and the delivery of messages to their correct destination once the routes are selected.The second function is conceptually straightforward using a variety of protocols and data structures (known as routing tables),some of which will be described in the context of practical networks in Section 5.1.2.The focus will be on the first function (selection of routes)and how it affects network performance. There are two main performance measures that are substantially affected by the routing algorithm-throughput(quantity of service)and average packet delay (quality of service).Routing interacts with flow control in determining these performance measures Delay Offered load Throughput Flow control Routing Delay Rejected load Figure 5.3 Interaction of routing and flow control.As good routing keeps delay low. flow control allows more traffic into the network
Sec. 5.1 Introduction 365 : Packet from user n Figure 5.2 Routing in a virtual circuit network. All packets of each virtual circuit use the same path. A routing decision is required only when a virtual circuit is set up. 5.1.1 Main Issues in Routing The two main functions perfonned by a routing algorithm are the selection of routes for various origin-destination pairs and the delivery of messages to their correct destination once the routes are selected. The second function is conceptually straightforward using a variety of protocols and data structures (known as routing tables), some of which will be described in the context of practical networks in Section 5.1.2. The focus will be on the first function (selection of routes) and how it affects network perfonnance. There are two main perfonnance measures that are substantially affected by the routing algorithm-throughput (quantity of service) and average packet delay (quality of service). Routing interacts with flow control in detennining these perfonnance measures +e ay Offered load Flow control Throughput Routing IRejected load r-----------------~ I D I : I I I _~ Delay Figure 5.3 Interaction of routing and flow control. As good routing keeps delay low, flow control allows more traffic into the network
366 Routing in Data Networks Chap.5 by means of a feedback mechanism shown in Fig.5.3.When the traffic load offered by the external sites to the subnet is relatively low,it will be fully accepted into the network,that is, throughput offered load When the offered load is excessive,a portion will be rejected by the flow control algo- rithm and throughput offered load-rejected load The traffic accepted into the network will experience an average delay per packet that will depend on the routes chosen by the routing algorithm.However,throughput will also be greatly affected(if only indirectly)by the routing algorithm because typical flow control schemes operate on the basis of striking a balance between throughput and delay (i.e.,they start rejecting offered load when delay starts getting excessive).Therefore,as the routing algorithm is more successful in keeping delay low,the fow control algorithm allows more traffic into the network.While the precise balance between delay and throughput will be determined by flow control,the effect of good routing under high offered load conditions is to realize a more favorable delay-throughput curve along which flow control operates, as shown in Fig.5.4. The following examples illustrate the discussion above: Example 5.1 In the network of Fig.5.5,all links have capacity 10 units.(The units by which link capacity and traffic load is measured is immaterial and is left unspecified.)There is a single destination (node 6)and two origins (nodes 1 and 2).The offered load from each of nodes I and 2 to node 6 is 5 units.Here,the offered load is light and can easily be accommodated with small delay by routing along the leftmost and rightmost paths,I -3-6 and 2-5→6,respectively.If instead.however,.the routes I一46and2→4→6are used,the flow on link(4,6)will equal capacity,resulting in very large delays. Example 5.2 For the same network,assume that the offered loads at nodes I and 2 are 5 and 15 units, respectively (see Fig.5.6).If routing from node 2 to the destination is done along a single path,then at least 5 units of offered load will have to be rejected since all path capacities Poor routing Good routing Figure 5.4 Delay-throughput operating Throughput curves for good and bad routing
366 Routing in Data Networks Chap. 5 by means of a feedback mechanism shown in Fig. 5.3. When the traffic load offered by the external sites to the subnet is relatively low, it will be fully accepted into the network, that is, throughput = offered load When the offered load is excessive, a portion will be rejected by the flow control algorithm and throughput = offered load - rejected load The traffic accepted into the network will experience an average delay per packet that will depend on the routes chosen by the routing algorithm. However, throughput will also be greatly affected (if only indirectly) by the routing algorithm because typical flow control schemes operate on the basis of striking a balance between throughput and delay (i.e., they start rejecting offered load when delay starts getting excessive). Therefore, as the routing algorithm is more successful in keeping delay low, the flow control algorithm allows more traffic into the network. While the precise balance between delay and throughput will be determined by flow control, the effect of good routing under high offered load conditions is to realize a more favorable delay-throughput curve along which flow control operates, as shown in Fig. 5.4. The following examples illustrate the discussion above: Example 5.1 In the network of Fig. 5.5, all links have capacity 10 units. (The units by which link capacity and traffic load is measured is immaterial and is left unspecified.) There is a single destination (node 6) and two origins (nodes I and 2). The offered load from each of nodes I and 2 to node 6 is 5 units. Here, the offered load is light and can easily be accommodated with small delay by routing along the leftmost and rightmost paths, I -+ 3 -+ 6 and 2 -+ 5 -+ 6, respectively. If instead, however, the routes I -+ 4 -+ 6 and 2 -+ 4 -+ 6 are used, the flow on link (4,6) will equal capacity, resulting in very large delays. Example 5.2 For the same network, assume that the offered loads at nodes I and 2 are 5 and 15 units, respectively (see Fig. 5.6). If routing from node 2 to the destination is done along a single path, then at least 5 units of offered load will have to be rejected since all path capacities > co Q) Cl Poor routing Throughput Figure 5.4 Delay-throughput operating curves for good and bad routing
Sec.5.1 Introduction 367 5 units 5 units Origin Origin All links have a capacity of 10 units Figure 5.5 Network for Example 5.1.All links have a capacity of 10 units.If all traffic is routed through the middle link (4.6),congestion occurs.If.instead.paths Destination (1-3→6)and(2一5一6)are used. the average delay is small. equal 10.Thus,the total throughput can be no more than 15 units.On the other hand. suppose that the traffic originating at node 2 is evenly split between the two paths 2-4-6 and 2-5-6,while the traffic originating at node I is routed along I-3-6.Then. the traffic arrival rate on each link will not exceed 75%of capacity.the delay per packet will be reasonably small,and (given a good flow control scheme)no portion of the offered load will be rejected.Arguing similarly,it is seen that when the offered loads at nodes I and 2 are both large,the maximum total throughput that this network can accommodate is between 10 and 30 units,depending on the routing scheme.This example also illustrates that to achieve high throughput,the traffic of some origin-destination pairs may have to be divided among more than one route. In conclusion,the effect of good routing is to increase throughput for the same value of average delay per packet under high offered load conditions and to decrease average delay per packet under low and moderate offered load conditions.Furthermore,it is evident that the routing algorithm should be operated so as to keep average delay per packet as low as possible for any given level of offered load.While this is easier said than done,it provides a clear-cut objective that can be expressed mathematically and dealt with analytically. 5.1.2 Wide Area Network Routing:An Overview The purpose of this section is to survey current routing practice in wide area networks. to introduce classifications of different schemes,and to provide a context for the analysis presented later. There are a number of ways to classify routing algorithms.One way is to divide them into centralized and distribured.In centralized algorithms,all route choices are made at a central node,while in distributed algorithms,the computation of routes is
3 Sec. 5.1 5 units Introduction 5 units 5 All links have a capacity of 10 units 367 Figure 5.5 Network for Example 5.1. All links have a capacity of 10 units. If all traffic is routed through the middle link (4,6), congestion occurs. If, instead, paths (I --> 3 --> 6) and (2 --> 5 --> 6) are used, the average delay is small. equal 10. Thus, the total throughput can be no more than 15 units. On the other hand, suppose that the traffic originating at node 2 is evenly split between the two paths 2 4 6 and 2 --> 5 --> 6, while the traffic originating at node I is routed along 1 --> 3 --> 6. Then, the traffic arrival rate on each link will not exceed 75% of capacity, the delay per packet will be reasonably small, and (given a good flow control scheme) no portion of the offered load will be rejected. Arguing similarly, it is seen that when the offered loads at nodes I and 2 are both large, the maximum total throughput that this network can accommodate is between 10 and 30 units, depending on the routing scheme. This example also illustrates that to achieve high throughput, the traffic of some origin-destination pairs may have to be divided among more than one route. In conclusion, the effect ofgood routing is to increase throughput/or the same \'alue of average delay per packet under high offered load conditions and to decrease average delay per packet under low and moderate offered load conditions. Furthermore, it is evident that the routing algorithm should be operated so as to keep average delay per packet as low as possible for any given level of offered load. While this is easier said than done, it provides a clear-cut objective that can be expressed mathematically and dealt with analytically. 5.1.2 Wide Area Network Routing: An Overview The purpose of this section is to survey current routing practice in wide area networks, to introduce classifications of different schemes, and to provide a context for the analysis presented later. There are a number of ways to classify routing algorithms. One way is to divide them into centralized and distributed. In centralized algorithms, all route choices are made at a central node, while in distributed algorithms, the computation of routes is
368 Routing in Data Networks Chap.5 5 units 15 units Origin Origin 5 All links have a capacity of TO units Figure 5.6 Network for Example 5.2.All links have a capacity of 10 units.The input traffic can be accommodated with multiple-path routing.but at least 5 units of Destination traffic must be rejected if a single-path routing is used. shared among the network nodes with information exchanged between them as necessary. Note that this classification relates mostly to the implementation of an algorithm,and that a centralized and a distributed routing algorithm may be equivalent at some level of mathematical abstraction. Another classification of routing algorithms relates to whether they change routes in response to the traffic input patterns.In static routing algorithms,the path used by the sessions of each origin-destination pair is fixed regardless of traffic conditions.It can only change in response to a link or node failure.This type of algorithm cannot achieve a high throughput under a broad variety of traffic input patterns.It is recommended for either very simple networks or for networks where efficiency is not essential.Most major packet networks use some form of adaptive routing where the paths used to route new traffic between origins and destinations change occasionally in response to congestion. The idea here is that congestion can build up in some part of the network due to changes in the statistics of the input traffic load.Then,the routing algorithm should try to change its routes and guide traffic around the point of congestion. There are many routing algorithms in use with different levels of sophistication and efficiency.This variety is partly due to historical reasons and partly due to the diversity of needs in different networks.In this section we provide a nonmathematical description of some routing techniques that are commonly used in practice.We illustrate these techniques in terms of the routing algorithms of three wide area networks (ARPANET, TYMNET,and SNA).The routing algorithm of another wide area network,the Codex network,will be described in Section 5.8,because this algorithm is better understood after studying optimal routing in Sections 5.5 to 5.7. Flooding and broadcasting During operation of a data network,it is often necessary to broadcast some information,that is,to send this information from an origin
368 5 units 15 units Routing in Data Networks Chap. 5 3 5 All Iinks have a capacity of TO units Figure 5.6 Network for Example 5.2. All links have a capacity of 10 units. The input traffic can be accommodated with multiple-path routing. but at least 5 units of traffic must be rejected if a single-path routing is used. shared among the network nodes with information exchanged between them as necessary. Note that this classification relates mostly to the implementation of an algorithm, and that a centralized and a distributed routing algorithm may be equivalent at some level of mathematical abstraction. Another classification of routing algorithms relates to whether they change routes in response to the traffic input patterns. In static routing algorithms, the path used by the sessions of each origin-destination pair is fixed regardless of traffic conditions. It can only change in response to a link or node failure. This type of algorithm cannot achieve a high throughput under a broad variety of traffic input patterns. It is recommended for either very simple networks or for networks where efficiency is not essential. Most major packet networks use some form of adaptive routing where the paths used to route new traffic between origins and destinations change occasionally in response to congestion. The idea here is that congestion can build up in some part of the network due to changes in the statistics of the input traffic load. Then, the routing algorithm should try to change its routes and guide traffic around the point of congestion. There are many routing algorithms in use with different levels of sophistication and efficiency. This variety is partly due to historical reasons and partly due to the diversity of needs in different networks. In this section we provide a nonmathematical description of some routing techniques that are commonly used in practice. We illustrate these techniques in terms of the routing algorithms of three wide area networks (ARPANET, TYMNET, and SNA). The routing algorithm of another wide area network, the Codex network, will be described in Section 5.8, because this algorithm is better understood after studying optimal routing in Sections 5.5 to 5.7. Flooding and broadcasting During operation of a data network, it is often necessary to broadcast some information, that is, to send this information from an origin
Sec.5.1 Introduction 369 node to all other nodes.An example is when there are changes in the network topology due to link failures and repairs,and these changes must be transmitted to the entire network.Broadcasting could also be used as a primitive form of routing packets from a single transmitter to a single receiver or,more generally,to a subset of receivers;this use is generally rather inefficient,but may be sensible because it is simple or because the receivers'locations within the network are unknown. A widely used broadcasting method,known as flooding,operates as follows.The origin node sends its information in the form of a packet to its neighbors (the nodes to which it is directly connected with a link).The neighbors relay it to their neighbors,and so on,until the packet reaches all nodes in the network.Two additional rules are also observed,which limit the number of packet transmissions.First,a node will not relay the packet back to the node from which the packet was obtained.Second,a node will transmit the packet to its neighbors at most once;this can be ensured by including on the packet the ID number of the origin node and a sequence number,which is incremented with each new packet issued by the origin node.By storing the highest sequence number received for each origin node,and by not relaying packets with sequence numbers that are less than or equal to the one stored,a node can avoid transmitting the same packet more than once on each of its incident links.Note that with these rules,links need not preserve the order of packet transmissions;the sequence numbers can be used to recognize the correct order.Figure 5.7(a)gives an example of flooding and illustrates that the total number of packet transmissions per packet broadcast lies between L and 2L,where L is the number of bidirectional links of the network.We note also that one can implement a flooding-like algorithm without using sequence numbers.This possibility is described in Section 5.3,where flooding and the topology broadcast problem are discussed in more detail. Another broadcasting method,based on the use of a spanning tree,is illustrated in Fig.5.7(b).A spanning tree is a connected subgraph of the network that includes all nodes and has no cycles (see a more detailed discussion in the next section).Broadcasting on a spanning tree is more communication-efficient than flooding.It requires a total of only N-I packet transmissions per packet broadcast,where N is the number of nodes. 0 0 Figure 5.7 Packet broadcasting from node A to all other nodes by using flooding (as in (a)]or a spanning tree [as in (b)].Arrows indicate packet transmissions at the times shown.Each packet transmission time is assumed to be one unit.Flooding requires at least as many packet transmissions as the spanning tree method and usually many more In this example,the time required for the broadcast packet to reach all nodes is the same for the two methods.In general,however.depending on the choice of spanning tree.the time required with flooding may be less than with the spanning tree method
Sec. 5.1 Introduction 369 node to all other nodes. An example is when there are changes in the network topology due to link failures and repairs, and these changes must be transmitted to the entire network. Broadcasting could also be used as a primitive form of routing packets from a single transmitter to a single receiver or, more generally, to a subset of receivers; this use is generally rather inefficient, but may be sensible because it is simple or because the receivers' locations within the network are unknown. A widely used broadcasting method, known as flooding, operates as follows. The origin node sends its information in the form of a packet to its neighbors (the nodes to which it is directly connected with a link). The neighbors relay it to their neighbors, and so on, until the packet reaches all nodes in the network. Two additional rules are also observed, which limit the number of packet transmissions. First, a node will not relay the packet back to the node from which the packet was obtained. Second, a node will transmit the packet to its neighbors at most once; this can be ensured by including on the packet the ID number of the origin node and a sequence number, which is incremented with each new packet issued by the origin node. By storing the highest sequence number received for each origin node, and by not relaying packets with sequence numbers that are less than or equal to the one stored, a node can avoid transmitting the same packet more than once on each of its incident links. Note that with these rules, links need not preserve the order of packet transmissions; the sequence numbers can be used to recognize the correct order. Figure 5.7(a) gives an example of flooding and illustrates that the total number of packet transmissions per packet broadcast lies between Land 2L, where L is the number of bidirectional links of the network. We note also that one can implement a flooding-like algorithm without using sequence numbers. This possibility is described in Section 5.3, where flooding and the topology broadcast problem are discussed in more detail. Another broadcasting method, based on the use of a spanning tree, is illustrated in Fig. 5.7(b). A spanning tree is a connected subgraph of the network that includes all nodes and has no cycles (see a more detailed discussion in the next section). Broadcasting on a spanning tree is more communication-efficient than flooding. It requires a total of only N - 1 packet transmissions per packet broadcast, where N is the number of nodes. A ---- 1 A ---- 1 Figure 5.7 Packet broadcasting from node A to all other nodes by using flooding [as in (a)] or a spanning tree [as in (b)]. Arrows indicate packet transmissions at the times shown. Each packet transmission time is assumed to be one unit. Flooding requires at least as many packet transmissions as the spanning tree method and usually many more. In this example, the time required for the broadcast packet to reach all nodes is the same for the two methods. In general, however, depending on the choice of spanning tree. the time required with flooding may be less than with the spanning tree method
370 Routing in Data Networks Chap.5 The price for this saving is the need to maintain and update the spanning tree in the face of topological changes. We note two more uses of spanning trees in broadcasting and routing.Given a spanning tree rooted at the origin node of the broadcast,it is possible to implement flooding as well as routing without the use of sequence numbers.The method is illustrated in Fig.5.8.Note that flooding can be used to construct a spanning tree rooted at a node, as shown in Fig.5.8.Also given a spanning tree,one can perform selective broadcasting, that is,packet transmission from a single origin to a limited set of destinations.For this it is necessary that each node knows which of its incident spanning tree links leads to any given destination (see Fig.5.9).The spanning tree can also be used for routing packets with a single destination and this leads to an important method for bridged local area networks;see the next subsection. Shortest path routing.Many practical routing algorithms are based on the notion of a shortest path between two nodes.Here,each communication link is assigned a positive number called its length.A link can have a different length in each direction. Each path (i.e.,a sequence of links)between two nodes has a length equal to the sum of the lengths of its links.(See Section 5.2 for more details.)A shortest path routing algorithm routes each packet along a minimum length (or shortest)path between the origin and destination nodes of the packet.The simplest possibility is for each link to have unit length,in which case a shortest path is simply a path with minimum number of links (also called a min-hop path).More generally,the length of a link may depend on its transmission capacity and its projected traffic load.The idea here is that a shortest Figure 5.8 Illustration of a method for broadcasting using a tree rooted at the broad- cast's origin node A.Each node knows its unique predecessor (or parent)on the tree, but need not know its successors.The tree can be used for routing packets to A from the other nodes using the paths of the tree in the reverse direction.It can also be used for flooding packets from A.The flooding rule for a node other than A is the following:a packet received from the parent is broadcast to all neighbors except the parent:all other packets are ignored.Thus in the figure,node D broadcasts the packet received from its parent B but ignores the packet received from C.Since only the packets transmitted on the spanning tree in the direction pointing away from the root are relayed further,there is no indefinite circulation of copies,and therefore,no need for sequence numbers. Flooding can also be used to construct the tree rooted at A.Node A starts the process by sending a packet to all its neighbors;the neighbors must send the packet to their neighbors,and so on.All nodes should mark the transmitter of the first packet they receive as their parent on the tree.The nodes should relay the packet to their neighbors only once (after they first receive the packet from their parent):all subsequent packet receptions should be ignored
370 Routing in Data Networks Chap. 5 The price for this saving is the need to maintain and update the spanning tree in the face of topological changes. We note two more uses of spanning trees in broadcasting and routing. Given a spanning tree rooted at the origin node of the broadcast, it is possible to implement flooding as well as routing without the use of sequence numbers. The method is illustrated in Fig. 5.8. Note that flooding can be used to construct a spanning tree rooted at a node, as shown in Fig. 5.8. Also given a spanning tree, one can perform selective broadcasting, that is, packet transmission from a single origin to a limited set of destinations. For this it is necessary that each node knows which of its incident spanning tree links leads to any given destination (see Fig. 5.9). The spanning tree can also be used for routing packets with a single destination and this leads to an important method for bridged local area networks; see the next subsection. Shortest path routing. Many practical routing algorithms are based on the notion of a shortest path between two nodes. Here, each communication link is assigned a positive number called its length. A link can have a different length in each direction. Each path (i.e., a sequence of links) between two nodes has a length equal to the sum of the lengths of its links. (See Section 5.2 for more details.) A shortest path routing algorithm routes each packet along a minimum length (or shortest) path between the origin and destination nodes of the packet. The simplest possibility is for each link to have unit length, in which case a shortest path is simply a path with minimum number of links (also called a min-hop path). More generally, the length of a link may depend on its transmission capacity and its projected traffic load. The idea here is that a shortest A Figure 5.8 Illustration of a method for broadcasting using a tree rooted at the broadcast's origin node A. Each node knows its unique predecessor (or parent) on the tree, but need not know its successors. The tree can be used for routing packets to A from the other nodes using the paths of the tree in the reverse direction. It can also be used for flooding packets from A. The flooding rule for a node other than A is the following: a packet received from the parent is broadcast to all neighbors except the parent; all other packets are ignored. Thus in the figure, node D broadcasts the packet received from its parent B but ignores the packet received from C. Since only the packets transmitted on the spanning tree in the direction pointing away from the root are relayed further, there is no indefinite circulation of copies, and therefore, no need for sequence numbers. Flooding can also be used to construct the tree rooted at A. Node A starts the process by sending a packet to all its neighbors; the neighbors must send the packet to their neighbors, and so on. All nodes should mark the transmitter of the first packet they receive as their parent on the tree. The nodes should relay the packet to their neighbors only once (after they first receive the packet from their parent); all subsequent packet receptions should be ignored
Sec.5.1 Introduction 371 E Figure 5.9 Illustration of sclective broadcasting using a spanning tree.We assume that each node knows which of the incident spanning tree links lead to any given destination.In this example.node A G wishes to broadcast a packet to nodes C. D.E.and F.Node A sends a copy of the CF packet to each of its incident finks that lead to intended destinations.and each copy carries in its header the ID numbers of its F intended destinations.Upon reception of a copy,a node strips the ID number from the header (if it is an intended destination)and repeats the process.This climinates some unnecessary transmissions.for example the transmission on link EG in the figure. path should contain relatively few and uncongested links,and therefore be desirable for routing. A more sophisticated alternative is to allow the length of each link to change over time and to depend on the prevailing congestion level of the link.Then a shortest path may adapt to temporary overloads and route packets around points of congestion.This idea is simple but also contains some hidden pitfalls,because by making link lengths de- pendent on congestion,we introduce a feedback effect between the routing algorithm and the traffic pattern within the network.It will be seen in Section 5.2.5 that this can result in undesirable oscillations.However,we will ignore this possibility for the time being. An important distributed algorithm for calculating shortest paths to a given desti- nation,known as the Bellman-Ford method,has the form Di:=minldi;+D.l where D:is the estimated shortest distance of node i to the destination and d,,is the length of link (j).Each node i executes periodically this iteration with the minimum taken over all of its neighbors j.Thus dij+D,may be viewed as the estimate of shortest distance from node i to the destination subject to the constraint of going through j,and minjdij+Di]may be viewed as the estimate of shortest distance from i to the destination going through the "best"neighbor. We discuss the Bellman-Ford algorithm in great detail in the next section.where we show that it terminates in a finite number of steps with the correct shortest distances under reasonable assumptions.In practice,the Bellman-Ford iteration can be implemented as an iterative process,that is,as a sequence of communications of the current value of D of nodes j to all their neighbors,followed by execution of the shortest distance estimate updates Di:=min,[d+Di].A remarkable fact is that this process is very flexible with respect to the choice of initial estimates D;and the ordering of communications and updates;it works correctly,finding the shortest distances in a finite number of steps,for an essentially arbitrary choice of initial conditions and for an arbitrary order of communications and updates.This allows an asynchronous,real-time distributed implementation of the Bellman-Ford method,which can tolerate changes of the link lengths as the algorithm executes (see Section 5.2.4)
Sec. 5.1 Introduction 371 8 CD--- E / / / / / A G /D " / / " / / " / / " / / " / C CD--- F Figure 5.9 Illustration of selective broadcasting using a spanning tree. We assume that each node knows which of the incident spanning tree links lead to any given destination. In this example. node A. wishes to broadcast a packet to nodes C. D, E, and F. Node A. sends a copy of the packet to each of its incident links that lead to intended destinations, and each copy carries in its header the ID numbers of its intended destinations. Upon reception of a copy, a node strips the lD number from the header (if it is an intended destination) and repeats the process. This eliminates some unnecessary transmissions. for example the transmission on link EO in the figure. path should contain relatively few and uncongested links, and therefore be desirable for routing. A more sophisticated alternative is to allow the length of each link to change over time and to depend on the prevailing congestion level of the link. Then a shortest path may adapt to temporary overloads and route packets around points of congestion. This idea is simple but also contains some hidden pitfalls, because by making link lengths dependent on congestion, we introduce a feedback effect between the routing algorithm and the traffic pattern within the network. It will be seen in Section 5.2.5 that this can result in undesirable oscillations. However, we will ignore this possibility for the time being. An important distributed algorithm for calculating shortest paths to a given destination, known as the Bellman-Ford method, has the form D i := min[di ] + D]1 ] where Di is the estimated shortest distance of node i to the destination and rilj is the length of link (i, j). Each node i executes periodically this iteration with the minimum taken over all of its neighbors j. Thus di ] + D j may be viewed as the estimate of shortest distance from node i to the destination subject to the constraint of going through j, and minjfdij + D j ] may be viewed as the estimate of shortest distance from i to the destination going through the "best" neighbor. We discuss the Bellman-Ford algorithm in great detail in the next section. where we show that it terminates in a finite number of steps with the correct shortest distances under reasonable assumptions. In practice, the Bellman-Ford iteration can be implemented as an iterative process, that is, as a sequence of communications of the current value of D j of nodes j to all their neighbors, followed by execution of the shortest distance estimate updates D i := minjfdij + D j ]. A remarkable fact is that this process is very flexible with respect to the choice of initial estimates Dj and the ordering of communications and updates; it works correctly, finding the shortest distances in a finite number of steps, for an essentially arbitrary choice of initial conditions and for an arbitrary order of communications and updates. This allows an asynchronous, real-time distributed implementation of the Bellman-Ford method, which can tolerate changes of the link lengths as the algorithm executes (see Section 5.2.4)
372 Routing in Data Networks Chap.5 Optimal routing.Shortest path routing has two drawbacks.First,it uses only one path per origin-destination pair,thereby potentially limiting the throughput of the network;see the examples of the preceding subsection.Second,its capability to adapt to changing traffic conditions is limited by its susceptibility to oscillations;this is due to the abrupt traffic shifts resulting when some of the shortest paths change due to changes in link lengths.Optimal routing,based on the optimization of an average delay- like measure of performance,can eliminate both of these disadvantages by splitting any origin-destination pair traffic at strategic points,and by shifting traffic gradually between alternative paths.The corresponding methodology is based on the sophisticated mathematical theory of optimal multicommodity flows,and is discussed in detail in Sections 5.4 to 5.7.Its application to the Codex network is described in Section 5.8. Hot potato (deflection)routing schemes.In networks where storage space at each node is limited,it may be important to modify the routing algorithm so as to minimize buffer overflow and the attendant loss of packets.The idea here is for nodes to get rid of their stored packets as quickly as possible,transmitting them on whatever link happens to be idle-not necessarily one that brings them closer to their destination. To provide an example of such a scheme,let us assume that all links of the communication network can be used simultaneously and in both directions,and that each packet carries a destination address and requires unit transmission time on every link.Assume also that packets are transmitted in slots of unit time duration,and that slots are synchronized so that their start and end are simultaneous at all links.In a typical routing scheme,each node,upon reception of a packet destined for a different node,uses table lookup to determine the next link on which the packet should be transmitted;we refer to this link as the assigned link of the packet.It is possible that more than one packet with the same assigned link is received by a node during a slot;then at most one of these packets can be transmitted by the node in the subsequent slot,and the remaining packets must be stored by the node in a queue.The storage requirement can be eliminated by modifying the routing scheme so that all these packets are transmitted in the next slot;one of them is transmitted on its assigned link,and the others are transmitted on some other links chosen,possibly at random,from the set of links that are not assigned to any packet received in the previous slot.It can be seen that with this modification,at any node with d incident links,there can be at most d packets received in any one slot,and out of these packets,the ones that are transient (are not destined for the node)will be transmitted along some link (not necessarily their assigned one)in the next slot.Therefore,assuming that at most d-k new packets are generated at a node in a slot where k:transient packets are to be transmitted,there will be no queueing, and the storage space at the node need not exceed 2d packets.A scheme of this type is used in the Connection Machine,a massively parallel computing system [Hil85].A variation is obtained when storage space for more than 2d packets is provided,and the transmission of packets on links other than the ones assigned to them is allowed only when the available storage space falls below a certain threshold. The type of method just described was suggested in [Bar64]under the name hot potato routing.It has also been known more recently as deflection routing.Its drawback is that successive packets of the same origin-destination pair may be received out of
372 Routing in Data Networks Chap. 5 Optimal routing. Shortest path routing has two drawbacks. First, it uses only one path per origin-destination pair, thereby potentially limiting the throughput of the network; see the examples of the preceding subsection. Second, its capability to adapt to changing traffic conditions is limited by its susceptibility to oscillations; this is due to the abrupt traffic shifts resulting when some of the shortest paths change due to changes in link lengths. Optimal routing, based on the optimization of an average delaylike measure of perfonnance, can eliminate both of these disadvantages by splitting any origin-destination pair traffic at strategic points, and by shifting traffic gradually between alternative paths. The corresponding methodology is based on the sophisticated mathematical theory of optimal multicommodity flows, and is discussed in detail in Sections 5.4 to 5.7. Its application to the Codex network is described in Section 5.8. Hot potato (deflection) routing schemes. In networks where storage space at each node is limited, it may be important to modify the routing algorithm so as to minimize buffer overflow and the attendant loss of packets. The idea here is for nodes to get rid of their stored packets as quickly as possible, transmitting them on whatever link happens to be idle-not necessarily one that brings them closer to their destination. To provide an example of such a scheme, let us assume that all links of the communication network can be used simultaneously and in both directions, and that each packet carries a destination address and requires unit transmission time on every link. Assume also that packets are transmitted in slots of unit time duration, and that slots are synchronized so that their start and end are simultaneous at all links. In a typical routing scheme, each node, upon reception of a packet destined for a different node, uses table lookup to detennine the next link on which the packet should be transmitted; we refer to this link as the assigned link of the packet. It is possible that more than one packet with the same assigned link is received by a node during a slot; then at most one of these packets can be transmitted by the node in the subsequent slot, and the remaining packets must be stored by the node in a queue. The storage requirement can be eliminated by modifying the routing scheme so that all these packets are transmitted in the next slot; one of them is tnmsmitted on its assigned link, and the others are transmitted on some other links chosen, possibly at random, from the set of links that are not assigned to any packet received in the previous slot. It can be seen that with this modification, at any node with d incident links, there can be at most d packets received in anyone slot, and out of these packets, the ones that are transient (are not destined for the node) will be transmitted along some link (not necessarily their assigned one) in the next slot. Therefore, assuming that at most d - k new packets are generated at a node in a slot where k transient packets are to be transmitted, there will be no queueing, and the storage space at the node need not exceed 2d packets. A scheme of this type is used in the Connection Machine, a massively parallel computing system [HiI85]. A variation is obtained when storage space for more than 2d packets is provided, and the transmission of packets on links other than the ones assigned to them is allowed only when the available storage space falls below a certain threshold. The type of method just described was suggested in [Bar64] under the name hot potato routing. It has also been known more recently as deflection routing. Its drawback is that successive packets of the same origin-destination pair may be received out of