IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 13, NO. 7, SEPTEMBER 1995

Fundamental Design Issues for the Future Internet

Scott Shenker, Member, IEEE

(Invited Paper)

[...] aches of modifying it (in Section [...]), we present [...] aluating network designs (in Section III). We [...] should adopt a new service [...], how this service model should be [...] is service model should [...] s for new [...] o review [...] We do [...] THERE [...] a continue to gro [...] to link a s [...] ss, why [...] chitecture [...] much to e [...] cts of the [...] substantia [...] has a regula [...] featured the [...] tisements [...] "best-effort" service; [...] small cor [...] in terms [...] the network [...] hen, or even if, packets will [...] invested, [...] dominated by [...] of our te [...] (e.g., Telnet), [...] choice [...] DNS), and [...] concern of [...] nd packet [...] decisions [...] nomi [...] future telec [...] This paper discuss [...] decisions that the Inte [...] collapse. [...] future. After briefly re [...] ia appli- [...] d video in [...]

[...] was supported in part by th [...] by Fort Huachuca under C [...]

The author is with the Xerox Palo Alto Res [...] 94304-1314 USA.

IEEE Log Number 9413108.

[...] is typically not done in the applica- [...]

The [...] of hosts increased by 81% in 1994 alone.

[...] protocol TCP. See [25] for discussion [...] [41] and [24] [...]

0733-8716/95$04.00 © 1995 IEEE
that connect these computers, such as the Internet, must be prepared to cope with the traffic emanating from [...] have very different characteristics and requirements than data applications, and thus their emergence is likely to significantly alter the nature of the Internet's traffic load (see [38] for a more complete discussion). In particular, as traditionally implemented these real-time applications are typically less elastic (less tolerant of delay variations) than data applications.

This lack of elasticity causes two problems. First, these traditional implementations of real-time applications do not perform adequately when running over the current Internet because the variations in delay are too extreme and there are too many dropped packets. Second, these applications typically do not back off in the presence of congestion; when these real-time applications are contending for bandwidth with traditional data applications, the data applications end up receiving very little bandwidth. Thus, when deployed in the current Internet, traditional real-time applications not only do not always perform adequately but they also often interfere with the data applications.

One can address these problems without changing the basic Internet architecture by improving application and router implementations. The unfairness that results when congestion-avoidant data applications compete with congestion-ignorant real-time applications can be resolved by using the Fair Queueing packet scheduling algorithm (or something roughly equivalent) in routers. These routers would then ensure that every user had access to their "fair share" of bandwidth, and so data applications would be protected from the real-time ones (see [6] for a discussion of such scheduling algorithms). Such a modification does not require any change to the Internet architecture. While this approach solves the second problem mentioned above it does not solve the first; the service delivered to an individual application can still have substantial delay variance and packet loss, seriously degrading the performance of these traditional implementations of real-time applications.

One can address this problem by modifying the application implementations rather than the network implementation. In recent years there have been tremendous advances, much born [...] sensitive to variations in packet delays: delay adaptive techniques have been highly successful in such Internet applications as nv and vat. These delay adaptive techniques were first introduced many years ago, see [7], [44], but only recently have these adaptation techniques been seen as a general and enduring solution.

[...]ture" approach has several important advantages. No changes are required to any network interfaces, so the change can be incrementally deployed at both the end hosts and the routers. Also, the network mechanisms (Fair Queueing and its relatives) and application mechanisms (delay adaptation) are relatively well understood. However, in this approach the network would deliver the same class of service to all users, with no assurances as to the quality of that service. While the network would protect users from each other, it is up to applications to adjust to the inevitable variations in packet delay and available bandwidth. There are likely to be limitations to this adaptability. Moreover, because there is no admission control the network must be provisioned so that the fair bandwidth shares are not, except in very rare cases, unreasonably small.

As we will explore in this paper, it is not clear that such an approach is desirable; there are other approaches to supporting real-time applications that modify the basic Internet architecture. These modifications usually involve extending the Internet's service model (the set of delivery services) from the single class of best-effort service to include a wider variety of service classes. In addition, one can augment this service model with admission control, which is the ability to turn some flows away when the network is overloaded. In this paper we ask whether or not these architectural modifications are appropriate. Surprisingly, this question has received rather little explicit attention in the literature. The approach which involves only implementation enhancements and not architectural modification, while advocated by several researchers, has not been adequately described in the literature. The approaches which entail major architectural changes have been fully described, and we have included a sampling of this literature in the bibliography. Most of these papers implicitly assume that such modifications are necessary and hence give little justification for the basic approach (this author also pleads guilty to this crime; see [3]); instead the focus in these papers is on the details of the design and comparison with other similar architectures.

[...] basic issues in what has been an underarticulated disagreement about basic architectural assumptions. We hope to provide a framework for considering the various architectural tradeoffs and identify some of the critical assumptions that lead to one choice or another. We do this by presenting a rather abstract formulation and a few rather simple models; this approach is not intended to model reality, but to be the simplest [...] the real issues at stake. Central [...] this endeavor are questions about the nature of future computer applications, the cost of additional network mechanism, the [...]

[...] of real-time applications [...] back of incoming data. [...] involve either fixed [...] under high delay variance.

This approach does not require an architectural modification although [...] discuss rate adaptation again in Section [...] and we will not discuss the [...]

[...] to the current delays and can tolerate fairly large variations in delay without [...]

We will use the term "flow" to refer to the traffic stream representing a particular user or application. Flows can be unicast or multicast [...]
amenable to precise analysis. Consequently, our discussion here is nonrigorous and intuitive; our goal is to articulate these issues, not provide analytical resolution of them. Of course, all of this analysis must start by asking what is the goal of network design.

III. WHAT IS THE GOAL OF NETWORK DESIGN?

By what criteria do we evaluate a particular network architecture? The Internet was designed to meet the needs of users, and so any evaluative criteria must reduce, in essence, to the following question: how happy [...] make the users? Network performance must not be measured in terms of network-centric quantities like utilization, dropped packets, or power, but rather should be evaluated solely in terms of the degree to which the network satisfies the service requirements of each user's applications. For instance, if a particular application cares more about throughput than delay, or vice-versa, the network service to that application should be evaluated accordingly.

We can formalize such a notion of network performance as follows. Let the vector $s_i$ describe the service delivered to the $i$'th application or [...]; $s_i$ contains all relevant measures (delay, throughput, packet drops, etc.) of the delivered service. [...] into the performance of the application; increased $U_i$ reflects improved application performance. The utility function describes how the performance of an application depends on the delivered service. Later, in Section VI, we will discuss the shapes of these utility functions for some common application classes, but for now we introduce them merely for definiti[...]

The goal of network design is to maximize the performance [...] be restated as being, quite simply, to maximize the sum of the [...] of an architecture $V = \sum_i U_i(s_i)$. Much of our subsequent analysis bears on how various design choices affect $V$.

IV. WHY DO WE NEED TO EXTEND THE SERVICE MODEL?

[...] always increase the efficacy of an architecture by supplying more bandwidth; faster speeds mean lower delays and fewer packet losses, and therefore higher utility values. However, one can also increase $V$ by keeping the bandwidth fixed and delivering a wider variety of services than just a single class of best-effort service. Such an extension in the service model, which is the set of services offered by the network, is a fundamental aspect of many of the recently proposed network architectures. Such an extension would allow different applications to get different qualities of service; the general intuition is that the closer the services are aligned with application needs, the higher the efficacy. For instance, if different applications have widely different sensitivities to delay, then offering two priority classes will likely increase the efficacy of the network (when compared to a single class of service).

We can illustrate this by the following simple example, where we compare providing two priority classes with a single class of FIFO service. Consider a network with a single link modeled by an exponential server (of rate $\mu = 1$) and flows modeled by Poisson arrival processes. Consider two types of network clients, with Poisson arrival rates $r_1 = r_2 = 0.25$ and with $U_1 = 4 - 2d_1$ and $U_2 = 4 - d_2$, where $d_i$ represents the average queueing delay delivered to client $i$. Thus, we have [...] sensitivities to delay. If we use FIFO service in the network, then $d_1 = d_2 = 2$ and so $V_{FIFO} = 2$. If we use strict priority service, with preemption, and give client 1 priority, then $d_1 = 4/3$ and $d_2 = 8/3$, and so $V_{priority} = 8/3$. Thus, the strict priority scheduling algorithm is more efficient (delivers a higher value of $V$ at the same bandwidth) than FIFO. In fact, when compared to all possible scheduling algorithms, the strict priority scheduling algorithm gives the most efficient feasible allocation of delay.

Note that in this example a slightly overprovisioned FIFO network ($\mu = 17/16$) has the same value of $V$ as the priority [...] examples, where the delay preferences are more varied (e.g., depending on higher moments of the delay or having a "k[...] the utility function at some value of delay), a much greater [...] to make a FIFO network match the efficacy of a priority network.

[...] other things being equal, one should definitely offer a more varied set of services than just the single class of best-effort service; matching services to application needs enables the network to increase the overall utility. However, there is a trade-off between the cost of adding bandwidth and the cost of adding the extra mechanism needed to extend the service model. Both of these costs are impossible to gauge precisely. The mechanistic aspects are costly not just in control overhead and in the extra complexity required in network components, but also in the disruption caused when changing such a basic aspect of the architecture (later in this section we will explore some of the implications of these changes). The cost of the bandwidth will depend greatly on the nature of the competitive and regulatory environment as well as future technological

[...] the performance of [...] their applications and will use the terms application and user somew[...]

Recall that [...] if we have two priority [...] rates $r_1$ and $r_2$ respectively, then the delays are given by $d_1 = 1/(\mu - r_1)$ and $d_2 = \mu / ((\mu - r_1)(\mu - r_1 - r_2))$.

[...] of how much a user would be [...]
developments, none of which we can accurately foretell. Moreover, the amount of bandwidth needed to offset the benefits of extending the service model depends in detail on the utility functions of applications and the service model being offered. Evaluating this trade-off requires making judgements regarding future developments about which little is known and [...] vary widely.

Nonetheless, despite this uncertainty, at the core of this bandwidth versus mechanism trade-off is the central fact that the timescales of the service requirements of real-time applications are much smaller, by several orders of magnitude, than those of, say, FAX or electronic mail. One cannot operate a network at reasonable utilizations while delivering to all traffic a service suitable for real-time applications; yet the extreme elasticity of FAX and e-mail would be able to utilize the significant amount of leftover bandwidth if the service model could just keep it out of the way of the real-time traffic. For this reason alone it seems plausible, if not probable, that the payoff in terms of the bandwidth saved from offering these multiple classes of service will more than outweigh the costs of the extra mechanism.

Another alternative is to use separate networks for the various applications, each with its own single service [...]. This is similar to what we do now with separate telephony, video, and data network infrastructures. We can use our simple example to examine this possibility. If we double the linespeed, $\mu = 2$, and double the number of applications, the performance numbers become $V_{priority} = 32/3$ and $V_{FIFO} = 10$. Now consider two networks, with separate bandwidths $\mu_1 + \mu_2 = 2$, each carrying two applications. If we divide the clients so that each network carries one application of each type, and the bandwidths are split evenly (which is optimal here), then we revert back to our original case and the most efficient thing is to use priority on each link and achieve a total efficacy of 16/3. If we partition the clients so that one network carries two delay-sensitive applications and the other carries the two less sensitive applications, then the optimal arrangement is to use FIFO on each network and to split the bandwidth as $\mu_1 = 5/2 - \sqrt{2} \approx 1.086$ and $\mu_2 = \sqrt{2} - 1/2 \approx 0.914$, which yields an efficacy of $10 - 4\sqrt{2} \approx 4.343$. Thus, combining the networks into a single infrastructure yields a much higher value for $V$ than using separate networks; in fact, in this simple example the value of $V$ is doubled by combining the networks. This increased efficiency due to statistically sharing a resource is one of the central design principles of data networks. When considering separate networks, there is greater efficacy when the application types are mixed. Thus, the least efficient network design is to build separate networks each with a different application class, and that describes the current situation.

Our analysis of this simple example also reveals two other important points about extending the service model. First, not every client gains directly from the increase in efficacy; if we compare two service allocations $s$ and $s'$, $V(s) > V(s')$ does not imply that $U_i(s_i) > U_i(s'_i)$ for all $i$. For instance, in the simple example with just two clients, $U_2^{FIFO} = 2$ but $U_2^{priority} = 4/3$ even though $V_{priority} > V_{FIFO}$. Efficacy in heterogeneous networks is gained by shifting resources from applications that are not extremely performance-sensitive to those that are: the performance-sensitive clients gain from offering more sophisticated service models, but the less performance-sensitive clients lose; the overall efficacy increases because the losses are smaller than the gains.

Second, in order to achieve the additional efficacy with an extended service model, the mapping between service classes and applications must reflect the application requirements. In our simple example, the increased efficacy of priority service can only be realized if the network can recognize which client is more delay-sensitive and assign it the appropriate service class. Assigning flow 1 to the lower priority class and flow 2 to the higher priority class would result in a lower value for $V$; in fact, it yields $V = 4/3$ (which is 1/2 of the optimal value). We will return to these two points in Section V.

V. WHO CHOOSES THE SERVICE FOR A FLOW?

At this point, we have argued that the network should offer a service model that includes more than just the single class of best-effort service. This service model could be as simple as two priority levels, or as complicated as the multiple delay bounded service classes described in [3]. However, we have yet to address one fundamental question which applies to all of these possible service models: how does the architecture decide which service to give a particular flow? There are essentially two possible answers to this question: the flow can pick the service, or the network can pick the service. We now contrast these two options.

A. "Implicitly Supplied" versus "Explicitly Requested"

If the network chooses the service class, then we say that the service is implicitly supplied; the application sends its packets without saying anything about its service requirements and the network then classifies these packets into some service class and handles them accordingly. For instance, the network might divide all traffic into the categories of asynchronous bulk, interactive bulk, interactive burst, and real-time [...] port number, and then deliver to each category an appropriate service. The method by which the categorization is done is not important here; we care only that the categorization is under the control of the network and not the application.

[...] the near [...] doubtful that, when considering the network as a whole, bandwidth will be [...] important to note that if indeed [...] For instance, it might then be preferable to merely add bandwidth [...] extend the service model.

Using the formulas for delay [...] and the fact that $\mu_1 + \mu_2 = 2$, we can solve for the optimal value of $\mu_1$ and [...]

[...] throughout this paper, in this section the larceny [...] egregious.
This approach has some important advantages. Most notably, this approach does not require any change to the service interface. Currently, applications can just send their packets without any negotiation with the network, and this will continue to be true with an extended service model if the service is supplied implicitly; applications would not need to specify their desired service to the network, and the network would, in turn, not need to describe the service-to-be-delivered to the application. Also, since there is no explicit commitment to a given service level, the mapping from application to service class, and the nature of the service delivered to each service class, need not be uniform across all routers nor stable over time. Since applications would not have to change, and the service given by the routers would not have to be standardized, such an architecture could be deployed immediately and then modified incrementally.

Weighing against these powerful advantages are some practical disadvantages. The implicit approach entails a fixed set of application classes: at any given time a router only knows about a certain set of applications and cannot properly service those applications about which it does not know. Moreover, this approach cannot accommodate individual or situational variations within a single application. For instance, if one uses the same audio or video application for both interactive (which needs low delay and low jitter) and lecture (which can tolerate large delays and jitter) settings, then there is no way for the network to distinguish between these two cases; this leads to inefficiencies because the network must give service appropriate for the interactive mode and thus the application gets better service than it needs when it is being used in lecture mode.

These two practical problems are symptoms of a fairly basic architectural flaw in the implicit approach. The implicit approach can only work if the network knows something about the service requirements of each application. New applications cannot get the service they require if the network does not know what their service requirements are. Embedding application information into the network layer information violates one of the central design tenets of the Internet. The internetwork layer (the IP layer [35]) was designed to provide a clean interface between networks and applications. The fact that any application capable of running on top of IP could run on any network that supported IP encouraged diversity both above and below the IP layer. A wide variety of networking (or, more correctly, subnetworking) technologies have emerged (e.g., Ethernet, Token Ring, ATM, etc.) and a wide and ever-increasing variety of applications have flourished. This clean separation is markedly different from the telephony and cable infrastructures, which are more focused on a single application and a uniform network technology. Experience suggests that violating the clean separation embodied in the IP layer by embedding application information in the network would inhibit the creation of new applications. Thus, the implicit approach, while certainly attractive in the short-term, has serious long-term deficiencies.

The alternative is to have applications explicitly request the service they desire. That is, the network offers a set of service classes and the applications indicate to the network which service class they want. This approach has the advantage that it maintains the clean architectural interface between applications and networks, so any future application can be serviced in the desired service class. However, this approach does have some unfortunate disadvantages. The first disadvantage is the incentive issues that are raised, and the second is the lack of flexibility in the service model. We address these two problems separately.

B. Incentives

Recall that in Section IV we made two observations. First, we observed that the optimal efficacy (the maximal $V$) was only achieved when low priority service was given to the flow with less stringent delay requirements. Second, we observed that not all applications benefit directly from this increase in $V$; those asking for lower priority service received worse service. In the explicit approach, this mapping between the application and service class is under control of the applications themselves and therefore ultimately under control of the user. We can achieve optimal efficacy only if some applications ask for lower quality service. What will motivate users to do this: why won't users always ask for the highest quality service no matter what their application requirements? Certainly when the Internet had a small user population with a strong sense of tradition and camaraderie, informal social conventions would have been sufficient to induce users to behave properly. However, in the public Internet of the future, as with any heavily utilized public facility, informal social conventions will not be sufficient to discourage selfish behavior. Thus, the network must provide some other system of incentives to encourage users to request the proper service classes for their applications. Pricing of network services is one approach. Charging more for the higher quality service will ensure that only the extremely performance-sensitive applications will request that service. As discussed in [4], [37], pricing can be used to spread the benefits of increasing $V$ to all applications; for some applications, the reduction in the quality of service is compensated by the reduction in price, and for others the

[...] situational variation. An individual variation would be [...] delay or jitter than another user [...] rather broad then this is unlikely to be [...]

[...] that the service request made by the [...] general fashion: service requests are embedded into the application itself (much as the [...] can be used as quality of service signals [...] in the short-term, is unlikely to be adopted as an architectural principle [...]
increase in price is compensated by the increase in the quality of service.

[...] most users of the Internet do not have to worry about their usage of the network incurring additional costs (although users connected to the Internet through a public access provider sometimes do have usage-based charging). The introduction of pricing, especially usage-based pricing, into the Internet will involve major changes in both design and culture. While most popular user interfaces hide the details of network activity, once charging is in place the user interface will probably need to reveal the costs to the user. Also, the basic network architecture must incorporate sufficient capabilities to do the requisite authentication, accounting, and billing. Perhaps most importantly, such charging could alter the "gift economy" and browsing mentality that exists in the Internet today. The Internet is such an exciting development largely because of the cornucopia of information and resources freely available; unfortunately, once there are charges for network usage users will be less likely to disseminate information widely. Moreover, many users spend hours browsing through the Internet, much as one browses through a bookstore; if [...] take books off of a shelf in a bookstore, such browsing would be seriously curtailed. Thus it appears that neither the user interfaces nor the basic architecture are ready to support such pricing, and it is also not clear that the current Internet culture could survive its introduction.

Do these dire consequences mean that we should not extend the service model? This is certainly a debatable point, and one that should be debated more extensively than it has. It should be noted that while extending the service model raises the issues of incentives when deciding how to send data, even in the case of a single class of best-effort service we need to address the incentives of whether to send data. Currently, [...] but rough adherence to rules of etiquette, along with adequate provisioning, have kept the Internet in relatively good shape. However, once "junk mailing" becomes commonplace and automatic browsers (agents automatically browsing the Internet and retrieving anything that looks interesting) [...] widely deployed, the Internet will suffer. Even in the absence [...] to discourage or at least prioritize use: MacKie-Mason and [...] Thus, while usage-based pricing may have undesirable consequences, we need to confront its existence regardless of our decisions about extending the service model.

We should also note that there are pricing mechanisms which would have less of a negative impact, and that there are incentive schemes which do not rely primarily on pricing. For instance, quotas could be applied to an institution at its access point to the Internet, and then the issue of allocating [...] becomes a local problem for that institution which, in [...], could [...] through informal social conventions. Also, if the network offered reservations for real-time applications, then usage pricing could be centered on the high quality real-time services and not applied, at least in the near term, to the lower quality services. Moreover, much of the authentication and accounting infrastructure for this charging could be added along with the reservation mechanism, and so the best-effort architecture could be left relatively intact. This would leave undisturbed the cultural aspects of the current best-effort Internet while charging a premium for high quality video and audio connections.

Clearly, these incentive issues are extremely important and many issues remain unresolved. See [4], [8], [30], [37] for further discussion of pricing and incentive issues.

C. Stability

The other implication of the explicit approach is that the network service offerings must be known to applications. Applications must know the set of services in order to ask for the service, and they must know the characterization of the delivered service (if any) in order to decide which service best meets their needs. Since knowledge of the service model must be embedded in applications, the service model must remain stable, though extensible. That is, it can be extended further but the services that are already in place cannot be easily altered because this would interfere with the installed application base.

The service model serves as the [...] of network service that applications can be programmed against. Because of this, it is the most fundamental, and most stable, aspect of the network architecture. The underlying network technologies can change, and even IP can change, without disturbing [...] of the service model is both an advantage and a disadvantage. It is an advantage because the service model then provides a useful abstraction of network service. It is a disadvantage because if the initial service offerings are not well designed it is much more painful and disruptive to change them; in the implicit approach, such incremental adjustments would be [...]

In the explicit approach, the service model is not [...] it is uniform. That is, there is a single IP service [...] opposed to the implicit case where each router [...] different set of service offerings. This uniformity imposes a standardization requirement on network routers and subnets. [...] of router and subnet standards, call them network element requirements, must be developed so that any concatenation of routers and subnets obeying these standards can support the end-to-end service offering advertised by IP. Note that the service model is a set of end-to-end services and it is up to the network to ensure that the services delivered at

Since one could have [...] this additional complication does not alter the points we are making.
each link along the data path combine to support the offered [...]

One of the aspects of the Internet architecture that con[...]cess is its ability to accommodate [...]nology can support IP, because IP does not require any [...] does not mandate when or even if packets arrive. If the service model is extended in any nontrivial manner, and all routers were required to support all services in the service model, then we could end up in a situation where many subnet technologies (e.g., Ethernets) could no longer support IP; this would seriously hinder the extension of IP connectivity and fragment the Internet. This eventuality can be avoided by allowing routers to support subsets of the service model. The definition of the service model is still uniform, in that every router agrees on the definitions of the various services, but the deployment of the various services is not necessarily uniform. In this way, we can retain the property that virtually all subnet technologies can support IP and thus widespread IP connectivity can be maintained. More demanding services can be gradually deployed as the Internet infrastructure is slowly upgraded.

However, the uniformity of the service model does mean [...] subnet technologies (e.g., 802.X, ATM, etc.) must come to[...] network element requirements. While not a technical [...] this is indeed a severe organizational and political [...] considering the vastly different design traditions and assumptions in the various communities.

D. Link-Sharing and Other Implicit Services

Our analysis above, which focused on the service [...] by individual applications, suggests that the servic[...] requested explicitly by applications. However, there are exceptions to this general rule because there are some cases where we are more concerned with the service given to an aggregate of flows. For instance, a company may want to ensure that its aggregate traffic always has access to a certain amount of bandwidth along a path between two different locations. Or a university may want to make sure that bandwidth on the access line to the Internet is evenly split between various departments. One can use packet scheduling algorithms to accomplish this [...] The point here is that one cannot request access to these [...] department is something over which you have no control. This kind of service is indeed implicitly supplied. Because it is not designed to satisfy the detailed quality of service requirements of the individual users, but rather is intended [...] this service is not subject to the arguments we made above [...] implicit services can also be used as a form of network management, for example dividing bandwidth between different protocol families. Because organizational and other collective service requirements are important, we expect that link-sharing and other implicitly supplied services will eventually play an important role in the future Internet architecture. These implicitly supplied services can be incrementally developed and deployed, and so it is not as crucial that these parts of the architecture be immediately addressed.

VI. DO WE NEED ADMISSION CONTROL?

We have argued that the Internet should extend its service model, but we have not yet discussed what services should be added. Some services, like best-effort priority levels, do not carry any quantitative characterization of delay or bandwidth and allow all traffic to be admitted much like in today's Internet. [...] characterizations. Such quantitative services require explicit [...] must turn away additional flows when admitting them would lead to violating its quantitative service commitments. Thus, a key architectural decision to make is: should the network ever deny access to flows? In keeping with our assertion in Section III, the answer depends on which choice maximizes the efficacy $V$ of the network.

Admission control is typically used to prevent the network from becoming overloaded. One can also prevent overloading by overprovisioning the network. We first discuss what overloading means in the context of our utility function formulation, and then we address the extent to which overprovisioning could be used as a substitute for admission control.

A. Overloading

We typically think a network is overloaded if the delays are large and packets are being dropped. However, this perspective only addresses the service seen by individual [...] Recall that our goal is to maximize the sum of the utilities. While [...] the utility of the existing flows, the question we need to ask [...] Thus, one can define a network to

[...] like priority service, the ma[...] between the end-to-end [...] delay service, this mapping is much more difficult. See [40] for a [...]

An individual [...] to which one to associate her traffic, but that is different than being [...] arbitrarily assert membership in any [...]

[...] services cannot, by themselves, provide [...] In [40] we discuss ways in which the [...]

[...] because of the effect on later admission control decisions, but we do not address that level of complexity in our [...]

[...] competitors to IP, like IPX, DECnet, and ATM, also need to come together and agree on a service model.
be overloaded if the value of $V$ goes up when one or more flows are removed.²⁷

Fig. 1. Utility (performance) of an elastic application as a function of bandwidth.

Fig. 2. Utility (performance) of a hard real-time application as a function of bandwidth.

Let us consider the following Gedanken experiment, in which an unlimited set of potential network users each use the same application on a network with a single congested link with bandwidth $B$. Since all applications are identical, and there is a single link, the service delivered to each application is merely a function of the bandwidth share allocated to each user; thus, we can use the simplifying notation $U(x)$. We are interested in the behavior of the function $V(n) = nU(B/n)$. The question of admission control, which is often mired in ideological differences, here reduces to the following simple mathematical question: for which value of $n$ is the function $V(n)$ maximized? If $V(n)$ is always increasing and takes its maximal value at $n = \infty$ then, by our definition above, the network is never overloaded and we need not include admission control in the architecture. If $V(n)$ is maximized at some finite $n$ then the network can overload; one then needs to decide whether to use admission control as a method of avoiding such overloads.

Assuming that $U(x)$ is a nondecreasing function, we can make the following two statements about $V(n)$. First, if there exists some $\epsilon > 0$ such that the function $U(x)$ is convex but not concave (i.e., not linear) in the neighborhood $(0, \epsilon)$, then there exists some $\hat{n}$ such that $V(\hat{n}) > V(n)$ for all $n > \hat{n}$. In this case, the network is overloaded whenever $n > \hat{n}$. Second, if the function $U(x)$ is everywhere strictly concave, then $V(n)$ is a strictly monotonically increasing function of $n$; in this case, the network is never overloaded. For example, if $U(x) = x^p$ then $V(n) = nU(B)n^{-p}$.
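As a rough numerical check of the power-law example, the sketch below evaluates $V(n) = nU(B/n)$ for $U(x) = x^p$ with one exponent above 1 and one below 1. The function names (`total_utility`, `best_n`, `power_utility`), the choice $B = 1$, the sample exponents, and the finite search cap `N_MAX` are illustrative assumptions, not part of the original analysis.

```python
def total_utility(utility, bandwidth, n):
    """V(n) = n * U(B / n): n identical flows share one link equally."""
    return n * utility(bandwidth / n)


def best_n(utility, bandwidth, n_max):
    """Return the n in 1..n_max that maximizes V(n) (brute-force search)."""
    return max(range(1, n_max + 1),
               key=lambda n: total_utility(utility, bandwidth, n))


def power_utility(p):
    """U(x) = x**p; convex for p > 1, strictly concave for 0 < p < 1."""
    return lambda x: x ** p


B = 1.0        # link bandwidth (arbitrary units, assumed for illustration)
N_MAX = 1000   # finite search cap standing in for an unbounded n

convex_best = best_n(power_utility(2.0), B, N_MAX)    # p > 1
concave_best = best_n(power_utility(0.5), B, N_MAX)   # p < 1

print("p = 2.0: V(n) is largest at n =", convex_best)
print("p = 0.5: V(n) is largest at n =", concave_best)
```

Under these assumptions the search settles on a single flow for the convex exponent and runs into the cap for the concave one, since a brute-force search can only approximate an unbounded $n$.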
For $p > 1$, $V(n)$ is maximized at $n = 1$ whereas for $p < 1$, $V(n)$ is maximized at $n = \infty$. Thus, the issue of overloading depends in detail on the shape of the utility curves. We now discuss a few sample application classes and their utility functions.

²⁷ A network that is not overloaded according to our technical definition [...] is, even though the network [...] does not address the question of what loading levels offer the best cost/performance trade [...]

Recall that for our simple Gedanken experiment we are describing the service solely in terms of the bandwidth share. The functions $U_i$ are then functions of a single variable and can be more easily analyzed. Our question is: what do typical utility functions look like? Since there is little hard evidence for their exact shape (see [45]), we will only conjecture about their qualitative properties.

Traditional data applications like file transfer, electronic mail, and remote terminal are rather tolerant of delays. On an intuitive level they also would appear to have decreasing marginal improvement due to incremental increases in bandwidth. We will call such applications elastic applications, and their utility functions look something like that in Fig. 1. Here there is a diminishing marginal rate of performance enhancement as bandwidth is increased, so the function is strictly concave everywhere. $V$ is always maximized when no users are denied access. For this class of applications, admission control has no role. This analysis reaffirms the original design choice of best-effort service for the Internet's architecture.

At the other extreme of delay sensitivity are applications with hard real-time requirements. These applications need their data to arrive within a given delay bound; the application does not care if packets arrive earlier, but the application performs very badly if packets arrive later than this bound. Examples of such applications are link emulation, traditional telephony, and other applications that expect circuit-switched service. For applications with hard real-time requirements, the utility curves look like the one in Fig. 2: while the delay bounds are being met the application performance is constant, but as soon as the bandwidth share drops below that needed to meet the required delay bounds, the performance falls sharply to zero. A system with such applications becomes overloaded as soon as the bandwidth share falls below the critical level. Applications with hard real-time requirements would function much better in a network using admission control to ensure that the bandwidth shares never fell below the critical level.

Traditionally video and audio applications have been designed with hard real-time requirements. However, as the current experiments on the Internet have dramatically shown, most audio and video applications can be implemented to be rather tolerant of occasional delay bound violations and dropped packets. However, such applications have an intrinsic bandwidth requirement because the data generation rate is independent of the network congestion. Thus, the performance degrades badly as soon as the bandwidth share becomes
smaller than the intrinsic generation rate. For delay-adaptive audio and video applications, the utility function curves might look something like Fig. 3.

Fig. 3. Utility (performance) of a delay-adaptive real-time application as a function of bandwidth.

Note that the drop-off in performance is not nearly so sharp as with hard real-time applications, but the general shape of the curve is very similar. In particular, this utility function is convex but not concave in a neighborhood around zero. Thus, the network can become overloaded with such applications; the exact location of the overloading point will depend on the particular shape of the utility curve around the inflection point.

This analysis suggests why the Internet community and the telephony community have been at an impasse for years over the issue of best-effort versus real-time service. The Internet community built a network to support data transfer applications that fall into the elastic category. For them, the decision to offer best-effort service was the natural and correct one. It is only now that video and other real-time applications are being widely used on the Internet, and with their appearance are coming calls for admission control. Similarly, the telephony community built a network around an application (voice) with hard real-time requirements. When the community designed ATM to service these applications, the decision to offer real-time service and use admission control was again the natural and correct choice. Now that data services are being contemplated for ATM networks, the idea that ATM should offer best-effort has properly arisen.²⁸

There is another class of real-time applications. Rate-adaptive applications [...] rate²⁹ in response to network congestion. This adjustment keeps the delays moderate no matter what the bandwidth share. Thus, the performance of the application depends completely on the quality of the signal. Certainly at high bandwidths the marginal utility of additional bandwidth is very slight because the signal quality is much better than humans need. It also appears that at very small bandwidths, the marginal utility is very slight because the signal quality is [...]bly low (see [45] for some human-factors studies). The curves take the shape in Fig. 4.

Fig. 4. Utility (performance) of a rate-adaptive real-time application as a function of bandwidth.

Similar to the regular video utility functions, these utility functions are convex but not concave in a neighborhood around zero, so the network can become overloaded with such applications. However, the overloading point is much smaller than those of regular delay-adaptive applications. The overloading point of delay-adaptive applications is tied to the bandwidth consumption in the normal case, whereas the overloading point of rate-adaptive applications is tied to the bandwidth consumption of the minimally acceptable signal quality.

We should note that our simple Gedanken experiment can be generalized to mixtures of different types of applications with the same conclusions; the treatment is complicated by the need to describe the relative allocation of bandwidth to the various different applications, but the general conclusion that the overloading criterion depends on the curvature of the utility functions remains unchanged.

This Gedanken experiment suggests that for a network with only traditional data applications, efficacy is maximized by accepting all flows. However, when there are real-time applications, whether hard real-time, or delay-adaptive, or even rate-adaptive, then efficacy is maximized when some flows are turned away. We now address the question: should one build an architecture that includes admission control, or should one overprovision the network so that overloading rarely occurs?

²⁸ [...] applications were made [...]

²⁹ Whether or not the adjustment is actually done at the source, or done by [...] be the only reasonable design choice for multicast flows. The only [...] may [...] essential [...] available bandwidth. In addition, we are assuming that all rate-adaptive applications are also delay-adaptive.

B. Overprovisioning

If one could cost-effectively overprovision the network so that under natural conditions the offered load rarely exceeded the overloading point, then one might choose to do that rather than include admission control in the architecture. We don't require that overloading never occurs, we merely require that [...] time between failures on that link; then, overloading is just another failure mode. We are not asking if there are individual links that can be overprovisioned; undoubtedly there are. Rather, since the IP architecture is uniform, we are asking
if it is cost-effective to overprovision the entire network. This question can be addressed [...] different [...] short-term and long-term. The analysis of both cases [...] on speculations about future developments and is [...]

In the short-term, say in the next five years or so, demand for high-bandwidth video will increase rapidly as fast LAN's are deployed and workstation video technology improves. Multimegabit video streams will become commonplace in many environments and pockets of high bandwidth users are likely to develop. However, the access lines between these pockets and the rest of the Internet are likely to remain comparatively slow and will thus frequently be overloaded. This isn't a technical problem about providing bandwidth, but merely an economic one. The cost of a high-speed LAN will be much less than the cost of a high-speed access line into the Internet, and for workgroups it is much more important to have high-speed connectivity to coworkers than to the outside world. Clearly in the short-term there is no hope of overprovisioning everywhere.

The long-term analysis is considerably less clear. By long-term we mean when networking is a mature and competitive industry and workstation technology has progressed to a point where getting high data rates from workstations onto networks is no longer a significant bottleneck. The phone network in the United States is an example of a network that, in its mature state, has been successfully overprovisioned: the rate of call [...] is extremely low in most areas. This overprovisioning is now part of users' expectations: users in the United States would probably be extremely dissatisfied with a telephone network that had significant levels of blocking. If one provisioned the Internet so that admission control rejected requests very rarely, then one needn't have implemented admission control in the first place. Can the Internet follow the example of the telephone network?

There are some important differences between the Internet and the phone network. A user of the phone network can [...] place a phone call. The bandwidth usage is [...] and the invocation of the call usually requires human action. Both of these factors limit the variability in aggregate telephone usage. In contrast, the future Internet load will be much more variable. Video flows will range in bandwidth from a few tens of kb/s to perhaps as high as 100 Mb/s, and some other applications such as data collection from remote sensors may reach even higher bandwidth rates. Recall that in the central limit theorem, which says that the sum of individual distributions tends toward a Gaussian, the variance of the [...] means that the aggregate usage will be more variable. In addition, because computer communications often do not involve humans (whose locations are fairly stable) on both ends, we expect their usage pattern to be much more unpredictable: for instance, the migration of a popular on-line video repository could cause a major shift in traffic patterns.

The higher variation of Internet traffic loads is the key reason why we think [...] overprovisioning of the Internet will not be a cost-effective solution.³⁰ To see this more clearly, we will make a somewhat artificial distinction between normal usage and leading edge usage.

The term normal usage will refer to those flows of moderate bandwidth; let's say something less than about 1 Mb/s. This will encompass digital telephony, relatively low quality video, and many other applications. The demand from such uses will probably have a variance that is small compared to the [...] this is what the telephone network has [...] For such usage patterns, provisioning the link so that overloads are rare requires a relatively small percentage increase in the capacity. Moreover, when so provisioned, the utilization is still moderately high.³¹ We expect most normal users would willingly pay the extra expense to overprovision in return for having a very low blockage rate.

The term leading edge usage will refer to those flows with extremely high bandwidth usage. Here, the variance in demand is large compared to the average demand. Overprovisioning requires a large percentage increase in the capacity of the link, and would result in low average utilization levels. Since we expect that a leading edge user might consume as much as 1000 times the bandwidth of the average user, these leading edge users will always be able to make a large impact on the network even if they make up a small fraction of the total population.

A simple example can illustrate the difference in variance between these two kinds of usage. We consider only real-time traffic, so each flow has to establish a reservation. The demand will be modeled by a Poisson stream of flows with arrival rate $\lambda$ (this is the arrival rate of newly established flows, not the packet arrival rate). Each flow has an exponentially distributed holding time with average holding time $\mu^{-1}$. Define $\rho = \lambda/\mu$. To model normal usage, we assume that each flow consumes a unit amount of bandwidth. Thus, the probability that the aggregate bandwidth consumption is greater than some capacity $C$ (assumed, for [...], to be an integer) is given by $\rho^C$. To model leading edge usage, we use a different arrival rate $\hat{\lambda}$ but the same holding time distribution; define $\hat{\rho} = \hat{\lambda}/\mu$. We further assume that each leading edge flow consumes $L$ units of bandwidth. Define the capacity $\hat{C}$ such that the probability of overflow (above capacity $\hat{C}$) in this leading edge system is the same as the probability of overflow (above capacity $C$) in the normal usage system. Then, $\hat{C} = LC \ln\rho / \ln\hat{\rho}$. Note that when we fix the leading edge usage to be some fraction $r$ of normal usage, $L\hat{\rho} = r\rho$, and we let the size of the leading edge jobs grow, then asymptotically $\hat{C} \approx CL \ln(1/\rho) / \ln L$. If we use the values $\rho = 0.5$, $L = 100$, and $r = 0.1$, then $\hat{C} \approx 9.12C$. Thus, the capacity [...] the capacity needed to overprovision the normal system, even

³⁰ There has been recent work [...] that many aspects of comp[...] would make it more likely that the [...]

³¹ [...] we are using vague terms like "moderately high" and "relatively small" because our main point is to compare this case to the leading edge case below.
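The capacity comparison can be reproduced with a few lines of arithmetic. The sketch below assumes an overflow probability of $\rho^C$ for unit-bandwidth flows and, by the same reasoning, $\hat{\rho}^{\hat{C}/L}$ for flows of size $L$, with $L\hat{\rho} = r\rho$; equating the two gives $\hat{C}/C = L \ln\rho / \ln\hat{\rho}$. The function name `capacity_ratio` and its parameter names are hypothetical, chosen only for this illustration.

```python
import math


def capacity_ratio(rho, flow_size, fraction):
    """Return C_hat / C, the capacity multiplier for leading edge usage.

    Assumes overflow probability rho**C for unit-size flows and
    rho_hat**(C_hat / flow_size) for flows of size `flow_size`, where
    flow_size * rho_hat = fraction * rho. Equating the two gives
    C_hat / C = flow_size * ln(rho) / ln(rho_hat).
    """
    rho_hat = fraction * rho / flow_size
    return flow_size * math.log(rho) / math.log(rho_hat)


ratio = capacity_ratio(rho=0.5, flow_size=100, fraction=0.1)
print(f"C_hat / C = {ratio:.2f}")  # the text reports roughly 9.12

# The multiplier keeps growing as leading edge flows get larger, even
# though their offered load stays at the same fraction of normal load.
for size in (10, 100, 1000):
    print(size, round(capacity_ratio(0.5, size, 0.1), 2))
```

The loop is only meant to show the trend in $L$; the specific flow sizes are arbitrary sample values.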