Resilient Overlay Networks

David Andersen, Hari Balakrishnan, Frans Kaashoek, and Robert Morris
MIT Laboratory for Computer Science
ron@nms.lcs.mit.edu
http://nms.lcs.mit.edu/ron/

Abstract

A Resilient Overlay Network (RON) is an architecture that allows distributed Internet applications to detect and recover from path outages and periods of degraded performance within several seconds, improving over today’s wide-area routing protocols that take at least several minutes to recover. A RON is an application-layer overlay on top of the existing Internet routing substrate. The RON nodes monitor the functioning and quality of the Internet paths among themselves, and use this information to decide whether to route packets directly over the Internet or by way of other RON nodes, optimizing application-specific routing metrics.

Results from two sets of measurements of a working RON deployed at sites scattered across the Internet demonstrate the benefits of our architecture. For instance, over a 64-hour sampling period in March 2001 across a twelve-node RON, there were 32 significant outages, each lasting over thirty minutes, over the 132 measured paths.
RON’s routing mechanism was able to detect, recover, and route around all of them, in less than twenty seconds on average, showing that its methods for fault detection and recovery work well at discovering alternate paths in the Internet. Furthermore, RON was able to improve the loss rate, latency, or throughput perceived by data transfers; for example, about 5% of the transfers doubled their TCP throughput and 5% of our transfers saw their loss probability reduced by 0.05. We found that forwarding packets via at most one intermediate RON node is sufficient to overcome faults and improve performance in most cases. These improvements, particularly in the area of fault detection and recovery, demonstrate the benefits of moving some of the control over routing into the hands of end-systems.

This research was sponsored by the Defense Advanced Research Projects Agency (DARPA) and the Space and Naval Warfare Systems Center, San Diego, under contract N66001-00-1-8933.

1. Introduction

The Internet is organized as independently operating autonomous systems (AS’s) that peer together. In this architecture, detailed routing information is maintained only within a single AS and its constituent networks, usually operated by some network service provider. The information shared with other providers and AS’s is heavily filtered and summarized using the Border Gateway Protocol (BGP-4) running at the border routers between AS’s [21], which allows the Internet to scale to millions of networks.

This wide-area routing scalability comes at the cost of reduced fault-tolerance of end-to-end communication between Internet hosts. This cost arises because BGP hides many topological details in the interests of scalability and policy enforcement, has little information about traffic conditions, and damps routing updates when potential problems arise to prevent large-scale oscillations.
As a result, BGP’s fault recovery mechanisms sometimes take many minutes before routes converge to a consistent form [12], and there are times when path outages even lead to significant disruptions in communication lasting tens of minutes or more [3, 18, 19]. The result is that today’s Internet is vulnerable to router and link faults, configuration errors, and malice; hardly a week goes by without some serious problem affecting the connectivity provided by one or more Internet Service Providers (ISPs) [15].

Resilient Overlay Networks (RONs) are a remedy for some of these problems. Distributed applications layer a “resilient overlay network” over the underlying Internet routing substrate. The nodes comprising a RON reside in a variety of routing domains, and cooperate with each other to forward data on behalf of any pair of communicating nodes in the RON. Because AS’s are independently administered and configured, and routing domains rarely share interior links, they generally fail independently of each other. As a result, if the underlying topology has physical path redundancy, RON can often find paths between its nodes, even when wide-area Internet routing protocols like BGP-4 cannot.

The main goal of RON is to enable a group of nodes to communicate with each other in the face of problems with the underlying Internet paths connecting them. RON detects problems by aggressively probing and monitoring the paths connecting its nodes. If the underlying Internet path is the best one, that path is used and no other RON node is involved in the forwarding path. If the Internet path is not the best one, the RON will forward the packet by way of other RON nodes. In practice, we have found that RON can route around most failures by using only one intermediate hop.
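The choice between the direct Internet path and a path through one intermediate RON node can be illustrated with a minimal sketch. It is not the RON implementation: it assumes a single metric (latency in milliseconds), a fully known matrix of probe results, and a hypothetical function name `best_path`. An outage is modeled as infinite latency.

```python
import math

def best_path(latency, src, dst):
    """Pick the direct path or the best path through one intermediate node.

    latency[i][j] is the probed latency from node i to node j (hypothetical
    input format); math.inf marks a path that probes report as down.
    Returns (list_of_nodes, total_latency).
    """
    best = ([src, dst], latency[src][dst])
    for hop in range(len(latency)):
        if hop in (src, dst):
            continue
        cost = latency[src][hop] + latency[hop][dst]
        # Strict comparison: the direct path is kept unless a detour is better.
        if cost < best[1]:
            best = ([src, hop, dst], cost)
    return best

# Hypothetical 3-node overlay in which the direct path 0 -> 1 is down.
latency_ms = [
    [0, math.inf, 20],
    [math.inf, 0, 30],
    [20, 30, 0],
]

print(best_path(latency_ms, 0, 1))  # ([0, 2, 1], 50)
print(best_path(latency_ms, 0, 2))  # ([0, 2], 20)
```

In this sketch the direct path wins ties, so no other node is involved unless a detour is strictly better, which mirrors the behavior described above. A real overlay would combine several metrics and work from noisy, time-varying probe data.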
RON nodes exchange information about the quality of the paths among themselves via a routing protocol and build forwarding tables based on a variety of path metrics, including latency, packet loss rate, and available throughput. Each RON node obtains the path metrics using a combination of active probing experiments and passive observations of on-going data transfers. In our implementation, each RON is explicitly designed to be limited in size (between two and fifty nodes) to facilitate aggressive path maintenance via probing without excessive bandwidth overhead. This