7.1 DHTs and directory services

A distributed hash table (DHT) exposes two basic functions to the application: put(key, value) stores a value at the specified key ID; get(key) returns this stored value, just as in a normal hash table. Most DHTs use a key-based routing layer—such as CAN [25], Chord [31], Kademlia [17], Pastry [26], or Tapestry [35]—and store keys on the node whose ID is closest to the key. Keys must be well distributed to balance load among nodes. DHTs often replicate multiply-fetched key/value pairs for scalability, e.g., by having peers replicate the pair onto the second-to-last peer they contacted as part of a get request.
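As a concrete illustration of this interface, the toy sketch below (all names hypothetical; the whole routing layer is collapsed into one process) stores each key on the node whose ID is closest to the key's hash, using XOR as the closeness metric in the style of Kademlia [17]; Chord would instead measure clockwise distance on a ring, but the put/get contract is the same:

```python
import hashlib

def node_id(name: str) -> int:
    # Derive a 160-bit ID from a name, as many DHTs do with SHA-1.
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

class ToyDHT:
    """Toy stand-in for a DHT: every key lives on the node whose ID
    is closest to the key's hash. XOR distance is the Kademlia
    metric; Chord would use clockwise ring distance instead."""

    def __init__(self, node_names):
        # node ID -> that node's local hash table
        self.nodes = {node_id(n): {} for n in node_names}

    def _closest(self, key_id: int) -> dict:
        owner = min(self.nodes, key=lambda nid: nid ^ key_id)
        return self.nodes[owner]

    def put(self, key: str, value) -> None:
        self._closest(node_id(key))[key] = value

    def get(self, key: str):
        # A real DHT might additionally cache the pair on the
        # second-to-last peer contacted during this lookup,
        # spreading the read load for popular keys.
        return self._closest(node_id(key)).get(key)

dht = ToyDHT(["a.example", "b.example", "c.example"])
dht.put("http://example.org/page", "cached bytes")
assert dht.get("http://example.org/page") == "cached bytes"
```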
DHTs can act either as actual data stores or merely as directory services storing pointers. CFS [5] and PAST [27] take the former approach to build a distributed file system: They require true read/write consistency among operations, where writes should atomically replace previously-stored values, not modify them.

Using the network as a directory service, Tapestry [35] and Coral relax the consistency of operations in the network. To put a key, Tapestry routes along fast hops between peers, placing at each peer a pointer back to the sending node, until it reaches the node closest to the key. Nearby nodes routing to the same key are likely to follow similar paths and discover these cached pointers. Coral's flexible clustering provides similar latency-optimized lookup and data placement, and its algorithms prevent multiple stores from forming hot spots. SkipNet also builds a hierarchy of lookup groups, although it explicitly groups nodes by domain name to support organizational disconnect [9].
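The pointer-placement idea can also be sketched, with the caveat that the routing model below is deliberately artificial: the "route" is simply every node at least as close to the key as the starting node, visited farthest-first, whereas Tapestry and Coral take O(log n) prefix-routing hops. What the sketch does preserve is the payoff described above: a put leaves a pointer to the sending node at each hop, so a nearby requester can stop at the first pointer it meets instead of traveling all the way to the closest node.

```python
import hashlib

def nid(name: str) -> int:
    # 160-bit node and key IDs from SHA-1, as in the sketch above.
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

class ToyDirectory:
    """Directory-service sketch: put() walks toward the key and
    drops a pointer back to the sender at every hop; get() walks
    the same way and returns the first pointer list it finds,
    possibly well before reaching the node closest to the key."""

    def __init__(self, node_names):
        # node ID -> {key: [senders that published a pointer]}
        self.nodes = {nid(n): {} for n in node_names}

    def _path(self, start_id: int, key_id: int):
        # Toy "route": all nodes at least as close to the key as the
        # start, visited farthest-first; real systems hop through
        # only O(log n) of these nodes via prefix routing.
        d0 = start_id ^ key_id
        hops = [n for n in self.nodes if n ^ key_id <= d0]
        return sorted(hops, key=lambda n: n ^ key_id, reverse=True)

    def put(self, key: str, sender: str) -> None:
        for hop in self._path(nid(sender), nid(key)):
            self.nodes[hop].setdefault(key, []).append(sender)

    def get(self, key: str, requester: str):
        for hop in self._path(nid(requester), nid(key)):
            if key in self.nodes[hop]:
                return self.nodes[hop][key]  # stop at the first cached pointer
        return None

d = ToyDirectory(["n1.example", "n2.example", "n3.example", "n4.example"])
d.put("http://example.org/page", "n1.example")
print(d.get("http://example.org/page", "n2.example"))  # ['n1.example']
```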
7.2 Web caching and content distribution

Web caching systems fit within a large class of CDNs that handle high demand through diverse replication.

Prior to the recent interest in peer-to-peer systems, several projects proposed cooperative Web caching [2, 7, 8, 16]. These systems either multicast queries or require that caches know some or all other servers, which worsens their scalability, fault-tolerance, and susceptibility to hot spots. Although the cache hit rate of cooperative web caching increases only to a certain level, corresponding to a moderate population size [34], highly-scalable cooperative systems can still increase the total system throughput by reducing server-side load.

Several projects have considered peer-to-peer overlays for web caching, although all such systems only benefit participating clients and thus require widespread adoption to reduce server load. Stading et al. use a DHT to cache replicas [29], and PROOFS uses a randomized overlay to distribute popular content [30]. Both systems focus solely on mitigating flash crowds and suffer from high request latency. Squirrel proposes web caching on a traditional DHT, although only for organization-wide networks [10]. Squirrel reported poor load-balancing when the system stored pointers in the DHT. We attribute this to the DHT's inability to handle too many values for the same key—Squirrel only stored 4 pointers per object—while CoralCDN references many more proxies by storing different sets of pointers on different nodes. SCAN examined replication policies for data disseminated through a multicast tree from a DHT deployed at ISPs [3].

Akamai [1] and other commercial CDNs use DNS redirection to reroute client requests to local clusters of machines, having built detailed maps of the Internet through a combination of BGP feeds and their own measurements, such as traceroutes from numerous vantage points [28]. Then, upon reaching a cluster of collocated machines, hashing schemes [11, 32] map requests to specific machines to increase capacity. These systems require deploying large numbers of highly provisioned servers, and typically result in very good performance (both latency and throughput) for customers.
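One classic scheme of this kind is consistent hashing. The minimal sketch below (hypothetical machine names; it omits the virtual nodes and replication a production deployment would need) hashes machines and request URLs onto the same ring and sends each request to the first machine clockwise from its URL's hash, so adding or removing one machine remaps only the requests in one arc of the ring:

```python
import bisect
import hashlib

def ring_hash(s: str) -> int:
    # Hash machine names and URLs onto the same 32-bit ring.
    return int.from_bytes(hashlib.sha1(s.encode()).digest()[:4], "big")

class ConsistentHash:
    """Each machine owns the arc of the ring ending at its hash;
    a request goes to the first machine clockwise from its URL's
    hash, found by binary search over the sorted ring."""

    def __init__(self, machines):
        self.ring = sorted((ring_hash(m), m) for m in machines)

    def lookup(self, url: str) -> str:
        i = bisect.bisect(self.ring, (ring_hash(url), ""))
        return self.ring[i % len(self.ring)][1]  # wrap past the top

cluster = ConsistentHash(
    ["cache1.pop.example", "cache2.pop.example", "cache3.pop.example"])
print(cluster.lookup("http://example.org/logo.png"))
```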
Such centrally-managed CDNs appear to offer two benefits over CoralCDN. (1) CoralCDN's network measurements, via traceroute-like probing of DNS clients, are somewhat constrained in comparison. CoralCDN nodes do not have BGP feeds and are under tight latency constraints to avoid delaying DNS replies while probing. Additionally, Coral's design assumes that no single node even knows the identity of all other nodes in the system, let alone their precise network location. Yet, if many people adopt the system, it will build up a rich database of neighboring networks. (2) CoralCDN offers less aggregate storage capacity, as cache management is completely localized. But, it is designed for a much larger number of machines and vantage points: CoralCDN may provide better performance for small organizations hosting nodes, as it is not economically efficient for commercial CDNs to deploy machines behind most bottleneck links.

More recently, CoDeeN has provided users with a set of open web proxies [23]. Users can reconfigure their browsers to use a CoDeeN proxy and subsequently enjoy better performance. The system has been deployed, and anecdotal evidence suggests it is very successful at distributing content efficiently. Earlier simulation results show that certain policies should achieve high system throughput and low request latency [33]. (Specific details of the deployed system have not yet been published, including an Akamai-like service also in development.) Although CoDeeN gives participating users better performance to most web sites, CoralCDN's goal is to give most users better performance to participating web sites—namely those whose publishers have "Coralized" the URLs. The two design points pose somewhat dif-