Peer-to-Peer Networks
Distributed Algorithms for P2P
Distributed Hash Tables
P. Felber
Pascal.Felber@eurecom.fr
http://www.eurecom.fr/~felber/

Agenda
⚫ What are DHTs? Why are they useful?
⚫ What makes a “good” DHT design?
⚫ Case studies
  ⚫ Chord
  ⚫ Pastry (locality)
  ⚫ TOPLUS (topology-awareness)
⚫ What are the open problems?

What is P2P?
⚫ A distributed system architecture
  ⚫ No centralized control
  ⚫ Typically many nodes, but unreliable and heterogeneous
  ⚫ Nodes are symmetric in function
⚫ Takes advantage of distributed, shared resources (bandwidth, CPU, storage) on peer nodes
⚫ Fault-tolerant, self-organizing
⚫ Operates in a dynamic environment: frequent joins and leaves are the norm
[Figure: peer nodes connected through the Internet]

P2P Challenge: Locating Content
⚫ Simple strategy: expanding ring search until content is found
⚫ If r of N nodes have a copy, the expected search cost is at least N / r, i.e., O(N)
⚫ Need many copies to keep overhead small
[Figure: one node asks “Who has this paper?”; two nodes answer “I have it”]
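
The expanding ring search above can be sketched as a hop-limited flood that is retried with a growing time-to-live (TTL). This is only a minimal sketch, not code from the slides: the function name, the adjacency-dict representation of the overlay, and the max_ttl cutoff are assumptions made for illustration.

```python
from collections import deque


def expanding_ring_search(neighbors, start, holders, max_ttl):
    """Flood with TTL 1, 2, ... until a node in `holders` is reached.

    `neighbors` maps each node to the list of peers it knows (assumed
    representation). Returns (ttl, found, contacted): the radius at which
    the search stopped, the holders found within that radius, and the
    total number of nodes contacted over all rounds.
    """
    contacted = 0
    for ttl in range(1, max_ttl + 1):
        # Breadth-first flood from `start`, limited to `ttl` hops.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if dist[node] == ttl:
                continue  # TTL exhausted: do not forward any further
            for peer in neighbors[node]:
                if peer not in dist:
                    dist[peer] = dist[node] + 1
                    queue.append(peer)
        contacted += len(dist) - 1  # every reached node except the origin
        found = sorted(n for n in dist if n in holders and n != start)
        if found:
            return ttl, found, contacted
    return None, [], contacted


# Hypothetical overlay: 8 nodes on a ring, each knowing its two neighbors.
ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
print(expanding_ring_search(ring, start=0, holders={3}, max_ttl=4))
```

On this 8-node ring with a single copy at node 3, the search from node 0 succeeds at TTL 3 after contacting 2 + 4 + 6 = 12 nodes in total, because every round starts again from the origin. With few copies, most of the network ends up being contacted.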

Directed Searches
⚫ Idea
  ⚫ Assign particular nodes to hold particular content (or know where it is)
  ⚫ When a node wants this content, go to the node that is supposed to hold it (or know where it is)
⚫ Challenges
  ⚫ Avoid bottlenecks: distribute the responsibilities “evenly” among the existing nodes
  ⚫ Adaptation to nodes joining or leaving (or failing)
    ⚫ Give responsibilities to joining nodes
    ⚫ Redistribute responsibilities from leaving nodes

Idea: Hash Tables
⚫ A hash table associates data with keys
  ⚫ Key is hashed to find bucket in hash table
  ⚫ Each bucket is expected to hold #items/#buckets items
⚫ In a Distributed Hash Table (DHT), nodes are the hash buckets
  ⚫ Key is hashed to find responsible peer node
  ⚫ Data and load are balanced across nodes
[Figure: hash table. The key “Beatles” is passed through the hash function h(key) % N, giving position 2 among buckets 0 to N-1. Operations: lookup(key) → data, insert(key, data)]
[Figure: DHT. The same key, hash function h(key) % N, and operations, with positions 0 to N-1 now being peer nodes instead of buckets]
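
The analogy between buckets and nodes can be made concrete with a toy table in which bucket i stands for peer node i. This is a sketch under stated assumptions: the class name ModuloTable, the helper h, and the use of SHA-1 as the hash function are illustrative choices, not taken from the slides.

```python
import hashlib


def h(key: str) -> int:
    """Stable hash of a key (SHA-1 is an illustrative choice)."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


class ModuloTable:
    """Toy 'DHT' in which bucket i plays the role of peer node i."""

    def __init__(self, num_nodes: int):
        self.nodes = [dict() for _ in range(num_nodes)]

    def responsible_node(self, key: str) -> int:
        # Same rule as on the slide: position = h(key) % N
        return h(key) % len(self.nodes)

    def insert(self, key: str, data) -> None:
        self.nodes[self.responsible_node(key)][key] = data

    def lookup(self, key: str):
        return self.nodes[self.responsible_node(key)].get(key)


table = ModuloTable(num_nodes=8)
table.insert("Beatles", "Abbey Road")
print(table.responsible_node("Beatles"), table.lookup("Beatles"))

# Load balance: with a uniform hash, each node holds about #items/#nodes.
for i in range(8000):
    table.insert(f"item-{i}", i)
print([len(node) for node in table.nodes])
```

With a well-mixing hash function, each of the 8 nodes should end up holding roughly 1000 of the 8000 items, which is the "#items/#buckets" expectation from the slide.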

DHTs: Problems
⚫ Problem 1 (dynamicity): adding or removing nodes
  ⚫ With hash mod N, virtually every key will change its location!
    h(k) mod m ≠ h(k) mod (m+1) ≠ h(k) mod (m-1)
⚫ Solution: use consistent hashing
  ⚫ Define a fixed hash space
  ⚫ All hash values fall within that space and do not depend on the number of peers (hash buckets)
  ⚫ Each key goes to the peer closest to its ID in hash space (according to some proximity metric)
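
The difference between the two placement rules can be checked by counting how many keys change owner when an 11th node joins 10 existing ones. This is a sketch under assumptions: the node and key names are made up, SHA-1 defines the fixed hash space, and "closest peer" is taken to mean the first node ID at or after the key's hash, wrapping around. The slide leaves the proximity metric open, so this is only one possible choice.

```python
import bisect
import hashlib


def h(value: str) -> int:
    """Map a string into the fixed hash space [0, 2**160)."""
    return int.from_bytes(hashlib.sha1(value.encode("utf-8")).digest(), "big")


def mod_owner(key: str, num_nodes: int) -> int:
    """Placement by h(k) mod m: depends on the number of nodes."""
    return h(key) % num_nodes


def ring_owner(key: str, node_ids: list) -> int:
    """Consistent hashing: first node ID >= h(key), wrapping around.

    `node_ids` must be sorted. This successor rule is an assumed metric.
    """
    i = bisect.bisect_left(node_ids, h(key))
    return node_ids[i % len(node_ids)]


keys = [f"key-{i}" for i in range(10000)]

# Growing from 10 to 11 nodes under mod placement.
moved_mod = sum(mod_owner(k, 10) != mod_owner(k, 11) for k in keys)

# Same growth under consistent hashing: node IDs are hashes of node names.
new_node = h("node-10")
ring10 = sorted(h(f"node-{i}") for i in range(10))
ring11 = sorted(ring10 + [new_node])
moved = [k for k in keys if ring_owner(k, ring10) != ring_owner(k, ring11)]
moved_ring = len(moved)

print(f"mod placement: {moved_mod} of {len(keys)} keys moved")
print(f"consistent hashing: {moved_ring} of {len(keys)} keys moved")
```

Under mod placement a key keeps its position only if h(k) mod 110 is below 10, so about 10/11 of all keys are expected to move. Under consistent hashing, the only keys that move are those falling in the arc taken over by the new node, and they all move to that node.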

DHTs: Problems (cont’d)
⚫ Problem 2 (size): all nodes must be known to insert or look up data
  ⚫ Works only with small and static server populations
⚫ Solution: each peer knows of only a few “neighbors”
  ⚫ Messages are routed through neighbors via multiple hops (overlay routing)
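
Multi-hop overlay routing with a small neighbor set can be illustrated as follows. This sketch assumes one particular neighbor structure that the slide does not prescribe: nodes are numbered 0 to N-1 on a ring, and each node knows the peers at clockwise distance 1, 2, 4, 8, and so on. The function names are hypothetical.

```python
def neighbors(node: int, num_nodes: int) -> list:
    """Neighbor set: peers at clockwise distance 1, 2, 4, ... (assumed)."""
    result = []
    step = 1
    while step < num_nodes:
        result.append((node + step) % num_nodes)
        step *= 2
    return result


def route(src: int, dst: int, num_nodes: int) -> list:
    """Greedy routing: forward to the neighbor closest to dst.

    Only neighbors that do not overshoot the destination are considered.
    Returns the list of nodes visited, from src to dst inclusive.
    """
    path = [src]
    current = src
    while current != dst:
        remaining = (dst - current) % num_nodes
        candidates = [
            n for n in neighbors(current, num_nodes)
            if (n - current) % num_nodes <= remaining
        ]
        # The distance-1 neighbor always qualifies, so candidates is non-empty.
        current = min(candidates, key=lambda n: (dst - n) % num_nodes)
        path.append(current)
    return path


print(neighbors(0, 64))  # 6 neighbors out of 64 nodes
print(route(0, 63, 64))  # path taken by a message from node 0 to node 63
```

With 64 nodes, each node keeps 6 neighbors instead of 63, and in this particular layout no message needs more than 6 hops. The trade-off between neighbor count and path length is what the routing approaches of the different DHTs tune.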

What Makes a Good DHT Design
⚫ For each object, the node(s) responsible for that object should be reachable via a “short” path (small diameter)
  ⚫ The different DHTs differ fundamentally only in the routing approach
⚫ The number of neighbors for each node should remain “reasonable” (small degree)
⚫ DHT routing mechanisms should be decentralized (no single point of failure or bottleneck)
⚫ Should gracefully handle nodes joining and leaving
  ⚫ Repartition the affected keys over existing nodes
  ⚫ Reorganize the neighbor sets
  ⚫ Bootstrap mechanisms to connect new nodes into the DHT
⚫ To achieve good performance, DHT must provide low stretch
  ⚫ Minimize ratio of DHT routing vs. unicast latency
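
Stretch, as defined in the last bullet, is a simple ratio. The sketch below computes it for one lookup; the function name and the latency figures are hypothetical and serve only to illustrate the definition.

```python
def stretch(overlay_hop_latencies_ms, unicast_latency_ms):
    """Ratio of DHT routing latency to direct (unicast) latency.

    The DHT routing latency is taken to be the sum of the latencies of
    the overlay hops a message traverses. A stretch of 1.0 would mean
    the overlay path is as fast as sending directly.
    """
    if unicast_latency_ms <= 0:
        raise ValueError("unicast latency must be positive")
    return sum(overlay_hop_latencies_ms) / unicast_latency_ms


# Hypothetical lookup: three overlay hops of 20, 35 and 15 ms, while a
# direct message between the same two endpoints would take 28 ms.
print(stretch([20, 35, 15], 28))
```

In this made-up example the overlay path takes 70 ms against 28 ms for unicast, a stretch of 2.5. Choosing neighbors that are close in the underlying network is one way to lower this ratio.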

DHT Interface
⚫ Minimal interface (data-centric)
    Lookup(key) → IP address
⚫ Supports a wide range of applications, because few restrictions
  ⚫ Keys have no semantic meaning
  ⚫ Value is application dependent
⚫ DHTs do not store the data
  ⚫ Data storage can be built on top of DHTs
    Lookup(key) → data
    Insert(key, data)
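
The layering described above, a storage service built on a lookup-only DHT, might look like the following sketch. Everything in it is an assumption for illustration: the class names, the in-process dictionaries standing in for remote nodes, the addresses (taken from the 192.0.2.0/24 documentation range), and the modulo placement used as a stand-in for a real DHT routing layer.

```python
import hashlib


class LookupService:
    """Stand-in for the DHT: maps a key to the responsible node's IP."""

    def __init__(self, node_addresses):
        self.node_addresses = sorted(node_addresses)

    def lookup(self, key: str) -> str:
        # Placeholder placement rule; a real DHT would route this request.
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        index = int.from_bytes(digest, "big") % len(self.node_addresses)
        return self.node_addresses[index]


class Storage:
    """Data storage layered on top of Lookup(key) -> IP address."""

    def __init__(self, dht: LookupService):
        self.dht = dht
        # One local dict per address simulates each remote node's store.
        self.stores = {addr: {} for addr in dht.node_addresses}

    def insert(self, key: str, data) -> None:
        address = self.dht.lookup(key)    # 1. ask the DHT who is responsible
        self.stores[address][key] = data  # 2. hand the data to that node

    def lookup(self, key: str):
        address = self.dht.lookup(key)
        return self.stores[address].get(key)


dht = LookupService(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
storage = Storage(dht)
storage.insert("Beatles", b"some application-defined value")
print(dht.lookup("Beatles"), storage.lookup("Beatles"))
```

The DHT layer never sees the value, only the key. That separation is why the value can be application dependent and why the same lookup interface can serve many different applications.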