1.1.1 The Resource Discovery Problem _中国高校课件下载中心

正在加载图片...

1.1.1 The Resource Discovery Problem eral links,analogous to the addition of edges in a knowledge Harchol-Balter,Leighton,and Lewin 10 introduced the graph.Assuming n nodes in an initial knowledge graph that Resource Discovery Problem to model the situation in which is connected and undirected,they give a deterministic algo all the processes in an initial weakly connected knowledge rithm for leader election in O(n)messages and time O(n). graph learn the identities of all the other processes,prelim- Upon termination,the corresponding knowledge graph con- inary to cooperating in a distributed computation 10.In tains an edge from the leader to every process in the network. terms of the knowledge graph,the goal is to construct the and a path of length at most O(log n)from each non-leader complete graph from an arbitrary weakly connected graph to the leader,solving the Ad-hoc Resource Discovery Prob- Subsequently,the problem definition was relaxed to require lem as defined by Abraham and Dolev [2].The time bound that one process become the leader and learn the identities reflects the fact that there may be long chains of merges that of all the other processes,which each learn the identity of happen sequentially. the leader [14].This implies that the final knowledge graph contains a star(with bi-directional edges)on all the vertices 1.1.3 Parallel Sorting The problem has been considered in both synchronous and Algorithms for sorting on parallel machines have been asynchronous models of computation.In the synchronous studied extensively.The closest such algorithm to our work model,in each round each node may contact one or more of is that of Goodrich and Kosaraju 9 for a parallel pointer its neighbors in the current knowledge graph and exchange machine,which also proceeds by building a binary tree over some subset of its neighbor list.Harchol-Balter.Leighton nodes and then merging components according to the tree. and Lewin give a simple randomized algorithm to solve the Their algorithm makes use of a clever parallel linked-list complete-graph version of the problem with high probability merge operation that allows consecutive merging phases to in O(log n)rounds,O(nlogn)messages,and O(n log"n) be pipelined,giving an O(log n)total time.We believe that bit complexity.Law and Siu give another randomized al- essentially the same merging algorithm could be adapted to gorithm that improves each of these bounds by a factor of our model.although improvements in other parts of our al- log n 16 gorithm would be necessary for this change to improve our Kutten,Peleg,and Vishkin give a deterministic algorithm overall running time. to solve the star-graph version of the problem in O(log n) 1.1.4 Tree structures in previous work rounds,O(nlog n)messages,and O(Eo log-n)bit com- plexity,where Eo is the set of edges of the initial knowledge A distributed trie is used as a search structure for P2P sys- graph [14].A single additional round in which the leader tems in several previous papers.Karl Aberer [1]designed sends the whole list to all processes achieves a complete a scalable access structure named P-Grid based on a multi- graph,and adds O(n)to message complexity and O(n2log n) level trie with each node representing one specific interval to bit complexity. of the key space.And at each level every peer is associ- In the asynchronous model of computation,all messages ated with exactly one tree node and maintains references to sent will eventually arrive after a finite but unbounded time, cover the other side of the search tree so that a search can be and messages from u to v are delivered to v in the order in started at any peer.The convergence of P-Grid construc- which u sent them.Kutten and Peleg give a determinis- tion strongly depends on the assumption that peers meet tic algorithm to solve the star-graph version of the prob- frequently and randomly pairwise which is not so achievable lem in O(nlog n)messages and O(Eol log2n)bit complex- in application.Although the paper provides some simula- ity 13.Abraham and Dolev also consider asynchronous tion results,it doesn't give further proof.Michael J.Freed- computation,and generalize the problem to finding a leader man and Radek Vingralek [7]presented a similar approach in each weakly connected component of the initial knowl- while taking advantage of information piggybacking during edge graph 2.They show that knowledge of n,the size lookups to achieve dynamic partitioning and lazy updates. of a node's component,affects the achievable bounds.In However,the performance of the algorithm depends on the particular,when n is unknown,they give a lower bound of lookup pattern and the paper also lacks proof.Others have (n logn)on message complexity,and a deterministic algo- proposed using B-tree variants to index small numbers of rithm with message complexity O(n logn)and bit complex- nodes (hundreds)in distributed databases12.21. ity O(Eol logn nlog2n).When n is known,they give a deterministic algorithm with O(na(n))message complexity 2. MODEL and O(Eollogn +nlog'n)bit complexity,where a(n)is We assume that in the initial state each process knows the inverse Ackermann function.They also define the Ad- the identifiers of some number of other processes.This in- hoc Resource Discovery Problem,in which each non-leader formation forms a knowledge graph,a directed graph with must only have an identified path to its leader,rather than one vertex per process and an edge from u to v whenever a direct edge.For this problem,they show a lower bound u knows about v.The knowledge graph may evolve over on message complexity of (na(n)),and an algorithm that time as processes tell each other about other processes;if achieves message complexity O(na(n))and bit complexity u knows about both v and w,it may send a message to v O(Eol logn +n log2n).This algorithm also deals efficiently containing w's identifier,and when v receives this message with dynamic additions of nodes and links to the network. the edge vw is added to the graph.We assume throughout It is worth mentioning that our trees solve the Ad-hoc Re- that a process u can only send a message to another process source Discovery Problem,which means the Abraham and v if uv is present in the graph when the message is sent.We Dolev (na(n))lower bound for messages applies. assume that the initial knowledge graph Go is weakly con- nected and has maximum degree d,where the degree d(u) 1.1.2 Leader Election of a vertex u is defined to be the sum of its indegree d(u) Cidon.Gopal and Kutten 4 introduce a detailed and and its outdegree d(u),which are the number of incoming technologically motivated model in which processes in a net- and outgoing edges for u,respectively. work may use newly learned routes to send messages at Processes communicate by passing messages along edges cost O(1),although the physical route may consist of sev- of the knowledge graph.Formally,we assume that messages1.1.1 The Resource Discovery Problem Harchol-Balter, Leighton, and Lewin [10] introduced the Resource Discovery Problem to model the situation in which all the processes in an initial weakly connected knowledge graph learn the identities of all the other processes, preliminary to cooperating in a distributed computation [10]. In terms of the knowledge graph, the goal is to construct the complete graph from an arbitrary weakly connected graph. Subsequently, the problem definition was relaxed to require that one process become the leader and learn the identities of all the other processes, which each learn the identity of the leader [14]. This implies that the final knowledge graph contains a star (with bi-directional edges) on all the vertices. The problem has been considered in both synchronous and asynchronous models of computation. In the synchronous model, in each round each node may contact one or more of its neighbors in the current knowledge graph and exchange some subset of its neighbor list. Harchol-Balter, Leighton and Lewin give a simple randomized algorithm to solve the complete-graph version of the problem with high probability in O(log2 n) rounds, O(n log2 n) messages, and O(n 2 log3 n) bit complexity. Law and Siu give another randomized algorithm that improves each of these bounds by a factor of log n [16]. Kutten, Peleg, and Vishkin give a deterministic algorithm to solve the star-graph version of the problem in O(log n) rounds, O(n log n) messages, and O(|E0| log2 n) bit complexity, where E0 is the set of edges of the initial knowledge graph [14]. A single additional round in which the leader sends the whole list to all processes achieves a complete graph, and adds O(n) to message complexity and O(n 2 log n) to bit complexity. In the asynchronous model of computation, all messages sent will eventually arrive after a finite but unbounded time, and messages from u to v are delivered to v in the order in which u sent them. Kutten and Peleg give a deterministic algorithm to solve the star-graph version of the problem in O(n log n) messages and O(|E0| log2 n) bit complexity [13]. Abraham and Dolev also consider asynchronous computation, and generalize the problem to finding a leader in each weakly connected component of the initial knowledge graph [2]. They show that knowledge of n, the size of a node’s component, affects the achievable bounds. In particular, when n is unknown, they give a lower bound of Ω(n log n) on message complexity, and a deterministic algorithm with message complexity O(n log n) and bit complexity O(|E0| log n + n log2 n). When n is known, they give a deterministic algorithm with O(nα(n)) message complexity and O(|E0| log n + n log2 n) bit complexity, where α(n) is the inverse Ackermann function. They also define the Adhoc Resource Discovery Problem, in which each non-leader must only have an identified path to its leader, rather than a direct edge. For this problem, they show a lower bound on message complexity of Ω(nα(n)), and an algorithm that achieves message complexity O(nα(n)) and bit complexity O(|E0| log n+n log2 n). This algorithm also deals efficiently with dynamic additions of nodes and links to the network. It is worth mentioning that our trees solve the Ad-hoc Resource Discovery Problem, which means the Abraham and Dolev Ω(nα(n)) lower bound for messages applies. 1.1.2 Leader Election Cidon, Gopal and Kutten [4] introduce a detailed and technologically motivated model in which processes in a network may use newly learned routes to send messages at cost O(1), although the physical route may consist of several links, analogous to the addition of edges in a knowledge graph. Assuming n nodes in an initial knowledge graph that is connected and undirected, they give a deterministic algorithm for leader election in O(n) messages and time O(n). Upon termination, the corresponding knowledge graph contains an edge from the leader to every process in the network, and a path of length at most O(log n) from each non-leader to the leader, solving the Ad-hoc Resource Discovery Problem as defined by Abraham and Dolev [2]. The time bound reflects the fact that there may be long chains of merges that happen sequentially. 1.1.3 Parallel Sorting Algorithms for sorting on parallel machines have been studied extensively. The closest such algorithm to our work is that of Goodrich and Kosaraju [9] for a parallel pointer machine, which also proceeds by building a binary tree over nodes and then merging components according to the tree. Their algorithm makes use of a clever parallel linked-list merge operation that allows consecutive merging phases to be pipelined, giving an O(log n) total time. We believe that essentially the same merging algorithm could be adapted to our model, although improvements in other parts of our algorithm would be necessary for this change to improve our overall running time. 1.1.4 Tree structures in previous work A distributed trie is used as a search structure for P2P systems in several previous papers. Karl Aberer [1] designed a scalable access structure named P-Grid based on a multilevel trie with each node representing one specific interval of the key space. And at each level every peer is associated with exactly one tree node and maintains references to cover the other side of the search tree so that a search can be started at any peer. The convergence of P-Grid construction strongly depends on the assumption that peers meet frequently and randomly pairwise which is not so achievable in application. Although the paper provides some simulation results, it doesn’t give further proof. Michael J. Freedman and Radek Vingralek [7] presented a similar approach while taking advantage of information piggybacking during lookups to achieve dynamic partitioning and lazy updates. However, the performance of the algorithm depends on the lookup pattern and the paper also lacks proof. Others have proposed using B-tree variants to index small numbers of nodes (hundreds) in distributed databases [12, 21]. 2. MODEL We assume that in the initial state each process knows the identifiers of some number of other processes. This information forms a knowledge graph, a directed graph with one vertex per process and an edge from u to v whenever u knows about v. The knowledge graph may evolve over time as processes tell each other about other processes; if u knows about both v and w, it may send a message to v containing w’s identifier, and when v receives this message the edge vw is added to the graph. We assume throughout that a process u can only send a message to another process v if uv is present in the graph when the message is sent. We assume that the initial knowledge graph G0 is weakly connected and has maximum degree d, where the degree d(u) of a vertex u is defined to be the sum of its indegree d −(u) and its outdegree d +(u), which are the number of incoming and outgoing edges for u, respectively. Processes communicate by passing messages along edges of the knowledge graph. Formally, we assume that messages

<<向上翻页向下翻页>>