A Proofs A.1 The Proof of Theorem 1 P_中国高校课件下载中心

点击下载：《人工智能、机器学习与大数据》课程教学资源（参考文献）Distributed Power-law Graph Computing - Theoretical and Empirical Analysis

正在加载图片...

A Proofs A.1 The Proof of Theorem 1 Proof.Let the indicator H;denote the event that vertex v;has at least one of hi edges in the jth machine. Then the expectation E[H;]is EH;]=1-Pr(none of the h edges is on the machine j) =1-- For some vertex vi.hi adjacent edges are hashed by the neighbours of vi and (di-hi)adjacent edges are hashed by vi itself to the same machine.So for the(di-hi)edges,the number of replications of vi is simply 1 due to the assumption hi<di-1. As for the reuh:ofheadjacds,we have[H吲l=1-（L-）.Here历，involves theother p-1 machines except for the one that already has a replication. Putting the two parts together,we have 4加=1+=1+-)-(-》] =p--》+门 Thus,the expected replication factor is: 2叭=(--】 =-(-)鬥] 0 Note that we assume h<di-1.which means the hash function hashes at least one of the adjacent edges ofv by vi itself.Such assumption is to guarantee that there will be no"single"master vertex as which no adjacent edges are in the same partition. By Lemma 1,we obtain --)门s--] which implies that the hash-based vertex-cut via degree-approach is at least as good as the randomized vertex- cut in the replication factor. A.2 The Proof of Theorem 2 Proof.Since we assume that the vertices are evenly hashed to all the machines,the hi adjacent edges of vi are also evenly assigned to all the machines.Subsequently,each machine hasedges.Thus,we sum up all the vertices,obtaining For the rest d;-hi adjacent edges of vi,they are assigned to the same machine.So this part of edges incurs imbalance. In the above procedure each edge is actually assigned twice.Thus,the final result is max{e∈E|M(e)=mH 含+微品低- EV/p 2E/p ◇ 10A Proofs A.1 The Proof of Theorem 1 Proof. Let the indicator Hj denote the event that vertex vi has at least one of hi edges in the jth machine. Then the expectation E[Hj ] is E[Hj ] = 1 − Pr(none of the hi edges is on the machine j) = 1 − 1 − 1 p hi . For some vertex vi, hi adjacent edges are hashed by the neighbours of vi and (di − hi) adjacent edges are hashed by vi itself to the same machine. So for the (di − hi) edges, the number of replications of vi is simply 1 due to the assumption hi ≤ di − 1. As for the residual hi of the adjacent edges, we have E[Hj ] = 1 − 1 − 1 p hi . Here Hj involves the other p − 1 machines except for the one that already has a replication. Putting the two parts together, we have E [|A(vi)|] = 1 +Xp−1 j=1 E[Hj ] = 1 + (p − 1) 1 − 1 − 1 p hi = p 1 − 1 − 1 p hi+1 . Thus, the expected replication factor is: E " 1 n Xn i=1 |A(v)| # = 1 n Xn i=1 p 1 − 1 − 1 p hi+1 = p n Xn i=1 1 − 1 − 1 p hi+1 . Note that we assume hi ≤ di −1, which means the hash function hashes at least one of the adjacent edges of vi by vi itself. Such assumption is to guarantee that there will be no “single” master vertex as which no adjacent edges are in the same partition. By Lemma 1, we obtain p n Xn i=1 1 − 1 − 1 p hi+1 ≤ p n Xn i=1 1 − 1 − 1 p di , which implies that the hash-based vertex-cut via degree-approach is at least as good as the randomized vertexcut in the replication factor. A.2 The Proof of Theorem 2 Proof. Since we assume that the vertices are evenly hashed to all the machines, the hi adjacent edges of vi are also evenly assigned to all the machines. Subsequently, each machine has hi p edges. Thus, we sum up all the vertices, obtaining Pn i=1 hi p . For the rest di − hi adjacent edges of vi, they are assigned to the same machine. So this part of edges incurs imbalance. In the above procedure each edge is actually assigned twice. Thus, the final result is max m |{e ∈ E | M(e) = m}| |E|/p = Pn i=1 hi p + max j∈[p] P vi∈Pj (di − hi) 2|E|/p . 10

<<向上翻页向下翻页>>

点击下载：《人工智能、机器学习与大数据》课程教学资源（参考文献）Distributed Power-law Graph Computing - Theoretical and Empirical Analysis