chapter 8 THE DISJOINT SET ADT 81 Equivalence relations Definition) Arelation R is defined on a set S if for every pair of elements(a, b),a, b ES, aR b is either true or false. If a b is true, then we say that a is related to b Definition A relation, over a set, S, is said to be an equivalence relation over S iff it is symmetric, reflexive, and transitive over S (Definition Two members x and y of a set S are said to be in the same equivalence class iffx-y
CHAPTER 8 THE DISJOINT SET ADT §1 Equivalence Relations 【Definition】A relation R is defined on a set S if for every pair of elements (a, b), a, b S, a R b is either true or false. If a R b is true, then we say that a is related to b. 【Definition】A relation, ~, over a set, S, is said to be an equivalence relation over S iff it is symmetric, reflexive, and transitive over S. 【Definition】Two members x and y of a set S are said to be in the same equivalence class iff x ~ y
82 The Dynamic Equivalence Problem Given an equivalence relation decide for any a and b if a-b Example〗 Given s={1,2,3,4,5,6,7,8,9,10,1,12}and9 relations:12=4,3=1,6≡10,8=9,7=4,6=8,3=5,2=11,1l=12 The equivalence classes are ( 2, 4, 7,11, 123,(,, 5,(6,8,9,) Algorithm:(Union /Find) [ /step 1: read the relations in*/ Initialize n disjoint sets; while( read in a -b)i if(! (Find (a== Find(b)) Union the two sets: 3/*end-while*/ / step 2: decide if a-b*/ while( read in a and b) if Find(a= Find(b)output( true ) else output( false )
§2 The Dynamic Equivalence Problem Given an equivalence relation ~, decide for any a and b if a ~ b. 〖Example〗 Given S = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 } and 9 relations: 124, 31, 610, 89, 74, 68, 35, 211, 1112. The equivalence classes are { 2, 4, 7, 11, 12 }, { 1, 3, 5 }, { 6, 8, 9, 10 } Algorithm: { /* step 1: read the relations in */ Initialize N disjoint sets; while ( read in a ~ b ) { if ( ! (Find(a) == Find(b)) ) Union the two sets; } /* end-while */ /* step 2: decide if a ~ b */ while ( read in a and b ) if ( Find(a) == Find(b) ) output( true ); else output( false ); } (Union / Find)
Elements of the sets:1.2.3.N e Sets:S1,S2,……andS;^S=(i≠j)— disjoint Example S1={6,7,8,10},S2={1,4,9},S3={2,3,5} Note: Pointers are from children 8) to parents A possible forest representation of these sets e Operations (1) Union(i)∷= Replace s;andS;bys=S∪S (2)Find (i): : Find the set Sk which contains the element i
Elements of the sets: 1, 2, 3, ..., N Sets : S1 , S2 , ... ... and Si Sj = ( if i j ) —— disjoint 〖Example〗 S1 = { 6, 7, 8, 10 }, S2 = { 1, 4, 9 }, S3 = { 2, 3, 5 } 10 6 7 8 4 1 9 2 3 5 A possible forest representation of these sets Note: Pointers are from children to parents Operations : (1) Union( i, j ) ::= Replace Si and Sj by S = Si Sj (2) Find( i ) ::= Find the set Sk which contains the element i
s3 Basic Data Structure ☆ Union(ij) Idea: Make S a subtree of s, or vice versa. that is, we can set the parent pointer of one of the roots to the other root. S1∪S Im plementation 1 S,∪S name[l S2 S3 ①⑨ → 2
§3 Basic Data Structure ❖ Union ( i, j ) Idea: Make Si a subtree of Sj , or vice versa. That is, we can set the parent pointer of one of the roots to the other root. 10 6 8 7 4 1 9 4 1 9 10 S 6 7 8 1 S2 S2 S1 Implementation 1: S1 S2 S3 • • • name[ ] 10 6 7 8 4 1 9 2 3 5 S2 S1 S2
Im plementation 2: s[element]=the elements parent. Note: S[ root]=0 and set name= root index. I Example The array representation of the three sets is Heave use中2图4ospo Hence they can be ual a of an ard void SetUnion( DisjSet S, (S1∪S SetType Rt1 SetType Rt2) I s[Rt2 ]=Rt1; y ☆Find(i) Im plementation 1: Implementation 2: SetType Find( elementType X, name啊ks DisjSet S) I for(; S[X]>0; X=SX]); return X; find(i)='s
Here we use the fact that the elements are numbered from 1 to N. Hence they can be used as indices of an array. Implementation 2: S [ element ] = the element’s parent. Note: S [ root ] = 0 and set name = root index. 10 6 7 8 4 1 9 2 3 5 〖Example〗The array representation of the three sets is ( S1 S2 S1 ) S [ 4 ] = 10 ❖ Find ( i ) Implementation 1: name[k] S • j i ... find ( i ) = ‘S’ Implementation 2: SetType Find ( ElementType X, DisjSet S ) { for ( ; S[X] > 0; X = S[X] ) ; return X ; } void SetUnion ( DisjSet S, SetType Rt1, SetType Rt2 ) { S [ Rt2 ] = Rt1 ; } [1] 4 [2] 0 [10] 0 [9] 4 [8] 10 [7] 10 [6] 10 [5] 2 [4] 0 [3] 2 S 10
令 Analysis Uvnneaking, union an/ Pmac Sure. Try this one T=⊙(N2)! union (2, 1), find (1); union (3, 2), find (1; That’snot good. union(N,N-1), find(1). e cl 3 5·)J,4 Algorithm usin INitialize for (k= 1 if Find SetUn
❖ Analysis Practically speaking, union and find are always paired. Thus we consider the performance of a sequence of unionfind operations. 〖Example〗 Given S = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 } and 9 relations: 124, 31, 610, 89, 74, 68, 35, 211, 1112. We have 3 equivalence classes { 2, 4, 7, 11, 12 }, { 1, 3, 5 }, and { 6, 8, 9, 10 } Algorithm using union-find operations { Initialize Si = { i } for i = 1, ..., 12 ; for ( k = 1; k <= 9; k++ ) { /* for each pair i j */ if ( Find( i ) != Find( j ) ) SetUnion( Find( i ), Find( j ) ); } } Can you think of a worst case example? Sure. Try this one: union(2, 1), find(1); union(3, 2), find(1); ..., ... ; union(N, N −1), find(1). N N−1 1 T = ( N2 ) ! That’s not good
84 Smart Union Algorithms 4 Union-by-Size - Always change the smaller tree s[ Root]=- size;/*initialized to be-1*/ (Lemma] Let T be a tree created by union-by-size with N nodes. then height)t回(N+1 for the worst case Proof: By induction.(Eaca ergava. have its set name changed at most times Time complexity of N (on aroM Find operations is now O(N+Mlog2 n). B Union-by-Height - Always change the shallow tree
§4 Smart Union Algorithms ❖ Union-by-Size -- Always change the smaller tree S [ Root ] = – size; /* initialized to be –1 */ Now T = O( N ) for the worst case example I gave. 1 2 N 【Lemma】Let T be a tree created by union-by-size with N nodes, then height(T) log2 N +1 Proof: By induction. (Each element can have its set name changed at most log2 N times.) Time complexity of N Union and M Find operations is now O( N + M log2 N ). ❖ Union-by-Height -- Always change the shallow tree
B Code of Union-by-height (rank) /Assume Root and Root are Roots Union is a C ketword, so this routine is named setUnion void SetUnion (DisjSet s, SetType Root1, SetType root2) if(s[Root2≤s[Root1) /Root2 is deeperset s[Root1]=Root2; //make Root2 new root else if(s[Root1]=s[Root2] /same height s[Root1]--; / so update s[Root2Root1;
/*Assume Root1 and Root2 are Roots Union is a C ketword, so this routine is named SetUnion */ void SetUnion(DisjSet s,SetType Root1, SetType Root2) { if(s[Root2]<s[Root1]) //Root2 is deeper set s[Root1]=Root2; //make Root2 new root else { if(s[Root1]==s[Root2]) //same height s[Root1]--; //so update s[Root2]=Root1; } } ❖ Code of Union-by-height (rank)
§5 Path Compression SetType Find elementType X, DisjSet s) if( s[x]0; root= SL find operations / find the root*/ for( trail =X; trail root; trail lead)t lead= S[ trail]; sI trail]= root Note: Not compatible with union- by- 3/ collapsing * return root height since it changes the heights. Just take“ height”as an estimatedrank
§5 Path Compression SetType Find ( ElementType X, DisjSet S ) { if ( S[ X ] 0; root = S[ root ] ) ; /* find the root */ for ( trail = X; trail != root; trail = lead ) { lead = S[ trail ] ; S[ trail ] = root ; } /* collapsing */ return root ; } Note: Not compatible with union-byheight since it changes the heights. Just take “height” as an estimated rank. Slower for a single find, but faster for a sequence of find operations
86 Worst Case for Union-by-Rank and Path Compression (Lemma(Tarjan)) Lett(M, N)be the maximum time required to process an intermixed sequence of finds and N-l unions. Then: k1Ma(M,N)≤T(M,N)≤k2Ma(M,N) for some positive constants K, and k2 G Ackermann's Function and a (M,N) log"2656=5 sInce 2 i=1andj≥1 logloglogloglog (26556)=1 636 A(i,j)={A(i-1,2) i≥2andj=1 A(i-1,A(i,j-1)i≥2andj≥2 http://mathworld.wolfram.com/ackermannfunctIon.html a(M,N)=mini≥1A(LM/N」)>logN}≤O(ogN)≤4 log N(inverse Ackermann function) of times the logarithm is applied to n until the result s 1
§6 Worst Case for Union-by-Rank and Path Compression 【Lemma (Tarjan)】Let T( M, N ) be the maximum time required to process an intermixed sequence of M N finds and N − 1 unions. Then: k1M ( M, N ) T( M, N ) k2M ( M, N ) for some positive constants k1 and k2 . − − − = = = ( 1, ( , 1)) 2 and 2 ( 1, 2) 2 and 1 2 1 and 1 ( , ) A i A i j i j A i i j i j A i j j (M, N) = min{ i 1| A(i, M N) log N } Ackermann’s Function and ( M, N ) O( log* N ) 4 http://mathworld.wolfram.com/AckermannFunction.html log* N (inverse Ackermann function) = # of times the logarithm is applied to N until the result 1. A(2,4) = ? 2 65536 2 2 2 2 2 = log* 2 65536 = 5 since logloglogloglog ( 265536 )= 1