Hypercube Fa2010 Parallel Processing, Low-Diameter Architectures Slide 1
Fall 2010 Parallel Processing, Low-Diameter Architectures Slide 1 Hypercube
LOW-Diameter Architectures Study the hypercube and related interconnection schemes Prime example of low-diameter(logarithmic) networks Theoretical properties, realizability, and scalability Complete our view of the sea of interconnection nets Fa2010 Parallel Processing, Low-Diameter Architectures Slide 3
Fall 2010 Parallel Processing, Low-Diameter Architectures Slide 3 Low-Diameter Architectures Study the hypercube and related interconnection schemes: • Prime example of low-diameter (logarithmic) networks • Theoretical properties, realizability, and scalability • Complete our view of the “sea of interconnection nets
Hypercubes and Their algorithms Study the hypercube and its topologicallalgorithmic properties Develop simple hypercube algorithms(more in Ch. 14) Learn about embeddings and their usefulness Topics in This Lecture 13.1 Definition and Main Properties 13.2 Embeddings and Their Usefulness 13.3 Embedding of Arrays and Trees 14.5 Broadcasting in Hypercube 14.6 Adaptive routing in Hypercube 15 Inverting a Lower-Triangular Matrix Fa2010 Parallel Processing, Low-Diameter Architectures Slide 4
Fall 2010 Parallel Processing, Low-Diameter Architectures Slide 4 Hypercubes and Their Algorithms Study the hypercube and its topological/algorithmic properties: • Develop simple hypercube algorithms (more in Ch. 14) • Learn about embeddings and their usefulness Topics in This Lecture 13.1 Definition and Main Properties 13.2 Embeddings and Their Usefulness 13.3 Embedding of Arrays and Trees 14.5 Broadcasting in Hypercube 14.6 Adaptive routing in Hypercube 15 Inverting a Lower-Triangular Matrix
13.1 Definition and main properties P P P Intermediate architectures logarithmic or sublogarithmic diameter P Begin studying networks that are intermediate between diameter -1 complete network and diameter-p12 mesh Sublogarithmic diameter Superlogarithmic diameter 2 log n/loglog n log n n/2 n Complete PDN Star Binary tree Torus Ring Linear network pancake hypercube arr ray Fa2010 Parallel Processing, Low-Diameter Architectures Slide 5
Fall 2010 Parallel Processing, Low-Diameter Architectures Slide 5 13.1 Definition and Main Properties P 1 P 2 P 3 P 4 P 5 P 6 P 7 P 8 P 0 P P P P P P P P P 0 1 2 3 4 5 6 7 8 Intermediate architectures: logarithmic or sublogarithmic diameter Begin studying networks that are intermediate between diameter-1 complete network and diameter-p 1/2 mesh Complete network 1 2 log n / log log n log n n n/2 n − 1 Sublogarithmic diameter Superlogarithmic diameter PDN Star, pancake Binary tree, hypercube Torus Ring Linear array
Hypercube and Its History Binary tree has logarithmic diameter, but small bisection Hypercube has a much larger bisection Hypercube is a mesh with the maximum possible number of dimensions 2×2×2 2 9= log p We saw that increasing the number of dimensions made it harder to design and visualize algorithms for the mesh Oddly, at the extreme of log2 p dimensions things become simple again Brief history of the hypercube(binary g-cube)architecture Concept developed: early 1960s [Squi63 Direct(single-stage)and indirect(multistage)versions: mid 1970s Initial proposals[Peas771, [Sull77]included no hardware Caltech's 64-node Cosmic Cube: early 1980s [ Seit85 Introduced an elegant solution to routing(wormhole switching) Several commercial machines mid to late 1980s Intel PSc (personal supercomputer), CM-2, nCUBE(Section 22. 3) Fa2010 Parallel Processing, Low-Diameter Architectures Slide 6
Fall 2010 Parallel Processing, Low-Diameter Architectures Slide 6 Hypercube and Its History Binary tree has logarithmic diameter, but small bisection Hypercube has a much larger bisection Hypercube is a mesh with the maximum possible number of dimensions 2 2 2 . . . 2 ⎯ q = log2 p ⎯→ We saw that increasing the number of dimensions made it harder to design and visualize algorithms for the mesh Oddly, at the extreme of log2 p dimensions, things become simple again! Brief history of the hypercube (binary q-cube) architecture Concept developed: early 1960s [Squi63] Direct (single-stage) and indirect (multistage) versions: mid 1970s Initial proposals [Peas77], [Sull77] included no hardware Caltech’s 64-node Cosmic Cube: early 1980s [Seit85] Introduced an elegant solution to routing (wormhole switching) Several commercial machines: mid to late 1980s Intel PSC (personal supercomputer), CM-2, nCUBE (Section 22.3)
Basic Definitions 0 Hypercube is generic term;(a)Binary 1-cube (b)Binary 2-cube 3-cube, 4-cube,., g-cube built of two built of tw binary o-cubes binary 1-cubes, labeled o and 1 labeled 0 and 1 In specifIc cases Fig.13.1 The recursive structure of binary hypercubes (c)Binary 3-cube, built of two binary 2-cubes, labeled 0 and 1 Parameters: 2q 0 B=0/2=291 D=g=log2p d=g= log2p (d) Binary 4-cube, built of two binary 3-cubes, labeled 0 and 1 Fa2010 Parallel Processing, Low-Diameter Architectures Slide 7
Fall 2010 Parallel Processing, Low-Diameter Architectures Slide 7 Basic Definitions Hypercube is generic term; 3-cube, 4-cube, . . . , q-cube in specific cases 0 1 00 01 10 11 (a) Binary 1-cube, built of two binary 0-cubes, labeled 0 and 1 (b) Binary 2-cube, built of two binary 1-cubes, labeled 0 and 1 0 1 (c) Binary 3-cube, built of two binary 2-cubes, labeled 0 and 1 0 000 001 010 011 100 101 110 111 1 000 001 010 011 100 101 110 111 (d) Binary 4-cube, built of two binary 3-cubes, labeled 0 and 1 0 1 0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111 Fig. 13.1 The recursive structure of binary hypercubes. Parameters: p = 2q B = p/2 = 2q–1 D = q = log2p d = q = log2p
The 64-node Hypercube Only sample wraparound links are shown to avoid clutter amorphic to the4×4×4 3D torus (each has 64×6/2 links) Fa2010 Parallel Processing, Low-Diameter Architectures Slide 8
Fall 2010 Parallel Processing, Low-Diameter Architectures Slide 8 Only sample wraparound links are shown to avoid clutter The 64-Node Hypercube Isomorphic to the 4 4 4 3D torus (each has 64 6/2 links)
Neighbors of a Node in a Hypercube q-14q-2…X2X1X iD of node x q-1q2…x2X1x0 dimension-0 neighbor; No(X) q-1q2…X2X1 Xo dimension-1 neighbor N,X) The neighbors of node x ⅩX1X q dimension-(g-1)neighbor; Na-1(x) 0100 Dim o Nodes whose labels differ in k bits Dim 3 (at Hamming distance k) connected Dim 2 1100 by shortest path of length k 0000 Dim 110 Both node-and edge-symmetric l111 Strengths: symmetry, log diameter, 0110 and linear bisection width 0111 Weakness: poor scalability 1010101 Fa2010 Parallel Processing, Low-Diameter Architectures Slide 9
Fall 2010 Parallel Processing, Low-Diameter Architectures Slide 9 Neighbors of a Node in a Hypercube xq–1xq–2 . . . x2x1x0 ID of node x xq–1xq–2 . . . x2x1x0 dimension-0 neighbor; N0 (x) xq–1xq–2 . . . x2x1 x0 dimension-1 neighbor; N1 (x) . . . . . . xq–1 xq–2 . . . x2x1x0 dimension-(q–1) neighbor; Nq–1 (x) The q neighbors of node x Nodes whose labels differ in k bits (at Hamming distance k) connected by shortest path of length k Both node- and edge-symmetric Strengths: symmetry, log diameter, and linear bisection width Weakness: poor scalability Dim 0 Dim 1 Dim 2 Dim 3 0100 0101 0110 0000 1100 1101 1111 0111 0011 x 1011 0010 1010 x
13.2 Embeddings and Their Usefulness Dilation Congestion 5Fig.132 \b Load tactor= Embedding a even-node 0 2 binary tree f into 2D 3」|4 meshes of Various sizes Dilation 2 Dilation Congestion =2 Congestion =2 Load factor Load factor s b「 Expansion: tio of the number of e nodes(9/7,8/7 3 4 5 and 4/7 here) Dilation: Longest path onto which an edge is mapped (routing slowdown) Congestion Max number of edges mapped onto one edge(contention slowdown) Load factor: Max number of nodes mapped onto one node(processing slowdown) Fa2010 Parallel Processing, Low-Diameter Architectures Slide 10
Fall 2010 Parallel Processing, Low-Diameter Architectures Slide 10 Dilation: Longest path onto which an edge is mapped (routing slowdown) Congestion: Max number of edges mapped onto one edge (contention slowdown) Load factor: Max number of nodes mapped onto one node (processing slowdown) 13.2 Embeddings and Their Usefulness Fig. 13.2 Embedding a seven-node binary tree into 2D meshes of various sizes. 0 2 3 4 1 5 6 0 2 3 4 1 6 5 0,1 2 4 3 6 5 6 1 0 3,4 2,5 a b c d e f a b c d e f a b c d e f b c, d f Dilation = 1 Congestion = 1 Load factor = 1 Dilation = 2 Congestion = 2 Load factor = 1 Dilation = 1 Congestion = 2 Load factor = 2 Expansion: ratio of the number of nodes (9/7, 8/7, and 4/7 here)
13. 3 Embedding of Arrays and Trees g-1)-bit Gray code ○ 0000...000 0000...001 0000...011 O No?(Nfx) 0100...000 (q? l-cube 0 ( q? l-cube 1:100...000 Fig. 13.3 Hamiltonian cycle in the g-cube Alternate inductive proof: Hamiltonicity of the q-cube 000...011 is equivalent to the existence of a g-bit Gray code 1000...010 000...000 Basis: q-bit Gray code beginning with the all-Os codeword (9-1)-bit and ending with 10q-1 exists for g= 2: 00, 01, 11, 10 Gray code In reverse Fa2010 Parallel Processing, Low-Diameter Architectures Slide 11
Fall 2010 Parallel Processing, Low-Diameter Architectures Slide 11 13.3 Embedding of Arrays and Trees Alternate inductive proof: Hamiltonicity of the q-cube is equivalent to the existence of a q-bit Gray code Fig. 13.3 Hamiltonian cycle in the q-cube. (q ?1)-cube 0 x (q ?1)-cube 1 N (x) k N (x) q? N (N (x)) q? k (q – 1)-bit Gray code 000 . . . 000 000 . . . 001 000 . . . 011 . . . 100 . . . 000 0 0 0 0 1 1 1 1 100 . . . 000 . . . 000 . . . 011 000 . . . 010 000 . . . 000 (q – 1)-bit Gray code in reverse Basis: q-bit Gray code beginning with the all-0s codeword and ending with 10q–1 exists for q = 2: 00, 01, 11, 10