Introduction Parallel machines have become quite common and affordable prices of microprocessors,memory and disks have dropped sharply Data storage needs are growing increasingly large user data at web-scale 100's of millions of users,petabytes of data transaction data are collected and stored for analysis. multimedia objects like images/videos Parallel storage system requirements storing large volumes of data processing time-consuming decision-support queries providing high throughput for transaction processing Very high demands on scalability and availability Database System Concepts-7th Edition 21.2 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 21.2 ©Silberschatz, Korth and Sudarshan th Edition Introduction ▪ Parallel machines have become quite common and affordable • prices of microprocessors, memory and disks have dropped sharply ▪ Data storage needs are growing increasingly large • user data at web-scale ▪ 100’s of millions of users, petabytes of data • transaction data are collected and stored for analysis. • multimedia objects like images/videos ▪ Parallel storage system requirements • storing large volumes of data • processing time-consuming decision-support queries • providing high throughput for transaction processing • Very high demands on scalability and availability
Parallel/Distributed Data Storage History ■1980/1990s Distributed database systems with tens of nodes ■2000s: Distributed file systems with 1000s of nodes Millions of Large objects(100's of megabytes) Web logs,images,videos,... Typically create/append only Distributed data storage systems with 1000s of nodes Billions to trillions of smaller(kilobyte to megabyte)objects Social media posts,email,online purchases,.. Inserts,updates,deletes ·Key-value stores 2010s:Distributed database systems with 1000s of nodes Database System Concepts-7th Edition 21.3 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 21.3 ©Silberschatz, Korth and Sudarshan th Edition Parallel/Distributed Data Storage History ▪ 1980/1990s • Distributed database systems with tens of nodes ▪ 2000s: • Distributed file systems with 1000s of nodes ▪ Millions of Large objects (100’s of megabytes) ▪ Web logs, images, videos, … ▪ Typically create/append only • Distributed data storage systems with 1000s of nodes ▪ Billions to trillions of smaller (kilobyte to megabyte) objects ▪ Social media posts, email, online purchases, … ▪ Inserts, updates, deletes • Key-value stores ▪ 2010s: Distributed database systems with 1000s of nodes
I/O Parallelism Reduce the time required to retrieve relations from disk by partitioning the relations on multiple disks,on multiple nodes (computers) Our description focuses on parallelism across nodes Same techniques can be used across disks on a node Horizontal partitioning-tuples of a relation are divided among many nodes such that some subset of tuple resides on each node. Contrast with vertical partitioning,e.g.r(A,B,C,D)with primary key A into r1(A,B)and r2(A,C,D) By default,the word partitioning refers to horizontal partitioning Database System Concepts-7th Edition 21.4 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 21.4 ©Silberschatz, Korth and Sudarshan th Edition I/O Parallelism ▪ Reduce the time required to retrieve relations from disk by partitioning the relations on multiple disks, on multiple nodes (computers) • Our description focuses on parallelism across nodes • Same techniques can be used across disks on a node ▪ Horizontal partitioning – tuples of a relation are divided among many nodes such that some subset of tuple resides on each node. • Contrast with vertical partitioning, e.g. r(A,B,C,D) with primary key A into r1(A,B) and r2(A,C,D) • By default, the word partitioning refers to horizontal partitioning
I/O Parallelism Partitioning techniques(number of nodes n): Round-robin: Send the ith tuple inserted in the relation to node i mod n. Hash partitioning: Choose one or more attributes as the partitioning attributes. Choose hash function h with range 0...n-1 Let i denote result of hash function h applied to the partitioning attribute value of a tuple.Send tuple to node i. Database System Concepts-7th Edition 21.5 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 21.5 ©Silberschatz, Korth and Sudarshan th Edition I/O Parallelism ▪ Partitioning techniques (number of nodes = n): Round-robin: Send the i th tuple inserted in the relation to node i mod n. Hash partitioning: • Choose one or more attributes as the partitioning attributes. • Choose hash function h with range 0…n - 1 • Let i denote result of hash function h applied to the partitioning attribute value of a tuple. Send tuple to node i
Range Partitioning Range partitioning Range associated vector Node with the node Node 1 [-o,15) 15 Node 2 [15,40) 40 75 Node 3 [40,75) Node 4 [75,+∞] Database System Concepts-7th Edition 21.6 @Silberschatz,Korth and Sudarshan
Database System Concepts - 7 21.6 ©Silberschatz, Korth and Sudarshan th Edition Range Partitioning
I/O Parallelism (Cont.) Partitioning techniques(cont.): 。Range partitioning: Choose an attribute as the partitioning attribute. A partitioning vector [vo,v1,...vn-2]is chosen. Let v be the partitioning attribute value of a tuple.Tuples such that v Vn-2 go to node n-1. E.g.,with a partitioning vector [5,11] a tuple with partitioning attribute value of 2 will go to node 0, a tuple with value 8 will go to node 1,while a tuple with value 20 will go to node2. Database System Concepts-7th Edition 21.7 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 21.7 ©Silberschatz, Korth and Sudarshan th Edition I/O Parallelism (Cont.) Partitioning techniques (cont.): ▪ Range partitioning: • Choose an attribute as the partitioning attribute. • A partitioning vector [vo , v1 , ..., vn-2 ] is chosen. • Let v be the partitioning attribute value of a tuple. Tuples such that vi vi+1 go to node I + 1. Tuples with v < v0 go to node 0 and tuples with v vn-2 go to node n-1. E.g., with a partitioning vector [5,11] ▪ a tuple with partitioning attribute value of 2 will go to node 0, ▪ a tuple with value 8 will go to node 1, while ▪ a tuple with value 20 will go to node2
Comparison of Partitioning Techniques Evaluate how well partitioning techniques support the following types of data access: 1.Scanning the entire relation. 2.Locating a tuple associatively-point queries. ■E.g,rA=25. 3. Locating all tuples such that the value of a given attribute lies within a specified range-range queries. ■E.g,10≤r.A<25. Do above evaluation for each of ·Round robin ·Hash partitioning ·Range partitioning Database System Concepts-7th Edition 21.8 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 21.8 ©Silberschatz, Korth and Sudarshan th Edition Comparison of Partitioning Techniques ▪ Evaluate how well partitioning techniques support the following types of data access: 1. Scanning the entire relation. 2. Locating a tuple associatively – point queries. ▪ E.g., r.A = 25. 3. Locating all tuples such that the value of a given attribute lies within a specified range – range queries. ▪ E.g., 10 r.A < 25. ▪ Do above evaluation for each of • Round robin • Hash partitioning • Range partitioning
Comparison of Partitioning Techniques(Cont.) Round robin: Best suited for sequential scan of entire relation on each query. All nodes have almost an equal number of tuples;retrieval work is thus well balanced between nodes. All queries must be processed at all nodes Hash partitioning: Good for sequential access Assuming hash function is good,and partitioning attributes form a key, tuples will be equally distributed between nodes Good for point queries on partitioning attribute Can lookup single node,leaving others available for answering other queries. Range queries inefficient,must be processed at all nodes Database System Concepts-7th Edition 21.9 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 21.9 ©Silberschatz, Korth and Sudarshan th Edition Comparison of Partitioning Techniques (Cont.) Round robin: ▪ Best suited for sequential scan of entire relation on each query. • All nodes have almost an equal number of tuples; retrieval work is thus well balanced between nodes. ▪ All queries must be processed at all nodes Hash partitioning: ▪ Good for sequential access • Assuming hash function is good, and partitioning attributes form a key, tuples will be equally distributed between nodes ▪ Good for point queries on partitioning attribute • Can lookup single node, leaving others available for answering other queries. ▪ Range queries inefficient, must be processed at all nodes
Comparison of Partitioning Techniques(Cont.) Range partitioning: Provides data clustering by partitioning attribute value. Good for sequential access Good for point queries on partitioning attribute:only one node needs to be accessed. For range queries on partitioning attribute,one to a few nodes may need to be accessed Remaining nodes are available for other queries. Good if result tuples are from one to a few blocks. But if many blocks are to be fetched,they are still fetched from one to a few nodes,and potential parallelism in disk access is wasted Example of execution skew. Database System Concepts-7th Edition 21.10 @Silberschatz,Korth and Sudarshan
Database System Concepts - 7 21.10 ©Silberschatz, Korth and Sudarshan th Edition Comparison of Partitioning Techniques (Cont.) Range partitioning: ▪ Provides data clustering by partitioning attribute value. • Good for sequential access • Good for point queries on partitioning attribute: only one node needs to be accessed. ▪ For range queries on partitioning attribute, one to a few nodes may need to be accessed • Remaining nodes are available for other queries. • Good if result tuples are from one to a few blocks. • But if many blocks are to be fetched, they are still fetched from one to a few nodes, and potential parallelism in disk access is wasted ▪ Example of execution skew
Handling Small Relations Partitioning not useful for small relations which fit into a single disk block or a small number of disk blocks Instead,assign the relation to a single node,or Replicate relation at all nodes For medium sized relations,choose how many nodes to partition across based on size of relation Large relations typically partitioned across all available nodes. Database System Concepts-7th Edition 21.11 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 21.11 ©Silberschatz, Korth and Sudarshan th Edition Handling Small Relations ▪ Partitioning not useful for small relations which fit into a single disk block or a small number of disk blocks • Instead, assign the relation to a single node, or • Replicate relation at all nodes ▪ For medium sized relations, choose how many nodes to partition across based on size of relation ▪ Large relations typically partitioned across all available nodes