当前位置:高等教育资讯网  >  中国高校课件下载中心  >  大学文库  >  浏览文档

《数据库系统概念 Database System Concepts》原书教学资源(第七版,PPT课件讲稿,英文版)Chapter 22 Parallel and Distributed Query Processing

资源类别:文库,文档格式:PPTX,文档页数:76,文件大小:2.24MB,团购合买
▪ Overview ▪ Parallel Sort ▪ Parallel Join ▪ Other Operations ▪ Parallel Evaluation of Query Plans ▪ Query Processing on Shared Memory ▪ Query Optimization ▪ Distributed Query Processing
点击下载完整版文档(PPTX)

Chapter 22:Parallel And Distributed Query Processing ■Overview ■Parallel Sort ■Parallel Join ■Other Operations Parallel Evaluation of Query Plans Query Processing on Shared Memory Query Optimization Distributed Query Processing Database System Concepts-7th Edition 22.2 @Silberschatz,Korth and Sudarshan

Database System Concepts - 7 22.2 ©Silberschatz, Korth and Sudarshan th Edition Chapter 22: Parallel And Distributed Query Processing ▪ Overview ▪ Parallel Sort ▪ Parallel Join ▪ Other Operations ▪ Parallel Evaluation of Query Plans ▪ Query Processing on Shared Memory ▪ Query Optimization ▪ Distributed Query Processing

Parallel Query Processing Different queries/transactions can be run in parallel with each other. Interquery parallelism Concurrency control takes care of conflicts in case of updates More on parallel transaction processing in Chapter 23 Focus in this chapter is on read-only queries Individual relational operations (e.g.,sort,join,aggregation)can be executed in parallel data can be partitioned and each processor can work independently on its own partition. Queries are expressed in high level language (SQL,translated to relational algebra) makes parallelization easier. Database System Concepts-7th Edition 22.3 ©Silberscha乜,Korth and Sudarshan

Database System Concepts - 7 22.3 ©Silberschatz, Korth and Sudarshan th Edition Parallel Query Processing ▪ Different queries/transactions can be run in parallel with each other. • Interquery parallelism • Concurrency control takes care of conflicts in case of updates • More on parallel transaction processing in Chapter 23 • Focus in this chapter is on read-only queries ▪ Individual relational operations (e.g., sort, join, aggregation) can be executed in parallel • data can be partitioned and each processor can work independently on its own partition. ▪ Queries are expressed in high level language (SQL, translated to relational algebra) • makes parallelization easier

Intraquery Parallelism Intraquery parallelism:execution of a single query in parallel on multiple processors/disks;important for speeding up long-running queries. Two complementary forms of intraquery parallelism: Intraoperation Parallelism-parallelize the execution of each individual operation in the query Supports high degree of parallelism Interoperation Parallelism-execute the different operations in a query expression in parallel. Limited degree of parallelism Database System Concepts-7th Edition 22.4 ©Silberscha乜,Korth and Sudarshan

Database System Concepts - 7 22.4 ©Silberschatz, Korth and Sudarshan th Edition Intraquery Parallelism ▪ Intraquery parallelism: execution of a single query in parallel on multiple processors/disks; important for speeding up long-running queries. ▪ Two complementary forms of intraquery parallelism: • Intraoperation Parallelism– parallelize the execution of each individual operation in the query ▪ Supports high degree of parallelism • Interoperation Parallelism– execute the different operations in a query expression in parallel. ▪ Limited degree of parallelism

Parallel Processing of Relational Operations Our discussion of parallel algorithms assumes: ·read-only queries shared-nothing architecture ·n nodes,N1,,Nn Each assumed to have disks and processors. Initial focus on parallelization to a shared-nothing node Parallel processing within a shared memory/shared disk node discussed later Shared-nothing architectures can be efficiently simulated on shared- memory and shared-disk systems. Algorithms for shared-nothing systems can thus be run on shared-memory and shared-disk systems. However,some optimizations may be possible. Database System Concepts-7th Edition 22.5 ©Silberscha乜,Korth and Sudarshan

Database System Concepts - 7 22.5 ©Silberschatz, Korth and Sudarshan th Edition Parallel Processing of Relational Operations ▪ Our discussion of parallel algorithms assumes: • read-only queries • shared-nothing architecture • n nodes, N1 , ..., Nn ▪ Each assumed to have disks and processors. • Initial focus on parallelization to a shared-nothing node ▪ Parallel processing within a shared memory/shared disk node discussed later • Shared-nothing architectures can be efficiently simulated on shared￾memory and shared-disk systems. ▪ Algorithms for shared-nothing systems can thus be run on shared-memory and shared-disk systems. ▪ However, some optimizations may be possible

Intraoperation Parallelism Database System Concepts -7th Edition 22.6 @Silberschatz,Korth and Sudarshan

Database System Concepts - 7 22.6 ©Silberschatz, Korth and Sudarshan th Edition Intraoperation Parallelism

Range Partitioning Redistribute using partitioning vector: 100,175,300,500,800 0-100) [100-175)[175-300)[300-500)[500-800)[800-1000) Database System Concepts-7th Edition 22.7 @Silberschatz,Korth and Sudarshan

Database System Concepts - 7 22.7 ©Silberschatz, Korth and Sudarshan th Edition Range Partitioning

Range-Partitioning Parallel Sort 1)Redistribute using partitioning vector: 100.175,300,500.800 0-100) [100-175) [175-300)[300-500)[500-800)[800-1000) 2)(External)sort locally at each node 3)Merge if output required at one node Database System Concepts-7th Edition 22.8 @Silberschatz,Korth and Sudarshan

Database System Concepts - 7 22.8 ©Silberschatz, Korth and Sudarshan th Edition Range-Partitioning Parallel Sort

Parallel Sort Local Sort Local Sort Merge r Local Sort Local Sort Merge Local Sort Local Sort Merge : Local Sort Local Sort Merge Im 1.Range Partition 2.Local Sort 1.Local Sort 2.Range Partition and Merge (a)Range Partitioning Sort (b)Parallel External Sort-Merge Database System Concepts-7th Edition 22.9 @Silberschatz,Korth and Sudarshan

Database System Concepts - 7 22.9 ©Silberschatz, Korth and Sudarshan th Edition Parallel Sort

Parallel Sort Range-Partitioning Sort ■ Choose nodes N,...,Nmm,where m n-1 to do sorting. Create range-partition vector with m-1 entries,on the sorting attributes Redistribute the relation using range partitioning Each node N sorts its partition of the relation locally. Example of data parallelism:each node executes same operation in parallel with other nodes,without any interaction with the others. Final merge operation is trivial:range-partitioning ensures that, if i<j,all key values in node Nare all less than all key values in N Database System Concepts-7th Edition 22.10 ©Silberscha乜,Korth and Sudarshan

Database System Concepts - 7 22.10 ©Silberschatz, Korth and Sudarshan th Edition Parallel Sort Range-Partitioning Sort ▪ Choose nodes N1 , ..., Nm, where m  n -1 to do sorting. ▪ Create range-partition vector with m-1 entries, on the sorting attributes ▪ Redistribute the relation using range partitioning ▪ Each node Ni sorts its partition of the relation locally. • Example of data parallelism: each node executes same operation in parallel with other nodes, without any interaction with the others. ▪ Final merge operation is trivial: range-partitioning ensures that, if i < j, all key values in node Ni are all less than all key values in Nj

Parallel Sort (Cont.) Parallel External Sort-Merge Assume the relation has already been partitioned among nodes N1,...N (in whatever manner). Each node N locally sorts the data (using local disk as required) The sorted runs on each node are then merged in parallel: The sorted partitions at each node Ni are range-partitioned across the processors N1,...,N. Each node Ni performs a merge on the streams as they are received,to get a single sorted run. The sorted runs on nodes N1,...,N are concatenated to get the final result. Algorithm as described vulnerable to execution skew all nodes send to node 1,then all nodes send data to node 2, Can be modified so each node sends data to all other nodes in parallel (block at a time) Database System Concepts-7th Edition 22.11 ©Silberscha乜,Korth and Sudarshan

Database System Concepts - 7 22.11 ©Silberschatz, Korth and Sudarshan th Edition Parallel Sort (Cont.) Parallel External Sort-Merge ▪ Assume the relation has already been partitioned among nodes N1 , ..., Nn (in whatever manner). ▪ Each node Ni locally sorts the data (using local disk as required) ▪ The sorted runs on each node are then merged in parallel: • The sorted partitions at each node Ni are range-partitioned across the processors N1 , ..., Nn . • Each node Ni performs a merge on the streams as they are received, to get a single sorted run. • The sorted runs on nodes N1 , ..., Nn are concatenated to get the final result. ▪ Algorithm as described vulnerable to execution skew • all nodes send to node 1, then all nodes send data to node 2, … • Can be modified so each node sends data to all other nodes in parallel (block at a time)

点击下载完整版文档(PPTX)VIP每日下载上限内不扣除下载券和下载次数;
按次数下载不扣除下载券;
24小时内重复下载只扣除一次;
顺序:VIP每日次数-->可用次数-->下载券;
共76页,可试读20页,点击继续阅读 ↓↓
相关文档

关于我们|帮助中心|下载说明|相关软件|意见反馈|联系我们

Copyright © 2008-现在 cucdc.com 高等教育资讯网 版权所有