
Database System Concepts: original teaching resources (7th Edition, PPT lecture slides, English version), Chapter 15: Query Processing


Chapter 15: Query Processing
▪ Overview
▪ Measures of Query Cost
▪ Selection Operation
▪ Sorting
▪ Join Operation
▪ Other Operations
▪ Evaluation of Expressions

Database System Concepts, 7th Edition ©Silberschatz, Korth and Sudarshan

Basic Steps in Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation
[Figure: query → parser and translator → relational-algebra expression → optimizer → execution plan → evaluation engine → query output. The optimizer uses statistics about data; the evaluation engine reads the data.]

Basic Steps in Query Processing (Cont.)
▪ Parsing and translation
  • Translate the query into its internal form. This is then translated into relational algebra.
  • Parser checks syntax, verifies relations
▪ Evaluation
  • The query-execution engine takes a query-evaluation plan, executes that plan, and returns the answers to the query.

Basic Steps in Query Processing: Optimization
▪ A relational algebra expression may have many equivalent expressions
  • E.g., σ_{salary<75000}(Π_{salary}(instructor)) is equivalent to Π_{salary}(σ_{salary<75000}(instructor))
▪ Each relational algebra operation can be evaluated using one of several different algorithms
  • Correspondingly, a relational-algebra expression can be evaluated in many ways.
▪ An annotated expression specifying a detailed evaluation strategy is called an evaluation plan. E.g.:
  • Use an index on salary to find instructors with salary < 75000,
  • Or perform a complete relation scan and discard instructors with salary ≥ 75000
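The equivalence of the two expressions can be checked on a small example. The following is a minimal sketch, not part of the slides: the helper names `select` and `project` and the sample rows are illustrative assumptions, and relations are modelled as Python lists of dicts with duplicate elimination in the projection.

```python
# Toy instructor relation; the rows are made-up illustration data.
instructor = [
    {"ID": 1, "name": "A", "salary": 65000},
    {"ID": 2, "name": "B", "salary": 90000},
    {"ID": 3, "name": "C", "salary": 40000},
    {"ID": 4, "name": "D", "salary": 95000},
]


def select(rows, predicate):
    """Selection (sigma): keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, attrs):
    """Projection (Pi): keep only attrs, dropping duplicate result rows."""
    seen = set()
    out = []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append({a: row[a] for a in attrs})
    return out


def under_75k(row):
    return row["salary"] < 75000


# Project first, then select.
plan_a = select(project(instructor, ["salary"]), under_75k)
# Select first, then project.
plan_b = project(select(instructor, under_75k), ["salary"])

print(plan_a == plan_b)  # both orders give the same salaries
```

The two orders return the same result here because the selection predicate only refers to an attribute that the projection keeps. They can still differ in cost, which is what the optimizer exploits.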

Basic Steps: Optimization (Cont.)
▪ Query Optimization: Amongst all equivalent evaluation plans, choose the one with the lowest cost.
  • Cost is estimated using statistical information from the database catalog
    ▪ e.g., number of tuples in each relation, size of tuples, etc.
▪ In this chapter we study
  • How to measure query costs
  • Algorithms for evaluating relational algebra operations
  • How to combine algorithms for individual operations in order to evaluate a complete expression
▪ In Chapter 16
  • We study how to optimize queries, that is, how to find an evaluation plan with the lowest estimated cost

Measures of Query Cost
▪ Many factors contribute to time cost
  • disk access, CPU, and network communication
▪ Cost can be measured based on
  • response time, i.e., total elapsed time for answering the query, or
  • total resource consumption
▪ We use total resource consumption as the cost metric
  • Response time is harder to estimate, and minimizing resource consumption is a good idea in a shared database
▪ We ignore CPU costs for simplicity
  • Real systems do take CPU cost into account
  • Network costs must be considered for parallel systems
▪ We describe how to estimate the cost of each operation
  • We do not include the cost of writing output to disk

Measures of Query Cost
▪ Disk cost can be estimated as:
  • Number of seeks * average-seek-cost
  • Number of blocks read * average-block-read-cost
  • Number of blocks written * average-block-write-cost
▪ For simplicity we just use the number of block transfers from disk and the number of seeks as the cost measures
  • t_T: time to transfer one block
    ▪ Assuming for simplicity that write cost is the same as read cost
  • t_S: time for one seek
  • Cost for b block transfers plus S seeks: b * t_T + S * t_S
▪ t_S and t_T depend on where data is stored; with 4 KB blocks:
  • High-end magnetic disk: t_S = 4 msec and t_T = 0.1 msec
  • SSD: t_S = 20-90 microsec and t_T = 2-10 microsec
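As a worked illustration of the cost formula, the sketch below computes the cost of b block transfers plus S seeks using the device timings quoted above. The function name `disk_cost`, the constant names, and the example workload of 100 transfers and 1 seek are assumptions made for illustration; all times are in microseconds so the arithmetic stays in integers.

```python
def disk_cost(block_transfers, seeks, t_T, t_S):
    """Cost of b block transfers plus S seeks: b * t_T + S * t_S."""
    return block_transfers * t_T + seeks * t_S


# Timings from the slide for 4 KB blocks, converted to microseconds.
MAGNETIC = {"t_T": 100, "t_S": 4000}  # 0.1 msec transfer, 4 msec seek
SSD_BEST = {"t_T": 2, "t_S": 20}      # low end of the quoted SSD ranges
SSD_WORST = {"t_T": 10, "t_S": 90}    # high end of the quoted SSD ranges

# Hypothetical workload: read 100 consecutive blocks after a single seek.
b, S = 100, 1
print(disk_cost(b, S, **MAGNETIC))   # 14000 microseconds = 14 msec
print(disk_cost(b, S, **SSD_BEST))   # 220 microseconds
print(disk_cost(b, S, **SSD_WORST))  # 1090 microseconds
```

On the magnetic disk the single seek accounts for 4 of the 14 msec, which is why access patterns that add seeks become expensive quickly.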

Measures of Query Cost (Cont.)
▪ Required data may be buffer resident already, avoiding disk I/O
  • But this is hard to take into account for cost estimation
▪ Several algorithms can reduce disk I/O by using extra buffer space
  • The amount of real memory available to the buffer depends on other concurrent queries and OS processes, and is known only during execution
▪ Worst-case estimates assume that no data is initially in the buffer and only the minimum amount of memory needed for the operation is available
  • But more optimistic estimates are used in practice

Selection Operation
▪ File scan
▪ Algorithm A1 (linear search). Scan each file block and test all records to see whether they satisfy the selection condition.
  • Cost estimate = b_r block transfers + 1 seek
    ▪ b_r denotes the number of blocks containing records from relation r
  • If the selection is on a key attribute, the scan can stop on finding the record
    ▪ cost = (b_r / 2) block transfers + 1 seek
  • Linear search can be applied regardless of
    ▪ selection condition or
    ▪ ordering of records in the file, or
    ▪ availability of indices
▪ Note: binary search generally does not make sense since data is not stored consecutively
  • except when there is an index available,
  • and binary search requires more seeks than index search
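Algorithm A1 and its cost estimate can be sketched as follows. This is an illustrative model only: the file is represented as a list of blocks, each a list of records, and the names `linear_search` and `a1_cost`, the block layout, and the integer record ids are assumptions. The timing values are the magnetic-disk figures from the cost-measure slide, in microseconds.

```python
def linear_search(blocks, predicate, stop_at_first=False):
    """Scan blocks in order; return (matching records, blocks read).

    stop_at_first is only valid when the selection is an equality
    on a key attribute, so at most one record can match.
    """
    matches = []
    blocks_read = 0
    for block in blocks:
        blocks_read += 1
        for record in block:
            if predicate(record):
                matches.append(record)
                if stop_at_first:
                    return matches, blocks_read
    return matches, blocks_read


def a1_cost(b_r, t_T, t_S, key_equality=False):
    """Estimated cost: b_r transfers + 1 seek, or b_r / 2 on a key."""
    transfers = b_r / 2 if key_equality else b_r
    return transfers * t_T + 1 * t_S


# Hypothetical file: 10 blocks of 4 records, record ids 0..39.
blocks = [[{"id": 4 * i + j} for j in range(4)] for i in range(10)]

# Non-key condition: every block must be read.
hits, read_all = linear_search(blocks, lambda r: r["id"] % 10 == 0)
# Key equality: the scan stops at the block holding the record.
one, read_some = linear_search(blocks, lambda r: r["id"] == 17,
                               stop_at_first=True)

print(len(hits), read_all)   # 4 matches after reading all 10 blocks
print(one, read_some)        # id 17 found after reading 5 blocks
print(a1_cost(10, 100, 4000))                     # full-scan estimate
print(a1_cost(10, 100, 4000, key_equality=True))  # average key lookup
```

The (b_r / 2) figure is an average over where the record might sit in the file; an individual lookup reads anywhere from 1 to b_r blocks.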

Selections Using Indices
▪ Index scan: search algorithms that use an index
  • The selection condition must be on the search key of the index.
▪ A2 (clustering index, equality on key). Retrieve a single record that satisfies the corresponding equality condition
  • Cost = (h_i + 1) * (t_T + t_S)
▪ A3 (clustering index, equality on nonkey). Retrieve multiple records.
  • Records will be on consecutive blocks
    ▪ Let b = number of blocks containing matching records
  • Cost = h_i * (t_T + t_S) + t_S + t_T * b
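The two index-scan cost formulas can be evaluated directly. The sketch below is illustrative: the function names `a2_cost` and `a3_cost`, the index height of 3, and the block counts are assumed values, with h_i taken to be the height of the index. Times are the magnetic-disk figures from the cost-measure slide, in microseconds.

```python
def a2_cost(h_i, t_T, t_S):
    """A2: one seek plus one transfer per index level, plus one more
    seek and transfer to fetch the record itself."""
    return (h_i + 1) * (t_T + t_S)


def a3_cost(h_i, b, t_T, t_S):
    """A3: traverse the index, then one seek followed by b consecutive
    block transfers for the matching records."""
    return h_i * (t_T + t_S) + t_S + t_T * b


T_T, T_S = 100, 4000  # 0.1 msec transfer, 4 msec seek, in microseconds
H_I = 3               # assumed index height

print(a2_cost(H_I, T_T, T_S))       # 16400 microseconds
print(a3_cost(H_I, 5, T_T, T_S))    # 16800 microseconds for 5 blocks
print(a3_cost(H_I, 1, T_T, T_S))    # 16400, same as A2 when b = 1
```

With b = 1 the A3 formula reduces to the A2 formula. Each extra matching block adds only one transfer time, because the matching records sit on consecutive blocks and need no further seeks.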
