Hardware Tuning:Choice of RAID Level ■To use RAID1 or RAID5? Depends on ratio of reads and writes RAID 5 requires 2 block reads and 2 block writes to write out one data block If an application requires r reads and w writes per second RAID 1 requires r+2w l/O operations per second RAID 5 requires:r+4w 1/O operations per second For reasonably large r and w,this requires lots of disks to handle workload RAID 5 may require more disks than RAID 1 to handle load! Apparent saving of number of disks by RAID 5(by using parity,as opposed to the mirroring done by RAID 1)may be illusory! Thumb rule:RAID 5 is fine when writes are rare and data is very large,but RAID 1 is preferable otherwise If you need more disks to handle 1/O load,just mirror them since disk capacities these days are enormous! Database System Concepts-7th Edition 25.3 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 25.3 ©Silberschatz, Korth and Sudarshan th Edition Hardware Tuning: Choice of RAID Level ▪ To use RAID 1 or RAID 5? • Depends on ratio of reads and writes ▪ RAID 5 requires 2 block reads and 2 block writes to write out one data block ▪ If an application requires r reads and w writes per second • RAID 1 requires r + 2w I/O operations per second • RAID 5 requires: r + 4w I/O operations per second ▪ For reasonably large r and w, this requires lots of disks to handle workload • RAID 5 may require more disks than RAID 1 to handle load! • Apparent saving of number of disks by RAID 5 (by using parity, as opposed to the mirroring done by RAID 1) may be illusory! ▪ Thumb rule: RAID 5 is fine when writes are rare and data is very large, but RAID 1 is preferable otherwise • If you need more disks to handle I/O load, just mirror them since disk capacities these days are enormous!
Tuning the Database Design (Cont.) Materialized Views Materialized views can help speed up certain queries Particularly aggregate queries ■Overheads ·Space Time for view maintenance Immediate view maintenance:done as part of update txn time overhead paid by update transaction Deferred view maintenance:done only when required update transaction is not affected,but system time is spent on view maintenance until updated,the view may be out-of-date Preferable to denormalized schema since view maintenance is systems responsibility,not programmers Avoids inconsistencies caused by errors in update programs Database System Concepts-7th Edition 25.6 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 25.6 ©Silberschatz, Korth and Sudarshan th Edition Tuning the Database Design (Cont.) Materialized Views ▪ Materialized views can help speed up certain queries • Particularly aggregate queries ▪ Overheads • Space • Time for view maintenance ▪ Immediate view maintenance: done as part of update txn • time overhead paid by update transaction ▪ Deferred view maintenance: done only when required • update transaction is not affected, but system time is spent on view maintenance ▪ until updated, the view may be out-of-date ▪ Preferable to denormalized schema since view maintenance is systems responsibility, not programmers • Avoids inconsistencies caused by errors in update programs
Tuning the Database Design (Cont.) How to choose set of materialized views Helping one transaction type by introducing a materialized view may hurt others Choice of materialized views depends on costs Users often have no idea of actual cost of operations Overall,manual selection of materialized views is tedious ■ Some database systems provide tools to help DBA choose views to materialize ·“Materialized view selection wizards” Database System Concepts-7th Edition 25.7 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 25.7 ©Silberschatz, Korth and Sudarshan th Edition Tuning the Database Design (Cont.) ▪ How to choose set of materialized views • Helping one transaction type by introducing a materialized view may hurt others • Choice of materialized views depends on costs ▪ Users often have no idea of actual cost of operations • Overall, manual selection of materialized views is tedious ▪ Some database systems provide tools to help DBA choose views to materialize • “Materialized view selection wizards
Tuning of Transactions (Cont.) Reducing lock contention Long transactions(typically read-only)that examine large parts of a relation result in lock contention with update transactions E.g.,large query to compute bank statistics and regular bank transactions To reduce contention Use multi-version concurrency control ·E.g,Oracle“snapshots”which support multi-version2PL Use degree-two consistency(cursor-stability)for long transactions Drawback:result may be approximate Database System Concepts-7th Edition 25.9 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 25.9 ©Silberschatz, Korth and Sudarshan th Edition Tuning of Transactions (Cont.) ▪ Reducing lock contention ▪ Long transactions (typically read-only) that examine large parts of a relation result in lock contention with update transactions • E.g., large query to compute bank statistics and regular bank transactions ▪ To reduce contention • Use multi-version concurrency control ▪ E.g., Oracle “snapshots” which support multi-version 2PL • Use degree-two consistency (cursor-stability) for long transactions ▪ Drawback: result may be approximate
Tuning of Transactions (Cont.) Long update transactions cause several problems ·Exhaust lock space Exhaust log space and also greatly increase recovery time after a crash,and may even exhaust log space during recovery if recovery algorithm is badly designed! Use mini-batch transactions to limit number of updates that a single transaction can carry out.E.g.,if a single large transaction updates every record of a very large relation,log may grow too big. Split large transaction into batch of"mini-transactions,"each performing part of the updates Hold locks across transactions in a mini-batch to ensure serializability If lock table size is a problem can release locks,but at the cost of serializability In case of failure during a mini-batch,must complete its remaining portion on recovery,to ensure atomicity. Database System Concepts-7th Edition 25.10 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 25.10 ©Silberschatz, Korth and Sudarshan th Edition Tuning of Transactions (Cont.) ▪ Long update transactions cause several problems • Exhaust lock space • Exhaust log space ▪ and also greatly increase recovery time after a crash, and may even exhaust log space during recovery if recovery algorithm is badly designed! ▪ Use mini-batch transactions to limit number of updates that a single transaction can carry out. E.g., if a single large transaction updates every record of a very large relation, log may grow too big. • Split large transaction into batch of “mini-transactions,'' each performing part of the updates • Hold locks across transactions in a mini-batch to ensure serializability ▪ If lock table size is a problem can release locks, but at the cost of serializability • In case of failure during a mini-batch, must complete its remaining portion on recovery, to ensure atomicity
Performance Simulation Performance simulation using queuing model useful to predict bottlenecks as well as the effects of tuning changes,even without access to real system Queuing model as we saw earlier Models activities that go on in parallel Simulation model is quite detailed,but usually omits some low level details Model service time,but disregard details of service E.g.,approximate disk read time by using an average disk read time Experiments can be run on model,and provide an estimate of measures such as average throughput/response time Parameters can be tuned in model and then replicated in real system E.g.,number of disks,memory,algorithms,etc. Database System Concepts-7th Edition 25.11 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 25.11 ©Silberschatz, Korth and Sudarshan th Edition Performance Simulation ▪ Performance simulation using queuing model useful to predict bottlenecks as well as the effects of tuning changes, even without access to real system ▪ Queuing model as we saw earlier • Models activities that go on in parallel ▪ Simulation model is quite detailed, but usually omits some low level details • Model service time, but disregard details of service • E.g., approximate disk read time by using an average disk read time ▪ Experiments can be run on model, and provide an estimate of measures such as average throughput/response time ▪ Parameters can be tuned in model and then replicated in real system • E.g., number of disks, memory, algorithms, etc
Database Application Classes Online transaction processing(OLTP) requires high concurrency and clever techniques to speed up commit processing,to support a high rate of update transactions. Decision support applications including online analytical processing,or OLAP applications require good query evaluation algorithms and query optimization. Architecture of some database systems tuned to one of the two classes E.g.,Teradata is tuned to decision support Others try to balance the two requirements E.g.,Oracle,with snapshot support for long read-only transaction Database System Concepts-7th Edition 25.14 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 25.14 ©Silberschatz, Korth and Sudarshan th Edition Database Application Classes ▪ Online transaction processing (OLTP) • requires high concurrency and clever techniques to speed up commit processing, to support a high rate of update transactions. ▪ Decision support applications • including online analytical processing, or OLAP applications • require good query evaluation algorithms and query optimization. ▪ Architecture of some database systems tuned to one of the two classes • E.g., Teradata is tuned to decision support ▪ Others try to balance the two requirements • E.g., Oracle, with snapshot support for long read-only transaction
Benchmarks Suites (Cont.) TPC benchmarks (cont.) TPC-D:complex decision support application Superceded by TPC-H and TPC-R TPC-H:(H for ad hoc)based on TPC-D with some extra queries Models ad hoc queries which are not known beforehand Total of 22 queries with emphasis on aggregation prohibits materialized views permits indices only on primary and foreign keys TPC-R:(R for reporting)same as TPC-H,but without any restrictions on materialized views and indices TPC-W:(W for Web)End-to-end Web service benchmark modeling a Web bookstore,with combination of static and dynamically generated pages Database System Concepts-7th Edition 25.16 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 25.16 ©Silberschatz, Korth and Sudarshan th Edition Benchmarks Suites (Cont.) ▪ TPC benchmarks (cont.) • TPC-D: complex decision support application ▪ Superceded by TPC-H and TPC-R • TPC-H:(H for ad hoc) based on TPC-D with some extra queries ▪ Models ad hoc queries which are not known beforehand • Total of 22 queries with emphasis on aggregation ▪ prohibits materialized views ▪ permits indices only on primary and foreign keys • TPC-R:(R for reporting) same as TPC-H, but without any restrictions on materialized views and indices • TPC-W: (W for Web) End-to-end Web service benchmark modeling a Web bookstore, with combination of static and dynamically generated pages
TPC Performance Measures Two types of tests for TPC-H and TPC-R Power test:runs queries and updates sequentially,then takes mean to find queries per hour Throughput test:runs queries and updates concurrently multiple streams running in parallel each generates queries,with one parallel update stream Composite query per hour metric:square root of product of power and throughput metrics Composite price/performance metric Database System Concepts-7th Edition 25.18 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 25.18 ©Silberschatz, Korth and Sudarshan th Edition TPC Performance Measures ▪ Two types of tests for TPC-H and TPC-R • Power test: runs queries and updates sequentially, then takes mean to find queries per hour • Throughput test: runs queries and updates concurrently ▪ multiple streams running in parallel each generates queries, with one parallel update stream • Composite query per hour metric: square root of product of power and throughput metrics • Composite price/performance metric
Other Benchmarks OODB transactions require a different set of benchmarks. OO7 benchmark has several different operations,and provides a separate benchmark number for each kind of operation Reason:hard to define what is a typical OODB application Benchmarks for XML being discussed Database System Concepts-7th Edition 25.19 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 25.19 ©Silberschatz, Korth and Sudarshan th Edition Other Benchmarks ▪ OODB transactions require a different set of benchmarks. • OO7 benchmark has several different operations, and provides a separate benchmark number for each kind of operation • Reason: hard to define what is a typical OODB application ▪ Benchmarks for XML being discussed