Chapter 24: Advanced Application Development
    Performance Tuning
    Performance Benchmarks
    Standardization
    E-Commerce
    Legacy Systems
Database System Concepts - 6th Edition 24.2 ©Silberschatz, Korth and Sudarshan

Performance Tuning
    Adjusting various parameters and design choices to improve system performance for a specific application.
    Tuning is best done by
        1. identifying bottlenecks, and
        2. eliminating them.
    Can tune a database system at 3 levels:
        Hardware: e.g., add disks to speed up I/O, add memory to increase buffer hits, move to a faster processor.
        Database system parameters: e.g., set buffer size to avoid paging of buffer, set checkpointing intervals to limit log size. System may have automatic tuning.
        Higher-level database design, such as the schema, indices and transactions (more later).

Bottlenecks
    Performance of most systems (at least before they are tuned) is usually limited by the performance of one or a few components: these are called bottlenecks.
        E.g., 80% of the code may take up 20% of the time, and 20% of the code takes up 80% of the time.
        Worth spending most time on the 20% of code that takes 80% of the time.
    Bottlenecks may be in hardware (e.g., disks are very busy, CPU is idle), or in software.
    Removing one bottleneck often exposes another.
    De-bottlenecking consists of repeatedly finding bottlenecks and removing them.
        This is a heuristic.

Identifying Bottlenecks
    Transactions request a sequence of services.
        E.g., CPU, disk I/O, locks.
    With concurrent transactions, transactions may have to wait for a requested service while other transactions are being served.
    Can model database as a queueing system with a queue for each service.
        Transactions repeatedly do the following: request a service, wait in queue for the service, and get serviced.
    Bottlenecks in a database system typically show up as very high utilizations (and correspondingly, very long queues) of a particular service.
        E.g., disk vs. CPU utilization.
    100% utilization leads to very long waiting time:
        Rule of thumb: design system for about 70% utilization at peak load.
        Utilization over 90% should be avoided.
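
The slide does not say which queueing model lies behind the 70% and 90% figures. As a rough illustration only, the sketch below assumes the simplest textbook case, a single-server M/M/1 queue, where mean response time is service time divided by (1 - utilization). The function name and the 10 ms service time are hypothetical values chosen for the example.

```python
def mm1_response_time(service_time: float, utilization: float) -> float:
    """Mean time in system (queueing + service) under an ASSUMED M/M/1 model.

    service_time: mean time to serve one request, in any time unit.
    utilization:  fraction of time the service is busy, 0 <= utilization < 1.
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1); at 1.0 the queue grows without bound")
    return service_time / (1.0 - utilization)


# Hypothetical 10 ms mean service time (e.g., one random disk access).
SERVICE_TIME_MS = 10.0
for rho in (0.5, 0.7, 0.9, 0.99):
    t = mm1_response_time(SERVICE_TIME_MS, rho)
    print(f"utilization {rho:.0%}: mean response time {t:.1f} ms")
```

Under this assumed model, response time grows slowly up to about 70% utilization and then climbs steeply, which is consistent with the rule of thumb above.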

Queues In A Database System
    [Figure: queues in a database system. Components shown: transaction source, transaction manager, transaction monitor, concurrency-control manager, CPU manager, buffer manager, disk manager. Edge labels shown: lock request, lock grant, page request, page reply.]

Tunable Parameters
    Tuning of hardware
    Tuning of schema
    Tuning of indices
    Tuning of materialized views
    Tuning of transactions

Tuning of Hardware
    Even well-tuned transactions typically require a few I/O operations.
        Typical disk supports about 100 random I/O operations per second.
        Suppose each transaction requires just 2 random I/O operations. Then to support n transactions per second, we need to stripe data across n/50 disks (ignoring skew).
    Number of I/O operations per transaction can be reduced by keeping more data in memory.
        If all data is in memory, I/O is needed only for writes.
        Keeping frequently used data in memory reduces disk accesses, reducing the number of disks required, but has a memory cost.
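
The n/50 figure follows from dividing the total I/O demand by what one disk can deliver. A minimal sketch of that arithmetic, using the slide's numbers as defaults (the function and parameter names are made up for this example, and skew is ignored as on the slide):

```python
import math


def disks_needed(tps: int, ios_per_txn: int = 2, iops_per_disk: int = 100) -> int:
    """Number of disks to stripe across so that random I/O keeps up with the load.

    tps:           transactions per second to support (n on the slide).
    ios_per_txn:   random I/O operations per transaction (2 on the slide).
    iops_per_disk: random I/O operations per second per disk (about 100 on the slide).
    """
    total_iops = tps * ios_per_txn
    # Round up: a fraction of a disk still means buying a whole disk.
    return math.ceil(total_iops / iops_per_disk)


print(disks_needed(1000))  # 1000 tps * 2 I/Os = 2000 IOPS, i.e. 1000/50 disks
```

Lowering `ios_per_txn` (for example, by caching more data in memory) reduces the disk count proportionally, which is the trade-off the last bullet describes.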

Hardware Tuning: Five-Minute Rule
    Question: which data to keep in memory?
        If a page is accessed n times per second, keeping it in memory saves
            \[ n \cdot \frac{\text{price-per-disk-drive}}{\text{accesses-per-second-per-disk}} \]
        Cost of keeping page in memory:
            \[ \frac{\text{price-per-MB-of-memory}}{\text{pages-per-MB-of-memory}} \]
        Break-even point: value of n for which the above costs are equal.
            If accesses are more frequent than that, the saving is greater than the cost.
    Solving the above equation with current disk and memory prices leads to:
        5-minute rule: if a page that is randomly accessed is used more frequently than once in 5 minutes, it should be kept in memory (by buying sufficient memory!)
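
The break-even computation can be sketched as below. Setting the saving equal to the cost and solving for n gives the break-even access rate; its reciprocal is the break-even interval between accesses. The prices and page size in the example call are hypothetical round numbers picked so that the arithmetic is easy to follow. They are not taken from the slides and are not real market prices.

```python
def break_even_interval_seconds(
    price_per_disk: float,
    accesses_per_sec_per_disk: float,
    price_per_mb_memory: float,
    pages_per_mb: float,
) -> float:
    """Interval between accesses at which caching a page costs as much as it saves.

    Pages accessed more often than once per returned interval are worth keeping in memory.
    """
    # Disk cost of sustaining one access per second.
    cost_per_access_per_sec = price_per_disk / accesses_per_sec_per_disk
    # Memory cost of holding one page.
    cost_per_page_in_memory = price_per_mb_memory / pages_per_mb
    # Break-even n: n * cost_per_access_per_sec == cost_per_page_in_memory.
    n_break_even = cost_per_page_in_memory / cost_per_access_per_sec
    return 1.0 / n_break_even


# HYPOTHETICAL inputs: a 300-unit disk doing 100 random accesses/s,
# memory at 1.28 units per MB, 8 KB pages (128 pages per MB).
interval = break_even_interval_seconds(300.0, 100.0, 1.28, 128.0)
print(f"break-even interval: {interval:.0f} s ({interval / 60:.1f} min)")
```

Because only the ratios of the prices enter the formula, scaling both disk and memory prices by the same factor leaves the interval unchanged.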

Hardware Tuning: One-Minute Rule
    For sequentially accessed data, more pages can be read per second. Assuming sequential reads of 1 MB of data at a time:
        1-minute rule: sequentially accessed data that is accessed once or more in a minute should be kept in memory.
    Prices of disk and memory have changed greatly over the years, but the ratios have not changed much.
        So the rules remain the 5-minute and 1-minute rules, not 1-hour or 1-second rules!

Hardware Tuning: Choice of RAID Level
    To use RAID 1 or RAID 5?
        Depends on the ratio of reads and writes.
            RAID 5 requires 2 block reads and 2 block writes to write out one data block.
    If an application requires r reads and w writes per second:
        RAID 1 requires r + 2w I/O operations per second.
        RAID 5 requires r + 4w I/O operations per second.
    For reasonably large r and w, this requires lots of disks to handle the workload.
        RAID 5 may require more disks than RAID 1 to handle the load!
        The apparent saving in number of disks by RAID 5 (by using parity, as opposed to the mirroring done by RAID 1) may be illusory!
    Rule of thumb: RAID 5 is fine when writes are rare and data is very large, but RAID 1 is preferable otherwise.
        If you need more disks to handle I/O load, just mirror them, since disk capacities these days are enormous!
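
A small sketch comparing the two formulas for a given workload. The per-disk rate of 100 random I/O operations per second comes from the Tuning of Hardware slide. The function names and the sample workload are hypothetical, and the disk count considers I/O load only, not storage capacity.

```python
import math


def raid1_ios_per_sec(r: int, w: int) -> int:
    """RAID 1: each write goes to both mirrors, so r + 2w."""
    return r + 2 * w


def raid5_ios_per_sec(r: int, w: int) -> int:
    """RAID 5: each write costs 2 block reads and 2 block writes, so r + 4w."""
    return r + 4 * w


def disks_for_load(ios_per_sec: int, iops_per_disk: int = 100) -> int:
    """Disks needed to sustain the I/O rate (capacity is NOT considered)."""
    return math.ceil(ios_per_sec / iops_per_disk)


# Hypothetical workload: 1000 reads/s and 500 writes/s.
r, w = 1000, 500
print("RAID 1:", raid1_ios_per_sec(r, w), "I/Os per second,",
      disks_for_load(raid1_ios_per_sec(r, w)), "disks")
print("RAID 5:", raid5_ios_per_sec(r, w), "I/Os per second,",
      disks_for_load(raid5_ios_per_sec(r, w)), "disks")
```

With w = 0 the two levels need the same number of I/O operations, which matches the rule of thumb that RAID 5 is acceptable when writes are rare.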