Chapter 13: Query Optimization
  Introduction
  Transformation of Relational Expressions
  Catalog Information for Cost Estimation
  Statistical Information for Cost Estimation
  Cost-based optimization
  Dynamic Programming for Choosing Evaluation Plans
  Materialized views
Database System Concepts, 6th Edition ©Silberschatz, Korth and Sudarshan
Introduction
  Alternative ways of evaluating a given query
    Equivalent expressions
    Different algorithms for each operation
  [Figure: two equivalent expression trees over instructor, teaches, and course, each with a projection on name, title and a selection dept_name = Music.]
Introduction (Cont.)
  An evaluation plan defines exactly what algorithm is used for each operation, and how the execution of the operations is coordinated.
  [Figure: an annotated evaluation plan over instructor, teaches, and course. The selection dept_name = Music on instructor is annotated "use index 1", the selection year = 2009 on teaches is annotated "use linear scan", the joins are annotated "merge join" and "hash join" with pipelined edges, and the final projection is annotated "sort to remove duplicates".]
  Find out how to view query execution plans on your favorite database
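As one way to follow the suggestion above, the sketch below asks SQLite for its plan through Python's standard sqlite3 module. The two-table schema, the index name instructor_dept, and the query are simplified assumptions made up for illustration; they are not the textbook's full schema. The printed plan text depends on the SQLite version, so no expected output is shown.

```python
import sqlite3

# In-memory database with a simplified, hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instructor (ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT);
    CREATE TABLE teaches (ID TEXT, course_id TEXT, year INTEGER);
    CREATE INDEX instructor_dept ON instructor (dept_name);
""")

query = """
    SELECT DISTINCT name, course_id
    FROM instructor JOIN teaches ON instructor.ID = teaches.ID
    WHERE dept_name = 'Music' AND year = 2009
"""

# EXPLAIN QUERY PLAN returns one row per plan step; the last column
# of each row is a human-readable description of that step.
plan_rows = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
for row in plan_rows:
    print(row[-1])
```

Other systems expose plans through their own commands, for example EXPLAIN in PostgreSQL and MySQL.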
Introduction (Cont.)
  Cost difference between evaluation plans for a query can be enormous
    E.g., seconds vs. days in some cases
  Steps in cost-based query optimization
    1. Generate logically equivalent expressions using equivalence rules
    2. Annotate resultant expressions to get alternative query plans
    3. Choose the cheapest plan based on estimated cost
  Estimation of plan cost based on:
    Statistical information about relations. Examples: number of tuples, number of distinct values for an attribute
    Statistics estimation for intermediate results, to compute cost of complex expressions
    Cost formulae for algorithms, computed using statistics
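The steps above can be sketched on a toy scale. The example below covers only steps 1 and 3: it enumerates left-deep join orders for three relations and picks the one with the lowest estimated cost. Step 2 (annotating with algorithms) is left out. The tuple counts, the join selectivities, and the cost measure (sum of estimated intermediate result sizes) are all assumptions invented for illustration, not the cost formulae from the slides.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical statistics: number of tuples per relation.
n_tuples = {"instructor": 50, "teaches": 1000, "course": 200}

# Hypothetical selectivity of each join condition; pairs that are not
# listed share no attribute, so joining them is a Cartesian product.
join_selectivity = {
    frozenset({"instructor", "teaches"}): Fraction(1, 50),
    frozenset({"teaches", "course"}): Fraction(1, 200),
}


def estimated_cost(order):
    """Toy cost of a left-deep join order: sum of intermediate result sizes."""
    joined = [order[0]]
    size = Fraction(n_tuples[order[0]])
    cost = Fraction(0)
    for rel in order[1:]:
        size *= n_tuples[rel]
        for other in joined:
            size *= join_selectivity.get(frozenset({rel, other}), 1)
        cost += size
        joined.append(rel)
    return cost


# Step 1 (restricted to join orders): enumerate equivalent expressions.
candidates = list(permutations(n_tuples))
# Step 3: estimate the cost of each candidate and keep the cheapest.
costs = {order: estimated_cost(order) for order in candidates}
best = min(costs, key=costs.get)

for order, cost in sorted(costs.items(), key=lambda item: item[1]):
    print(" JOIN ".join(order), "estimated cost:", cost)
print("chosen:", " JOIN ".join(best))
```

With these made-up numbers, the orders that start by joining instructor with course are the expensive ones, because that pair has no join condition and the intermediate result is a full Cartesian product.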
Generating Equivalent Expressions
Database System Concepts, 6th Ed. ©Silberschatz, Korth and Sudarshan
See www.db-book.com for conditions on re-use
Transformation of Relational Expressions
  Two relational algebra expressions are said to be equivalent if the two expressions generate the same set of tuples on every legal database instance
    Note: order of tuples is irrelevant
    We don't care if they generate different results on databases that violate integrity constraints
  In SQL, inputs and outputs are multisets of tuples
    Two expressions in the multiset version of the relational algebra are said to be equivalent if the two expressions generate the same multiset of tuples on every legal database instance.
  An equivalence rule says that expressions of two forms are equivalent
    Can replace expression of first form by second, or vice versa
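The difference between the set and multiset definitions can be made concrete with a small comparison. In the sketch below the three result lists stand for the outputs of hypothetical expressions on one database instance; the data and the helper names same_set and same_multiset are made up for illustration.

```python
from collections import Counter


def same_set(rows_a, rows_b):
    """Set semantics: the same tuples, ignoring order and duplicates."""
    return set(rows_a) == set(rows_b)


def same_multiset(rows_a, rows_b):
    """Multiset semantics: the same tuples with the same multiplicities."""
    return Counter(rows_a) == Counter(rows_b)


# Outputs of three hypothetical expressions on one instance.
result_1 = [("Music",), ("Music",), ("Physics",)]
result_2 = [("Physics",), ("Music",), ("Music",)]   # same tuples, other order
result_3 = [("Music",), ("Physics",)]               # duplicates removed

order_is_irrelevant = same_multiset(result_1, result_2)
equal_as_sets = same_set(result_1, result_3)
equal_as_multisets = same_multiset(result_1, result_3)

print(order_is_irrelevant, equal_as_sets, equal_as_multisets)
```

Agreement on one instance does not establish equivalence, since the definition quantifies over every legal instance. A single disagreeing instance, however, is enough to show that two expressions are not equivalent.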
Equivalence Rules
  1. Conjunctive selection operations can be deconstructed into a sequence of individual selections.
       $\sigma_{\theta_1 \wedge \theta_2}(E) = \sigma_{\theta_1}(\sigma_{\theta_2}(E))$
  2. Selection operations are commutative.
       $\sigma_{\theta_1}(\sigma_{\theta_2}(E)) = \sigma_{\theta_2}(\sigma_{\theta_1}(E))$
  3. Only the last in a sequence of projection operations is needed, the others can be omitted.
       $\Pi_{L_1}(\Pi_{L_2}(\ldots(\Pi_{L_n}(E))\ldots)) = \Pi_{L_1}(E)$
  4. Selections can be combined with Cartesian products and theta joins.
       a. $\sigma_{\theta}(E_1 \times E_2) = E_1 \bowtie_{\theta} E_2$
       b. $\sigma_{\theta_1}(E_1 \bowtie_{\theta_2} E_2) = E_1 \bowtie_{\theta_1 \wedge \theta_2} E_2$
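Rules 1 to 4 can be spot-checked on a small instance. The sketch below models tuples as Python dicts and compares results as multisets; the relations E1 and E2, the predicates, and every helper name are assumptions made up for illustration. Passing on one instance is a sanity check, not a proof of the rules.

```python
from collections import Counter


def bag(rows):
    """Multiset of tuples; attribute order inside a tuple is ignored."""
    return Counter(frozenset(t.items()) for t in rows)


def select(pred, rows):
    return [t for t in rows if pred(t)]


def project(attrs, rows):
    # Multiset projection: duplicates are kept.
    return [{a: t[a] for a in attrs} for t in rows]


def product(r, s):
    # Assumes r and s have disjoint attribute names.
    return [{**t, **u} for t in r for u in s]


def theta_join(pred, r, s):
    return [{**t, **u} for t in r for u in s if pred({**t, **u})]


# Hypothetical instance; E1 contains a duplicate tuple on purpose.
E1 = [{"a": 1, "b": 10}, {"a": 2, "b": 20}, {"a": 2, "b": 20}]
E2 = [{"c": 1, "d": 5}, {"c": 2, "d": 7}]

theta1 = lambda t: t["a"] == 2
theta2 = lambda t: t["b"] > 5
theta_ac = lambda t: t["a"] == t["c"]

checks = {
    "rule 1": bag(select(lambda t: theta1(t) and theta2(t), E1))
    == bag(select(theta1, select(theta2, E1))),
    "rule 2": bag(select(theta1, select(theta2, E1)))
    == bag(select(theta2, select(theta1, E1))),
    "rule 3": bag(project(["a"], project(["a", "b"], E1)))
    == bag(project(["a"], E1)),
    "rule 4a": bag(select(theta_ac, product(E1, E2)))
    == bag(theta_join(theta_ac, E1, E2)),
    "rule 4b": bag(select(theta1, theta_join(theta_ac, E1, E2)))
    == bag(theta_join(lambda t: theta1(t) and theta_ac(t), E1, E2)),
}

for rule, holds in checks.items():
    print(rule, "holds on this instance:", holds)
```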
Equivalence Rules (Cont.)
  5. Theta-join operations (and natural joins) are commutative.
       $E_1 \bowtie_{\theta} E_2 = E_2 \bowtie_{\theta} E_1$
  6. (a) Natural join operations are associative:
       $(E_1 \bowtie E_2) \bowtie E_3 = E_1 \bowtie (E_2 \bowtie E_3)$
     (b) Theta joins are associative in the following manner:
       $(E_1 \bowtie_{\theta_1} E_2) \bowtie_{\theta_2 \wedge \theta_3} E_3 = E_1 \bowtie_{\theta_1 \wedge \theta_3} (E_2 \bowtie_{\theta_2} E_3)$
       where $\theta_2$ involves attributes from only $E_2$ and $E_3$.
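Commutativity and associativity can be spot-checked for natural joins in the same spirit. In the sketch below tuples are dicts, so attribute order does not matter when results are compared. The three small relations and the helper names are made up for illustration, and rule 6b is not covered. The last two lines also show why the choice among equivalent forms matters: joining instructor with course first yields a Cartesian product, since they share no attribute.

```python
from collections import Counter


def bag(rows):
    """Multiset of tuples; attribute order inside a tuple is ignored."""
    return Counter(frozenset(t.items()) for t in rows)


def natural_join(r, s):
    """Join on all attributes the two tuples have in common."""
    out = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()
            if all(t[a] == u[a] for a in common):
                out.append({**t, **u})
    return out


# Hypothetical instance, loosely modeled on the university schema.
instructor = [{"ID": "1", "name": "Mozart"}, {"ID": "2", "name": "Einstein"}]
teaches = [
    {"ID": "1", "course_id": "MU-199"},
    {"ID": "2", "course_id": "PHY-101"},
    {"ID": "1", "course_id": "PHY-101"},
]
course = [
    {"course_id": "MU-199", "title": "Music Video Production"},
    {"course_id": "PHY-101", "title": "Physical Principles"},
]

# Rule 5, checked here for the natural join.
rule_5 = bag(natural_join(instructor, teaches)) == bag(natural_join(teaches, instructor))

# Rule 6a: both groupings give the same final result.
left = natural_join(natural_join(instructor, teaches), course)
right = natural_join(instructor, natural_join(teaches, course))
rule_6a = bag(left) == bag(right)

print("rule 5:", rule_5, "rule 6a:", rule_6a)
print("size of instructor JOIN teaches:", len(natural_join(instructor, teaches)))
print("size of instructor JOIN course:", len(natural_join(instructor, course)))
```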
Pictorial Depiction of Equivalence Rules
  [Figure: tree diagrams of Rule 5 (swapping E1 and E2 under a theta join), Rule 6a (re-associating the natural joins of E1, E2, and E3), and Rule 7a (pushing a selection below the join onto E1, if θ only has attributes from E1).]
Equivalence Rules (Cont.)
  7. The selection operation distributes over the theta join operation under the following two conditions:
     (a) When all the attributes in $\theta_0$ involve only the attributes of one of the expressions ($E_1$) being joined.
       $\sigma_{\theta_0}(E_1 \bowtie_{\theta} E_2) = (\sigma_{\theta_0}(E_1)) \bowtie_{\theta} E_2$
     (b) When $\theta_1$ involves only the attributes of $E_1$ and $\theta_2$ involves only the attributes of $E_2$.
       $\sigma_{\theta_1 \wedge \theta_2}(E_1 \bowtie_{\theta} E_2) = (\sigma_{\theta_1}(E_1)) \bowtie_{\theta} (\sigma_{\theta_2}(E_2))$
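Rule 7 is the basis for pushing selections below joins. The sketch below spot-checks both cases on a small made-up instance and counts how many tuple pairs the nested-loop join examines before and after the pushdown. The relations, attribute names, and helpers are assumptions for illustration, and the pair count is only a rough stand-in for cost.

```python
from collections import Counter


def bag(rows):
    """Multiset of tuples; attribute order inside a tuple is ignored."""
    return Counter(frozenset(t.items()) for t in rows)


def select(pred, rows):
    return [t for t in rows if pred(t)]


def theta_join(pred, r, s):
    # Assumes r and s have disjoint attribute names.
    return [{**t, **u} for t in r for u in s if pred({**t, **u})]


# Hypothetical instance.
E1 = [
    {"i_id": 1, "dept_name": "Music"},
    {"i_id": 2, "dept_name": "Physics"},
    {"i_id": 3, "dept_name": "Music"},
]
E2 = [
    {"t_id": 1, "year": 2009},
    {"t_id": 1, "year": 2010},
    {"t_id": 2, "year": 2009},
    {"t_id": 3, "year": 2009},
]

theta = lambda t: t["i_id"] == t["t_id"]        # join condition
theta1 = lambda t: t["dept_name"] == "Music"    # uses only E1's attributes
theta2 = lambda t: t["year"] == 2009            # uses only E2's attributes

# Rule 7a: a selection on E1's attributes can be applied before the join.
rule_7a = bag(select(theta1, theta_join(theta, E1, E2))) == bag(
    theta_join(theta, select(theta1, E1), E2)
)

# Rule 7b: each conjunct is pushed to the side whose attributes it uses.
late = select(lambda t: theta1(t) and theta2(t), theta_join(theta, E1, E2))
early = theta_join(theta, select(theta1, E1), select(theta2, E2))
rule_7b = bag(late) == bag(early)

# Tuple pairs examined by the nested-loop join in each form.
pairs_late = len(E1) * len(E2)
pairs_early = len(select(theta1, E1)) * len(select(theta2, E2))

print("rule 7a:", rule_7a, "rule 7b:", rule_7b)
print("pairs examined:", pairs_late, "without pushdown,", pairs_early, "with pushdown")
```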