5. Distributed Query Processing
Chapter 7: Overview of Query Processing
Chapter 8: Query Decomposition and Data Localization
Outline
Overview of Query Processing
Query Decomposition and Localization
Query Processing
High-level user query → Query Processor → Low-level data manipulation commands
Query Processing Components
Query language that is used: SQL (Structured Query Language)
Query execution methodology: the steps that the system goes through in executing high-level (declarative) user queries
Query optimization: how to determine the "best" execution plan?
Query Language
Tuple calculus: $\{\, t \mid F(t) \,\}$, where $t$ is a tuple variable and $F(t)$ is a well-formed formula.
Example: Get the numbers and names of all managers.
$\{\, t(\mathrm{ENO}, \mathrm{ENAME}) \mid t \in \mathrm{EMP} \wedge t(\mathrm{TITLE}) = \text{"MANAGER"} \,\}$
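The tuple calculus expression above maps closely onto a set comprehension. The following is a minimal sketch, assuming a small made-up EMP relation whose rows and names are purely illustrative and do not come from the slides:

```python
# Hypothetical sample EMP relation; each tuple is a dict keyed by attribute name.
EMP = [
    {"ENO": "E1", "ENAME": "J. Doe", "TITLE": "MANAGER"},
    {"ENO": "E2", "ENAME": "M. Smith", "TITLE": "PROGRAMMER"},
    {"ENO": "E3", "ENAME": "A. Lee", "TITLE": "MANAGER"},
]

# { t(ENO, ENAME) | t in EMP and t(TITLE) = "MANAGER" }
# t ranges over whole tuples; the formula F(t) is the condition after "if".
managers = {(t["ENO"], t["ENAME"]) for t in EMP if t["TITLE"] == "MANAGER"}

print(sorted(managers))
```

The tuple variable `t` ranges over entire rows, and attributes are addressed by name, which is the defining trait of tuple calculus.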
Query Language (cont.)
Domain calculus: $\{\, x_1, x_2, \ldots, x_n \mid F(x_1, x_2, \ldots, x_n) \,\}$, where $x_i$ is a domain variable and $F(x_1, x_2, \ldots, x_n)$ is a well-formed formula.
Example: $\{\, x, y \mid \mathrm{EMP}(x, y, \text{"Manager"}) \,\}$
Variables are position sensitive!
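To illustrate what "position sensitive" means, here is a small sketch that treats EMP as a list of positional tuples (ENO, ENAME, TITLE). The sample rows are assumptions made for the example only:

```python
# Hypothetical EMP relation as positional tuples: (ENO, ENAME, TITLE).
EMP = [
    ("E1", "J. Doe", "Manager"),
    ("E2", "M. Smith", "Programmer"),
    ("E3", "A. Lee", "Manager"),
]

# { x, y | EMP(x, y, "Manager") }: x binds to position 1, y to position 2.
result = {(x, y) for (x, y, title) in EMP if title == "Manager"}

# Binding the same variables to swapped positions yields a different answer,
# because domain variables get their meaning from where they appear.
swapped = {(x, y) for (y, x, title) in EMP if title == "Manager"}

print(sorted(result))
print(sorted(swapped))
```

Unlike the tuple calculus, each variable here stands for a single attribute value, so the argument order in `EMP(x, y, ...)` carries the meaning.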
Query Language (cont.)
SQL is a tuple calculus language.
    SELECT ENO, ENAME
    FROM EMP
    WHERE TITLE = 'Programmer'
The end user uses non-procedural (declarative) languages to express queries.
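The query can be tried out with Python's built-in `sqlite3` module. This is only a sketch: the table definition and sample rows are assumptions, and an `ORDER BY` clause is added solely to make the output order deterministic.

```python
import sqlite3

# In-memory database with a hypothetical EMP table and sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (ENO TEXT PRIMARY KEY, ENAME TEXT, TITLE TEXT)")
conn.executemany(
    "INSERT INTO EMP VALUES (?, ?, ?)",
    [
        ("E1", "J. Doe", "Manager"),
        ("E2", "M. Smith", "Programmer"),
        ("E3", "A. Lee", "Programmer"),
    ],
)

# The declarative query states what is wanted, not how to retrieve it.
rows = conn.execute(
    "SELECT ENO, ENAME FROM EMP WHERE TITLE = 'Programmer' ORDER BY ENO"
).fetchall()
conn.close()

print(rows)
```

The query text says nothing about access paths or evaluation order; choosing those is left to the query processor.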
Query Processing Objectives & Problems
The query processor transforms queries into procedural operations to access data in an optimal way.
Calculus formula → Query Processor (Optimizer) → Relational algebra operations
A distributed query processor also has to deal with query decomposition and data localization.
Centralized Query Processing Alternatives
    SELECT ENAME
    FROM EMP E, ASG G
    WHERE E.ENO = G.ENO AND TITLE = 'manager'
Strategy 1: $\Pi_{\mathrm{ENAME}}\big(\sigma_{\mathrm{TITLE} = \text{"manager"} \,\wedge\, E.\mathrm{ENO} = G.\mathrm{ENO}}(E \times G)\big)$
Strategy 2: $\Pi_{\mathrm{ENAME}}\big(E \bowtie_{\mathrm{ENO}} \sigma_{\mathrm{TITLE} = \text{"manager"}}(G)\big)$
Strategy 2 avoids the Cartesian product, whose intermediate result has $|E| \cdot |G|$ tuples, so it is "better".
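The difference between the two strategies can be made concrete by counting intermediate tuples. The sketch below assumes toy relations E(ENO, ENAME) and G(ENO, TITLE); the schema, the data, and the function names `strategy1` and `strategy2` are illustrative assumptions, not part of the slides.

```python
from itertools import product

# Hypothetical relations: E(ENO, ENAME) and G(ENO, TITLE).
E = [("E1", "J. Doe"), ("E2", "M. Smith"), ("E3", "A. Lee")]
G = [("E1", "manager"), ("E2", "analyst"), ("E3", "manager"), ("E3", "analyst")]

def strategy1(E, G):
    """Cartesian product first, then selection, then projection on ENAME."""
    cart = list(product(E, G))  # |E| * |G| intermediate tuples
    selected = [(e, g) for e, g in cart if g[1] == "manager" and e[0] == g[0]]
    return {e[1] for e, _ in selected}, len(cart)

def strategy2(E, G):
    """Selection on G first, then join on ENO, then projection on ENAME."""
    managers = [g for g in G if g[1] == "manager"]  # reduced operand
    manager_enos = {g[0] for g in managers}
    joined = [e for e in E if e[0] in manager_enos]
    return {e[1] for e in joined}, len(managers)

names1, size1 = strategy1(E, G)
names2, size2 = strategy2(E, G)
print(names1 == names2, size1, size2)
```

Both strategies return the same names, but the first materializes every pair of tuples before filtering, while the second only carries the selected G tuples into the join.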
Distributed Query Processing
The query processor must consider the communication cost and select the best site.
The same query example, but relations G and E are fragmented and distributed:
$G_1 = \sigma_{\mathrm{ENO} \le \text{"E3"}}(G)$, $G_2 = \sigma_{\mathrm{ENO} > \text{"E3"}}(G)$
$E_1 = \sigma_{\mathrm{ENO} \le \text{"E3"}}(E)$, $E_2 = \sigma_{\mathrm{ENO} > \text{"E3"}}(E)$
The fragments are stored at different sites.
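A minimal sketch of this horizontal fragmentation follows, assuming toy relations E(ENO, ENAME) and G(ENO, TITLE) with made-up rows. The helper names `fragment` and `manager_names` are hypothetical, and communication cost is not modeled. Because E and G are fragmented on the same ENO predicate, joining matching fragment pairs and taking the union gives the same answer as the global query.

```python
# Hypothetical relations: E(ENO, ENAME) and G(ENO, TITLE).
E = [(f"E{i}", f"name{i}") for i in range(1, 7)]
G = [
    ("E1", "manager"), ("E2", "analyst"), ("E4", "manager"),
    ("E5", "programmer"), ("E6", "manager"),
]

def fragment(relation, predicate):
    """Horizontal fragmentation: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

def low(t):
    return t[0] <= "E3"   # sigma ENO <= "E3"

def high(t):
    return t[0] > "E3"    # sigma ENO > "E3"

E1, E2 = fragment(E, low), fragment(E, high)
G1, G2 = fragment(G, low), fragment(G, high)

def manager_names(e_frag, g_frag):
    """Names of employees in e_frag with a 'manager' tuple in g_frag."""
    manager_enos = {eno for eno, title in g_frag if title == "manager"}
    return {ename for eno, ename in e_frag if eno in manager_enos}

# Each relation is the union of its fragments.
reconstructed = sorted(E1 + E2) == sorted(E) and sorted(G1 + G2) == sorted(G)

# Fragment-wise evaluation versus evaluation over the whole relations.
local_answer = manager_names(E1, G1) | manager_names(E2, G2)
global_answer = manager_names(E, G)
print(reconstructed, local_answer == global_answer)
```

In a real distributed setting each fragment pair would live at different sites, so the choice of where to perform each join determines how many tuples must be shipped.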