Query 1:
    CREATE TEMPORARY TABLE temp
    SELECT t1.SuID, 1/SQRT(SUM((t1.Rating - t2.Rating) * (t1.Rating - t2.Rating))) AS score
    FROM Comments t1, Comments t2
    WHERE t1.CourseID = t2.CourseID AND t2.SuID = 444 AND t1.SuID <> 444
    GROUP BY t1.SuID

Query 2:
    CREATE TEMPORARY TABLE temp2
    SELECT t1.*, score
    FROM Comments t1, temp
    WHERE t1.SuID = temp.SuID;

Query 3:
    SELECT Courses.*, SUM(score*rating)/SUM(score) AS CScore
    FROM temp2, Courses
    WHERE temp2.CourseID = Courses.CourseID
    GROUP BY CourseID
    ORDER BY CScore

Query 4:
    SELECT Courses.*, SUM(Score)/1.7 AS BScore
    FROM ((SELECT CourseID, 0.7*CScore AS Score FROM HistoryRecs)
          UNION ALL
          (SELECT CourseID, 1.0*CScore AS Score FROM OtherRecs)) t1, Courses
    WHERE t1.CourseID = Courses.CourseID
    GROUP BY CourseID
    ORDER BY BScore

Figure 7: An example recommendation plan

The corresponding subtree contains select, project, and extend operators, which are all combined with the root recommend operator into one SQL query (Query 1 in Figure 7). This query has several parts (shown shaded in Query 1), each one mapping to one of the operators: (a) the select operators have been included as conditions in the WHERE clause, (b) the recommend operator, which compares students using the inverse Euclidean comparison function on their course ratings, is implemented by combining the aggregation functions that are supported by the underlying database, (c) the extend operator is implemented by a GROUP BY clause. Observe how, in this example, the query does not join the relations Students and Comments specified in the expression tree, because the system can recognize that all the attributes required by the extend and recommend operators can be provided by the latter relation.
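To make the mapping in Query 1 concrete, the following is a minimal sketch that runs an equivalent statement on SQLite through Python's sqlite3 module. It is an illustration under stated assumptions, not the system's actual code: the function name similar_students and the table name SimStudents (standing in for temp in Figure 7) are hypothetical, the schema Comments(SuID, CourseID, Rating) is inferred from the query text, and the table is created explicitly and filled with INSERT ... SELECT rather than with the MySQL-style CREATE ... SELECT shown in the figure.

```python
import math
import sqlite3


def similar_students(conn, target_id):
    """Materialize inverse Euclidean similarity of every student to target_id.

    Assumes a table Comments(SuID, CourseID, Rating). The result table
    SimStudents(SuID, score) is a hypothetical stand-in for `temp`.
    """
    # Not every SQLite build ships SQRT, so register it from Python.
    conn.create_function("SQRT", 1, math.sqrt)
    conn.execute("DROP TABLE IF EXISTS SimStudents")
    conn.execute("CREATE TEMPORARY TABLE SimStudents (SuID INTEGER, score REAL)")
    conn.execute(
        """
        INSERT INTO SimStudents
        SELECT t1.SuID,
               1 / SQRT(SUM((t1.Rating - t2.Rating) * (t1.Rating - t2.Rating)))
        FROM Comments t1, Comments t2
        WHERE t1.CourseID = t2.CourseID   -- only courses rated by both students
          AND t2.SuID = ?                 -- select: the target student
          AND t1.SuID <> ?                -- select: every other student
        GROUP BY t1.SuID                  -- extend: one group of ratings per student
        """,
        (target_id, target_id),
    )
    # Two attributes per student, as described in the text: id and score.
    return dict(conn.execute("SELECT SuID, score FROM SimStudents"))
```

Calling similar_students(conn, 444) on a connection that holds a Comments table returns a dictionary from student id to similarity score. Students who share no rated course with the target do not appear, and a student whose ratings are identical to the target's gets a NULL score in this sketch, because SQLite evaluates a division by zero to NULL.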
Query 1 creates a temporary in-memory table that contains two attributes for each student: the student id and a score.

The higher recommend operator computes course recommendations based on the ratings provided by the similar users found in the previous step, i.e., it makes use of the materialized results of the lower recommend operator. Queries 2 and 3 correspond to this operator and use results generated by lower operators. Since the result of the previous recommend operator has been generated by aggregating the student ratings, we need again to associate students with their ratings in order to compute course recommendations combining student scores and ratings. Therefore, Query 2 combines for each student the score and the ratings information into one relation. Then, Query 3 contains a hidden extend operator in its GROUP BY clause. The computations required by the recommend operator, which now uses a comparison function (Identify) and an aggregation comparison function (W_Avg), are again realized by leveraging the database's existing aggregation capabilities. The course recommendations may be ordered by their score for presenting them to the user.

Instead of having Query 2 generate an in-memory table and Query 3 then use this table, we could have generated a single, equivalent query. Splitting the computations into simpler queries and exploiting in-memory tables allows more efficient recommendation processing over the DB engine that we use.

The next example shows how the blend operator can be implemented on top of a database. For the sake of brevity, we only detail the parts of the recommendation plan that refer to this operator.

Example 5 (cont'd): We consider the expression tree corresponding to Example 5 in Figure 5. Recall that this tree corresponds to a workflow that combines recommendations based on a student's history with course recommendations from other students.
The recommendation plan for Example 3 is part of the plan for this example, with the only difference that the results of Query 3 in Figure 7 are materialized as an in-memory table (for the reasons explained earlier), and they are not ordered in this phase, since they are not the final results yet. Let us assume that this materialized table is named OtherRecs and that a second materialized table, HistoryRecs, holds the recommendations based on the student history (corresponding to Example 2 in Figure 5). Query 4 in Figure 7 shows how the blending method Wavg_Blend can be implemented with the help of SQL. Scores in OtherRecs are given a greater weight than scores in HistoryRecs. Note that all queries are built on the fly based on the workflow. Hence, the weights are not hard-coded; they are inputs of the blending method specified in the workflow. This query generates the final recommendations returned by the system, which may be additionally ordered for presentation purposes.

As we have seen in the examples above, for each recommend operator, the system builds a set of queries that implement the requested comparison function and group together any operators that are found in the subtree under this recommend operator. We support an extensible library of (aggregation) comparison functions. Interestingly, a large number of functions, such as Pearson, Jaccard, Cosine, W_Avg, and Avg, can be implemented by computing and combining aggregations supported by the underlying database (as, e.g., in Query 1 in Figure 7). In such cases, the part of the recommendation plan that corresponds to the recommend operator comprises mainly aggregate queries. The standard query operators, such as selections and projections, are mapped to appropriate SQL clauses, which are inserted into these queries.
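As one instance of a comparison function realized purely with database aggregations, the W_Avg step of Queries 2 and 3 can be sketched as follows, again on SQLite via Python's sqlite3 module. This is a hedged illustration rather than the system's implementation: it assumes that a table SimStudents(SuID, score) already holds the output of Query 1 (the table called temp in Figure 7), that Courses has a Title column, and that results are wanted best-first. The names weighted_course_scores, SimStudents, SimRatings, and Title are hypothetical.

```python
import sqlite3


def weighted_course_scores(conn):
    """Run sketches of Queries 2 and 3: a similarity-weighted average rating.

    Assumes tables Comments(SuID, CourseID, Rating), Courses(CourseID, Title)
    and SimStudents(SuID, score); the last one is a hypothetical name for the
    materialized output of Query 1.
    """
    conn.execute("DROP TABLE IF EXISTS SimRatings")
    # Query 2: re-attach each similar student's ratings to that student's score.
    conn.execute(
        """
        CREATE TEMPORARY TABLE SimRatings AS
        SELECT t1.SuID, t1.CourseID, t1.Rating, s.score
        FROM Comments t1, SimStudents s
        WHERE t1.SuID = s.SuID
        """
    )
    # Query 3: W_Avg expressed as a ratio of two SUM aggregates per course.
    rows = conn.execute(
        """
        SELECT c.CourseID, c.Title,
               SUM(r.score * r.Rating) / SUM(r.score) AS CScore
        FROM SimRatings r, Courses c
        WHERE r.CourseID = c.CourseID
        GROUP BY c.CourseID, c.Title   -- the hidden extend: ratings grouped per course
        ORDER BY CScore DESC           -- assumption: best recommendations first
        """
    )
    return rows.fetchall()
```

The function returns one (CourseID, Title, CScore) row per course rated by at least one similar student. Other aggregation-based functions named in the text would follow the same pattern with different SUM or COUNT expressions.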
Whenever there is an extend operator, no extended relation is actually materialized in memory because that would require fetching tuples (e.g., students) in memory, augmenting them with a new attribute, and executing a (possibly large) number of queries in order to populate this attribute with joining tuples (e.g., courses) from another relation. To save unnecessary I/O operations and tuple processing, the joins implied by an extend operator are executed only when tuples are actually requested by some operator higher in the expression tree (typically a recommend operator), and the information related to a single entity is grouped together using SQL. For instance, Query 1, which performs the necessary aggregations required by the respective recommend operator, also realizes the extend operator that is in the subtree of the recommend operator as a GROUP BY clause.

In addition, as we have seen, if there is more than one recommend operator or if a recommend operator is followed by a blend in an expression tree, then a separate set of queries is built for each of these operators. The results generated by the intermediate recommend operators are materialized in order to avoid building complicated, possibly nested, queries and to reuse results of earlier computations. The output of an intermediate recommend operator is an in-memory relation with two attributes: a tuple id and a score.

Blend operators are treated in a similar way as recommend operators. A set of queries is built that implement the requested blending method and also combine any standard operators, such as selections and projections, that are found in the subtree under a blend operator. We support an extensible set of blending methods. Again, we can leverage the expressivity of SQL joins and aggregations to implement several methods, as illustrated in Query 4. Finally, results of blend operators that are internal nodes in an expression tree are materialized as in the case of the recommend operators.
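Since the blend query is built on the fly with the weights as inputs, one way to picture its generation is the sketch below, which assembles a statement in the style of Query 4 for any number of materialized inputs. It is only an assumption about how such a generator could look: the function build_wavg_blend and its (table name, weight) input format are hypothetical, the join with Courses from Query 4 is omitted for brevity, the result is ordered best-first, and the statement is written in the SQLite dialect used through Python's sqlite3 module.

```python
import sqlite3


def build_wavg_blend(inputs):
    """Build (sql, params) for a weighted blend of materialized score tables.

    `inputs` is a list of (table_name, weight) pairs; each table is assumed
    to have the columns CourseID and CScore, as HistoryRecs and OtherRecs do.
    """
    params = {}
    branches = []
    for i, (table, weight) in enumerate(inputs):
        # Table names cannot be bound as parameters, so validate them instead.
        if not table.isidentifier():
            raise ValueError(f"unsafe table name: {table!r}")
        params[f"w{i}"] = float(weight)
        branches.append(f"SELECT CourseID, :w{i} * CScore AS Score FROM {table}")
    # Normalizer: the sum of the weights (1.7 for weights 0.7 and 1.0).
    params["total"] = float(sum(weight for _, weight in inputs))
    sql = (
        "SELECT t1.CourseID, SUM(t1.Score) / :total AS BScore "
        "FROM (" + " UNION ALL ".join(branches) + ") t1 "
        "GROUP BY t1.CourseID ORDER BY BScore DESC"
    )
    return sql, params
```

For example, build_wavg_blend([("HistoryRecs", 0.7), ("OtherRecs", 1.0)]) yields a statement that can be passed to conn.execute(sql, params). As in Query 4, the sum is always divided by the total weight, so a course that appears in only one input receives a proportionally lower blended score.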