正在加载图片...
nationspaond antr retes thas can be used witp esis iye rat or e relations and attributes can be renamed accordingly. We can use Probability[A, B, RI(t, s)=oRA=LLAI(R)DAB OR A=sAJ(R)I a rename operator PIR(B1,.Bn)(R)that produces a relation JoR.A=s(Aj(R) identical to but with name R and attributes. in order. named This function computes the number of pairs of tuples from a re B1,... Bn. For example, we can rename COmments to Comments in lation R with the same value on an attribute B and with t(a as order to have the relation Ext Students as shown in Figure 3 the value of the attribute A in one of the pair's tuples and s(A Ext_Students: T(SMD, Name)(Students)E P[Comments(PComments) as the value in the other pair tuple divided by the number of tu- les in R with s a as the value of the attribute A. For example 3.4 The Recommend operator Probability[CourselD, SulD, StudentHistory(t, s)finds the similarity etween two courses based on the number of students (identified by Recommendations are based on comparisons(e.g rated by comparing their topics to a student 's interests, students are their SulD)that took both courses t and s(identified by their course mpared to a particular student based on their ratings in order to id CourselD)divided by the total number of students that took s find similar students, and so forth). In order to"build" recommen- for comparing two extended values, each one belonging to one of lations, we will have a library of c on les omparison functions that imple- the input tuples, using some metric, such as the Euclidean distance, ks of the form defined belo Pearson coefficient. etc. DEFINITION 3. A comparison function cf is a parameterized function that maps a tuple t to a single scalar value u by comparing it to another tuple s. Euclidean(E, A1,A2(t,s=2([]-jlA2) u=cfIP(t, s) i∈fE 135=1 where t and s are the two inputs to be compared and P denotes the parameters used in the function. This function compares two tuples t and s on their extended at- tribute E by summing up the squared differences of the values of For example, we may wish to apply a comparison function to attribute A, in all tuples i and j from the nested relations t(E ome part of two tuples, i.e to an attribute A of them; then, we and s[E], respectively, that join on Al. Then, we could, for in- would specify cf[](t, s). It could also be to a set of attributes for stance, use Euclidean(Comments, CourselD, RatingI(t, s)to compare multi-criteria recommendations [4]. We do not make any particular students based on their common ratings, where students t ands are ssumptions for the tuples t and s, e.g, that they should have the taken from Ext._Students ame number of attributes or that corresponding attributes should ave the same type, allowing in this way a wide spectrum of func- Comparisons of single values to extended values. We could defin tions to be defined, as the forthcoming examples will illum a comparison function that compares two values of different e. g, a single value to an extended value, i. e, to a relation E. For function that computes a relationship between its two inputs based example, we could define a function that tries to locate a single n either probabilistic approaches(e.g,[14 )or more traditional tem-to-item correlations. such as pearsons correlation Kendalls attribute B in the tuple where the value was found, i.e. tau, Euclidean distance, and Jaccard index, as well as user-defined Identify[A,E,B(t,s)=叫[Bif彐∈sEst.t4]=v[4 lomain-specific metrics. It could also be a prediction model learn For instance. if t is a course and s is a student from Ext Students offline(e.g, (81). We illustrate several possibilities below Identify[CourselD, Comments, Ratingl(t, s)returns the rating of Comparisons of string values. For example, we could consider dent s for course t based on the course id. In Section 3.6, we will a function that measures the Jaccard similarity between two string ee examples that illustrate how these functions can be used for values, each one belonging to one of the input tuples of the function generating recommendations and treated, for the purposes of comparison, as a bag of words, i.e Comparison functions compare one tuple to another tuple. It is Jaccard[4(t,s)=14】ns4 frequently desirable to compare one tuple to a set of tuples, e. g |t[4Us[4]l a student to a group of students. For this purpose, we define an Using this function, we could, for example, compare course de- aggregation comparison function scriptions using Jaccard[ DescriptionI(t, s), where t and s represent DEFINITION 4. An aggregation comparison function a is a pa- ourses, or compare instructors on the basis of their names with rameterized function that maps a tuple t to a single scalar value Jaccard[Namel(t, s), where t and s are two tuples in Instructors by comparing it to a set of tuples s Comparisons of numerical values. For example, we could use a iunction that measures the distance of two numbers. i.e.. u=aef, PI(t, S)=aPef(t, si)ls: E Sn) Essentially, a takes as parameter the comparison func Distance[A(t, s)=t[A-s[Al the tuple-to-tuple comparisons and generates a list of comparisor Such a function could be used to compare students based on their function values cf(t, s1),cf(t, sn), Vsi E S, which are then GPAs,i.e,Distance (GPAl(t, s), where t and s belong to Students, combined through a into a single value or to compare courses based on their units, i.e., Distance[ Units(t, s) An aggregation comparison function combines all partial values where t and s represent courses. into a final one using a function, such as the maximum, the average. Using conditional probabilities. An alternate way of computing and so forth, and essentially signifies the relationship of tuple t the similarity between two tuples is to use a measure that is based to the set of tuples s. P denotes other parameters to be used in of th the function. For example, we may wish to compute the weig that the other tuple has already been observed. For example, a sim- average of the partial comparison values using as weights the ple comparison function that computes conditional probability is of attribute A of the tuples in Snot impose any real constraints on its expressivity or on the re￾lations and attributes that can be used with this operator because relations and attributes can be renamed accordingly. We can use a rename operator ρ[R 0 (B1, ...Bn)](R) that produces a relation identical to R but with name R 0 and attributes, in order, named B1, ... Bn. For example, we can rename PComments to Comments in order to have the relation Ext_Students as shown in Figure 3: Ext_Students := π{SuID,Name}(Students) ε ρ[Comments(PComments)] 3.4 The Recommend Operator Recommendations are based on comparisons (e.g., courses are rated by comparing their topics to a student’s interests, students are compared to a particular student based on their ratings in order to find similar students, and so forth). In order to “build” recommen￾dations, we will have a library of comparison functions that imple￾ment common recommendation tasks of the form defined below. DEFINITION 3. A comparison function cf is a parameterized function that maps a tuple t to a single scalar value u by comparing it to another tuple s. u = cf[P](t, s) where t and s are the two inputs to be compared and P denotes the parameters used in the function. For example, we may wish to apply a comparison function to some part of two tuples, i.e., to an attribute A of them; then, we would specify cf[A](t, s). It could also be to a set of attributes for multi-criteria recommendations [4]. We do not make any particular assumptions for the tuples t and s, e.g., that they should have the same number of attributes or that corresponding attributes should have the same type, allowing in this way a wide spectrum of func￾tions to be defined, as the forthcoming examples will illuminate. Also, we do not assume any properties for cf. It could be any function that computes a relationship between its two inputs based on either probabilistic approaches (e.g., [14]) or more traditional item-to-item correlations, such as Pearson’s correlation, Kendall’s tau, Euclidean distance, and Jaccard index, as well as user-defined, domain-specific metrics. It could also be a prediction model learnt offline (e.g., [8]). We illustrate several possibilities below. • Comparisons of string values. For example, we could consider a function that measures the Jaccard similarity between two string values, each one belonging to one of the input tuples of the function and treated, for the purposes of comparison, as a bag of words, i.e.: Jaccard[A](t, s) = |t[A] T s[A]| |t[A] S s[A]| Using this function, we could, for example, compare course de￾scriptions using Jaccard[Description](t, s), where t and s represent courses, or compare instructors on the basis of their names with Jaccard[Name](t, s), where t and s are two tuples in Instructors. • Comparisons of numerical values. For example, we could use a function that measures the distance of two numbers, i.e.: Distance[A](t, s) = t[A] − s[A] Such a function could be used to compare students based on their GPAs, i.e., Distance[GPA](t, s), where t and s belong to Students, or to compare courses based on their units, i.e., Distance[Units](t, s), where t and s represent courses. • Using conditional probabilities. An alternate way of computing the similarity between two tuples is to use a measure that is based on the conditional probability of observing one of the tuples given that the other tuple has already been observed. For example, a sim￾ple comparison function that computes conditional probability is: Probability[A, B, R](t, s) = |σR.A=t[A] (R) ./B σR.A=s[A] (R)| |σR.A=s[A] (R)| This function computes the number of pairs of tuples from a re￾lation R with the same value on an attribute B and with t[A] as the value of the attribute A in one of the pair’s tuples and s[A] as the value in the other pair tuple divided by the number of tu￾ples in R with s[A] as the value of the attribute A. For example, Probability[CourseID, SuID, StudentHistory](t, s) finds the similarity between two courses based on the number of students (identified by their SuID) that took both courses t and s (identified by their course id CourseID) divided by the total number of students that took s. • Comparisons of extended values. We could also define functions for comparing two extended values, each one belonging to one of the input tuples, using some metric, such as the Euclidean distance, Pearson coefficient, etc. For example, a comparison function using the Euclidean distance is: Euclidean[E, A1, A2](t, s) = vuuuut X i∈t[E] j∈s[E] i[A1]=j[A1] (i[A2] − j[A2])2 This function compares two tuples t and s on their extended at￾tribute E by summing up the squared differences of the values of attribute A2 in all tuples i and j from the nested relations t[E] and s[E], respectively, that join on A1. Then, we could, for in￾stance, use Euclidean[Comments, CourseID, Rating](t, s) to compare students based on their common ratings, where students t and s are taken from Ext_Students. • Comparisons of single values to extended values. We could define a comparison function that compares two values of different types, e.g., a single value to an extended value, i.e., to a relation E. For example, we could define a function that tries to locate a single attribute value within E, and if it finds it, it returns the value of an attribute B in the tuple where the value was found, i.e.: Identify[A, E, B](t, s) = v[B] if ∃v ∈ s[E] s.t. t[A] = v[A] For instance, if t is a course and s is a student from Ext_Students, Identify[CourseID, Comments, Rating](t, s) returns the rating of stu￾dent s for course t based on the course id. In Section 3.6, we will see examples that illustrate how these functions can be used for generating recommendations. Comparison functions compare one tuple to another tuple. It is frequently desirable to compare one tuple to a set of tuples, e.g., a student to a group of students. For this purpose, we define an aggregation comparison function. DEFINITION 4. An aggregation comparison function a is a pa￾rameterized function that maps a tuple t to a single scalar value u by comparing it to a set of tuples S: u = a[cf, P](t, S) = a[P]({cf(t, si)|si ∈ S}) Essentially, a takes as parameter the comparison function to use for the tuple-to-tuple comparisons and generates a list of comparison function values cf(t, s1), ...cf(t, sn), ∀si ∈ S, which are then combined through a into a single value. An aggregation comparison function combines all partial values into a final one using a function, such as the maximum, the average, and so forth, and essentially signifies the relationship of tuple t to the set of tuples S. P denotes other parameters to be used in the function. For example, we may wish to compute the weighted average of the partial comparison values using as weights the values of attribute A of the tuples in S:
<<向上翻页向下翻页>>
©2008-现在 cucdc.com 高等教育资讯网 版权所有