Definition 1 (User Profile). A user profile $u$ is a set of binary tuples $\{\langle t_1, w_1\rangle, \ldots, \langle t_n, w_n\rangle\}$, where the $t_i$ are the terms that describe the user and $w_i$ denotes the importance of $t_i$ in describing the user. We use $terms(u)$ to denote the set of terms $t_i$ in the profile $u$.

Cosine Similarity: The BOW representation is typically used for computing the cosine similarity between user profiles. If the vector representation of a user profile $u_j$ is $\vec{V}(u_j)$ and the Euclidean length $|\vec{V}(u_j)|$ of an entity $u_j$ is $\sqrt{\sum_{i=1}^{n} w_i^2}$, the similarity of the entities $u_j$ and $u_k$ is

$$sim_{cos}(u_j, u_k) = \cos(\vec{V}(u_j), \vec{V}(u_k)) = \frac{\vec{V}(u_j) \cdot \vec{V}(u_k)}{|\vec{V}(u_j)|\,|\vec{V}(u_k)|} \tag{1}$$

Spreading: Spreading is the process of including the terms that are related to the original terms in a user profile by referring to an ontology.
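Equation (1) can be sketched over sparse profiles as follows. This is a minimal illustration, not the authors' implementation: it assumes a profile is stored as a Python dict mapping each term to its weight, and the function name `cosine_similarity` and the zero-length convention are choices made here. The sample profiles are the ones used in Example 1.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two profiles given as {term: weight} dicts."""
    # Only terms present in both profiles contribute to the dot product.
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption made for this sketch: an empty profile has similarity 0.
        return 0.0
    return dot / (norm_u * norm_v)


# Profiles of Example 1 before spreading: no shared term.
u1 = {"google": 1.0}
u2 = {"yahoo": 2.0}
print(cosine_similarity(u1, u2))  # 0.0

# Profiles of Example 1 after spreading: one shared term.
u1_ext = {"google": 1.0, "internet search engines": 0.5}
u2_ext = {"yahoo": 2.0, "internet search engines": 1.0}
print(round(cosine_similarity(u1_ext, u2_ext), 6))  # 0.2
```

For the extended profiles the dot product is $0.5 \times 1.0 = 0.5$ and the product of the lengths is $\sqrt{1.25} \cdot \sqrt{5} = 2.5$, which gives the value 0.2 quoted in Example 1.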
Let us study the earlier mentioned simple example of two users having google and yahoo in their profiles in detail to understand the spreading process better.

Example 1. Consider computing the similarity of the following users:
– $u_1 = \{\langle google, 1.0\rangle\}$, and
– $u_2 = \{\langle yahoo, 2.0\rangle\}$.

A simple intersection check between the profiles results in an empty set (i.e. $u_1 \cap u_2 = \emptyset$), indicating their unrelatedness (cosine similarity is 0). However, if we were to judge the similarity of these two users manually, we would give it a value greater than 0. This is because we judge the similarity not just by considering the two terms from the profiles but also by considering the relationships that might exist between them due to our prior knowledge: we are able to establish that both google and yahoo are search engine providers.

Now let us see the effectiveness of spreading in the similarity computation for the same example. Spreading the profiles $u_1$ and $u_2$ by referring to the Wikipedia parent category relationship extends the profiles to
– $u'_1 = \{\langle google, 1.0\rangle, \langle internet\ search\ engines, 0.5\rangle\}$, and
– $u'_2 = \{\langle yahoo, 2.0\rangle, \langle internet\ search\ engines, 1.0\rangle\}$.

The simple intersection check now results in a non-empty set (i.e. $u'_1 \cap u'_2 \neq \emptyset$), indicating their relatedness (cosine similarity is 0.2). The result of the spreading process (i.e. the inclusion of the related term internet search engines) makes sure that any relationship that exists between the profiles is taken into consideration.

4 Spreading to Create Extended User Profiles

In this section, we describe two techniques to compute and represent the extended user profiles (see the example in Section 3) using an ontology. An ontology $O$ represents human knowledge about a certain domain as concepts, attributes and relationships between concepts in a well-defined hierarchy. It is usually represented as a graph where the nodes are the concepts and the edges are the relationships, labelled with the type of relationship.
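As a rough illustration of such a graph and of the spreading step in Example 1, the sketch below stores a toy ontology as an adjacency map with labelled edges and extends a profile with the concepts reachable in one step. Everything named here is hypothetical: the `ONTOLOGY` entries are hand-written rather than taken from Wikipedia, the helper names `related` and `spread` are chosen for this sketch, and the factor `DECAY = 0.5` is an assumption picked only because it reproduces the weights shown in Example 1. The text above does not state how those weights are derived.

```python
# Toy ontology: concept -> list of (relationship type, related concept).
# Illustrative entries only; not real Wikipedia category data.
ONTOLOGY = {
    "google": [("parent_category", "internet search engines")],
    "yahoo": [("parent_category", "internet search engines")],
}

# Hypothetical weighting factor; chosen to match the numbers in Example 1.
DECAY = 0.5


def related(term, ontology=ONTOLOGY):
    """Concepts directly linked to `term` in the ontology graph."""
    return [concept for _relation, concept in ontology.get(term, [])]


def spread(profile, ontology=ONTOLOGY, decay=DECAY):
    """Return a copy of `profile` extended with one-step related concepts."""
    extended = dict(profile)
    for term, weight in profile.items():
        for concept in related(term, ontology):
            # Assumption: contributions from several terms are summed.
            extended[concept] = extended.get(concept, 0.0) + decay * weight
    return extended


u1_ext = spread({"google": 1.0})
u2_ext = spread({"yahoo": 2.0})
shared = set(u1_ext) & set(u2_ext)

print(u1_ext)  # {'google': 1.0, 'internet search engines': 0.5}
print(u2_ext)  # {'yahoo': 2.0, 'internet search engines': 1.0}
print(shared)  # {'internet search engines'}
```

The original profiles share no term, while the extended ones share the parent category, which is what makes their cosine similarity non-zero.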
For the purpose of profile spreading, we assume that all the terms $t_i$ describing an entity are mappable to concepts in a reference ontology. For example, all the terms $t_i$ in a BOW representation of a user profile map to concepts in the WordNet ontology. Given a term $t_i$, the spreading process utilizes $O$ to determine the terms that are related to $t_i$ (denoted as $related_O(t_i)$). Although spreading the profiles with related terms allows for