E-business course reading: Keyword clustering for user interest profiling refinement within paper recommender systems

X. Tang, Q. Zeng / The Journal of Systems and Software 85 (2012) 87–101

Fig. 7. Process of generating explicit user interest profiles.

Definition 6 (Relevance strength). Based on the definitions of inner edge and cross edge, the relevance strength $RS(keyword_i, topic_t)$ of $keyword_i$ towards $topic_t$ can be calculated by

$$RS(keyword_i, topic_t) = \frac{IES(keyword_i)}{IES(keyword_i) + CES(keyword_i)},$$

which quantifies the relevance of $keyword_i$ to $topic_t$. From the definition of relevance strength we can determine that $RS(keyword_i, topic_t) \in (0, 1]$ and $IES(keyword_i) > 0$.

Definition 7 (Relevance factor). The relevance factor $RF(user_u, topic_t)$, which measures the relevance of $user_u$'s interest to $topic_t$, is given as

$$RF(user_u, topic_t) = \left( \frac{\overrightarrow{TGV} \cdot \overrightarrow{UTGV}}{\overrightarrow{TGV} \cdot \mathbf{1}} \right)^{1/\lambda}, \quad \lambda \in [1, +\infty),$$

in which $\overrightarrow{TGV}$ denotes a vector comprising the weights of all nodes in $TG(topic_t)$; $\overrightarrow{UTGV}$ denotes a vector listing the weights of all nodes in $UTG(topic_t)$ in the same order as $\overrightarrow{TGV}$; and $\lambda$ is an adjustable parameter. The setting of $\lambda$ depends on the experimental data and should be chosen to maximize the recommendation performance. We will discuss this in Section 6.

The relevance factor quantifies how pertinent the contents of a topic that a user accesses are to the most essential content of that topic, with the purpose of measuring the relevance between the user's interest and the topic.

Now, we present the algorithm of the explicit interest profiler.

Algorithm 2 (Explicit interest profiler). The explicit interest of $user_u$ in $topic_t$, $EI(topic_t, user_u)$, is computed by

$$EI(topic_t, user_u) = \frac{RF(user_u, topic_t) \cdot \sum_{paper_p \in topic_t} UIP(paper_p, user_u)}{\sum_{topic_{t_i} \in UTS(user_u)} RF(user_u, topic_{t_i}) \sum_{paper_{p_i} \in topic_{t_i}} UIP(paper_{p_i}, user_u)},$$

where $UTS(user_u)$ signifies the set of topics in which $user_u$ has explicit interests.

To some extent, the measurement of the relevance factor is similar to the feature selection techniques in text classification, which form the basis of that task. Well-developed feature selection techniques include document frequency (DF), information gain (IG), mutual information (MI), chi-square and expected cross entropy (ECE) (Sebastiani and Ricerche, 2002; Yang, 1995). Most of these functions try to capture the intuition that the most valuable terms for a certain categorization are those that are distributed most differently in the sets of positive and negative examples of the category (Sebastiani and Ricerche, 2002). Instead of employing these existing techniques, we propose a new measure because it is expressly designed for our automated classification method and it captures the tightness between different keywords well. The subsequent experiments confirm its good performance in efficiency and accuracy.

5.2. Profiling method for users' implicit interests

Many researchers have emphasized the importance of the novelty of the recommended items in recommender systems. Herlocker et al. (2004) asserted that new dimensions were needed for analyzing recommender systems that considered the "nonobviousness" of the recommendation: novelty and serendipity. Zhou et al. (2010) sought to gain in both diversity and accuracy of recommendations, and argued that "real value is found in the ability to suggest objects users would not readily discover for themselves, that is, in the novelty and diversity of recommendation". They employed two metrics for measuring recommendation diversity.
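For readers who want to check the arithmetic of Definitions 6 and 7 and Algorithm 2 from the preceding subsection, the following is a minimal Python sketch, not the authors' implementation. The function names, the dictionary-based inputs and the example values are illustrative assumptions. The sketch also assumes that the "1" in the denominator of RF is the all-ones vector (so the denominator is the sum of the TGV weights), that all weights are non-negative, and that UIP values are per-paper interest scores already computed elsewhere.

```python
import numpy as np


def relevance_strength(ies: float, ces: float) -> float:
    """RS = IES / (IES + CES), per Definition 6 (requires IES > 0, CES >= 0)."""
    if ies <= 0 or ces < 0:
        raise ValueError("expected IES > 0 and CES >= 0")
    return ies / (ies + ces)


def relevance_factor(tgv, utgv, lam: float = 1.0) -> float:
    """RF = ((TGV . UTGV) / (TGV . 1)) ** (1 / lam), per Definition 7.

    Assumption: '1' is the all-ones vector, so TGV . 1 is the sum of weights.
    """
    tgv = np.asarray(tgv, dtype=float)
    utgv = np.asarray(utgv, dtype=float)
    if tgv.shape != utgv.shape:
        raise ValueError("TGV and UTGV must list the same nodes in the same order")
    if lam < 1:
        raise ValueError("lambda must lie in [1, +inf)")
    if (tgv < 0).any() or (utgv < 0).any() or tgv.sum() == 0:
        raise ValueError("weights must be non-negative with a positive TGV sum")
    ratio = float(tgv @ utgv) / float(tgv.sum())
    return ratio ** (1.0 / lam)


def explicit_interest(rf: dict, uip: dict) -> dict:
    """EI per Algorithm 2: RF-weighted UIP sums, normalised over the user's topics.

    rf  maps topic -> RF(user, topic) for every topic in UTS(user).
    uip maps topic -> list of UIP(paper, user) values for papers in that topic.
    """
    scores = {topic: rf[topic] * sum(uip[topic]) for topic in rf}
    total = sum(scores.values())
    if total == 0:
        # Hypothetical convention for the degenerate case, not from the paper.
        return {topic: 0.0 for topic in scores}
    return {topic: score / total for topic, score in scores.items()}


# Illustrative values only; none of these numbers come from the paper.
rs = relevance_strength(ies=3.0, ces=1.0)
rf_a = relevance_factor([2.0, 1.0, 1.0], [1.0, 0.0, 1.0], lam=2.0)
ei = explicit_interest({"A": 0.75, "B": 0.25}, {"A": [1.0, 1.0], "B": [2.0]})
print(rs, rf_a, ei)
```

Under these assumptions the EI values of a user sum to 1 across the topics in UTS whenever at least one weighted sum is positive. When the ratio inside RF lies in [0, 1], a larger lambda pushes RF towards 1 and so flattens the differences between topics, which may be one reason the parameter is tuned experimentally.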
Compared with other profiling strategies, ontology-based profiling methods have the advantage that they can infer user interests on the basis of hierarchical structures and are able to utilize the relations between their ontologies and external concepts for recommendations. Due to ontological inference, user profiles can be rounded off and can be matched better to the wide range of user interests (Felden and Linden, 2007). In Middleton et al. (2001, 2003, 2004), researchers used is–a relationships between subclasses and their immediate super-classes by adding 50% of the interest values of the classes to those of the super-classes. Trials substantiated that ontological inference boosted the accuracy of individual recommendations.

Instead of taking advantage of the is–a relationship between subordinate topics and their super-topics, we believe that the semantic relations between topics under the same super-topic are preferable for inferring user interests. In our extended
