
“E-business” course reading: Keyword clustering for user interest profiling refinement within paper recommender systems


X. Tang, Q. Zeng / The Journal of Systems and Software 85 (2012) 87–101

construction algorithms for profiling explicit interests and for profiling implicit interests are also proposed. Based on our extended subject ontology, we take into account the pertinence of keywords and the classifications of the ontology in the calculation of explicit interest profiles. The proposed method for inferring users’ potential interests (implicit interests) employs a quantified evaluation of the relationships between classifications in the ontology.

The paper is organized into seven sections. Section 2 discusses related work. Section 3 presents a brief description of our recommender system, on which we evaluate our approach. In Section 4, we propose a way of clustering keywords for an existing ontology in order to give it a more detailed and accurate structure. In Section 5, the approaches to building user interest profiles, which contain both explicit and implicit interest profiles, are presented. Section 6 introduces our prototype system, called SPRS, with which we conducted three experiments to evaluate the subject ontology extension method and the refined user profiling approach. Section 7 concludes the paper and discusses future work.

2. Related work

An ontology is a conceptualization of a domain in a human-understandable but machine-readable format consisting of entities, attributes, relationships, and axioms (Guarino and Giaretta, 1995). It is used to alleviate communication problems between systems that arise from ambiguous usage of different terms.
For building user profiles, ontologies are used to address the so-called “cold-start problem” (Middleton et al., 2002; Susan and Gauch, 2004). Cantador et al. (2008) mapped social tagging information from multiple sources to ontological structures describing the domains of interest covered by the tags, in order to build user profiles. Moreover, Middleton et al. (2004) used the term ontology to refer to the classification structure and instances within a knowledge base, representing the profiles in terms of a research paper topic ontology, which allows interests to be inferred that go beyond those seen in directly observed behavior. They also employ profile visualization to acquire profile feedback from users and so improve profiling accuracy.

A number of strategies have been implemented to facilitate the construction of ontologies. For example, Mika (2007) extended the traditional bipartite model of ontologies with a social dimension, leading to a tripartite model of actors, concepts and instances. He also demonstrated the application of this representation by showing how community-based semantics emerges from this model through a process of graph transformation. In addition, Zhang et al. (2010) proposed a suite of ontology metrics, at both the ontology level and the class level, to measure the design complexity of ontologies.

Finally, in recent years complex networks and other network models, such as bipartite networks, have been considered important aspects of recommender systems by many researchers (Zanin et al., 2008; Zhou et al., 2007, 2010). When the problem is to recommend items to a user, the underlying transaction data can be seen as a bipartite network, in which users and items are represented as two groups of nodes connected to each other by links (Zanin et al., 2008).
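The bipartite view of transaction data described above can be sketched in a few lines. This is a minimal illustration, not code from any of the cited works: the function name `build_bipartite`, the sample viewing log, and the identifiers `u1`, `p1`, and so on are assumptions made for the example.

```python
from collections import defaultdict


def build_bipartite(transactions):
    """Build both adjacency maps of a user-item bipartite network.

    `transactions` is an iterable of (user, item) pairs. Links only ever
    connect a user node to an item node, and repeated pairs collapse
    into a single link.
    """
    user_to_items = defaultdict(set)
    item_to_users = defaultdict(set)
    for user, item in transactions:
        user_to_items[user].add(item)
        item_to_users[item].add(user)
    return dict(user_to_items), dict(item_to_users)


# Hypothetical viewing log of (user, paper) pairs; the last pair is a repeat.
log = [("u1", "p1"), ("u1", "p2"), ("u2", "p2"), ("u2", "p3"), ("u1", "p2")]
user_to_items, item_to_users = build_bipartite(log)

print(sorted(user_to_items["u1"]))  # papers linked to user u1
print(sorted(item_to_users["p2"]))  # users linked to paper p2
```

Storing both directions keeps neighbor lookups cheap from either group of nodes, at the cost of holding every link twice.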
To make use of the bipartite network, a one-mode projection is usually applied as an alternative to using the bipartite network directly. Zhou et al. (2007) proposed a novel one-mode projection method that compresses the bipartite network while better preserving the original information. Zhou et al. (2010) introduced evaluation criteria for the diversity of recommendations, in addition to accuracy. They argue that the next generation of information filtering methods should focus not only on precision but also on diversity, and that a balance between the two should be sought.

There are some problems in the existing ontologies (Adomavicius and Tuzhilin, 2005; Correa da Silva et al., 2002) and the user interest profiling methods based on them (Middleton et al., 2001, 2003, 2004; Cantador et al., 2008; Susan and Gauch, 2004; Felden and Linden, 2007).

(1) Most of the ontologies in use are framed at a coarse granularity, which makes the classification of items vague and imprecise. Therefore, the effectiveness of user profiling techniques based on such ontologies, no matter how sophisticated the interest profiling algorithms are, deteriorates because of the coarse classification.

(2) Ontologies are typically predefined manually and remain fixed for a certain period of time, which makes them insensitive to change. When new research subjects emerge or other changes occur, the modifications must be made manually by their creators. This response is slow and requires human involvement. Moreover, typical ontology-based user interest calculation algorithms compute an interest value for each category for every user; however, these methods are not always effective.
For instance, assume that two users, user A and user B, have both viewed a certain number of papers in the subject “artificial intelligence” and that we obtain the same interest value for both in this subject; we would therefore conclude that their interests in “artificial intelligence” do not differ. In fact, however, some papers viewed by user A are about “support vector machine” and the others are about “genetic algorithm”, whereas the papers viewed by user B are all related to “natural language processing”. In this case, the relatively subtle differences between the users’ interests within “artificial intelligence” are not discovered, so the description of user interests and the subsequent recommendations are inaccurate.

(3) It is difficult for conventional approaches to differentiate the items within the same class (or subject). That is, in research paper recommendation, different papers in the same subject contribute essentially equally to the construction of user profiles, and the differences in textual information between papers are neglected, which is evidently unreasonable.

(4) The inference of topics of interest that users have not browsed explicitly, via ontological relations between topics, is not precise enough. For instance, assume that there are three subclasses A, B and C under their immediate super class A, and a user has an interest in subclass A with an interest value of 0.4. With the aforementioned conventional profiling method, we get 0.2 for the user’s interest value in super class A. In this scenario, the user will receive equivalent recommendations from subclasses B and C, because the conventional profiling method does not take into account the differences between subclasses B and C. The inference about the user’s implicit interests is therefore inaccurate, since these differences may be relevant to the user’s preferences.

3. Framework for generating user interest profiles

In this section, we first present the framework for generating user interest profiles within online paper recommender systems, which is shown in Fig. 1. The main components of the framework include:

(1) Subject ontology. The subject ontology is predefined by domain experts with suitable granularity and scale; it presents the organizational structure of domain knowledge and serves as the taxonomy for research papers. Moreover, it is the basis of the user profile. To refine the subject ontology, we discuss in Section 4 our method of extending it by automatically clustering a weighted keyword graph.

(2) Paper management module. Users can upload, browse, download and comment on any research paper through the user interface of the paper management module. All of the research papers are

