of Athena as a plug-in for the HNP. Then, Sect. 5 gives the evaluation of the implemented methods. Section 6 concludes the paper and proposes future work.

2. RELATED WORK

Recommending news items or other documents based on the user’s interests has attracted the attention of many researchers. Several adaptive Web-based news services have been developed that focus on personal recommendation of news items. These systems vary in application domain, platform, development methodology, levels of adaptivity, etc. We identify four categories of recommendation systems: content-based, semantic-based, collaborative, and hybrid systems. In this paper, we limit the discussion to content-based and semantic-based recommendation methods.

YourNews [1] is a personalized news system that employs a content-based approach and aims to increase the transparency of adapted news delivery by allowing the user to adapt the user profile. Another content-based approach is News Dude [2], a personal news recommending agent that utilizes TF-IDF in combination with the Nearest Neighbor algorithm to recommend news items to the user.
[3] states, supported by Singhal’s findings [12], that the performance of TF-IDF, which is employed in YourNews and News Dude, decreases as the length of the article, and the number of words, increases. In addition, because TF-IDF ignores the semantics of a text, news items that are semantically related to the news items in the user profile fail to be recommended by the system.

[8] provides a practical approach to measure the relatedness or similarity between RSS news items. Their method is based on the semantic relatedness between RSS items. As in our approach, they determine the relationships between words using WordNet [6]. Their focus is on the linguistic neighborhood of a word, in which general relationships such as synonymy, hyponymy, and meronymy between words are considered. The difference with our approach is that we make use of an ontology. Besides the general relationships between words, the ontology covers specific relationships like is-competitor-of, has-product, etc. Despite this difference, their method is applicable in our context, and therefore we will compare both approaches.

In [10], ontological user profiling is employed for recommending academic research papers. While is-a relationships are rich in semantics, we find this approach limited, as it fails to consider other types of concept relationships. The authors propose a classification algorithm, based on the k-Nearest Neighbor classifier, that assigns topics to papers. In our approach, GATE [4] is employed to classify the content of an article by using several language processing techniques. This enables the system not only to recommend full articles, but possibly also to recommend a snippet of an article. Another difference lies in the construction of the user profile: in [10], the user can adjust the profile. However, as [1] explains, adjusting the user profile might harm the quality of the recommendations, so in our approach the user is not allowed to change the profile.
Recommendations in [10] are made by combining collaborative filtering techniques with limited semantic-based recommendations that only employ is-a relations, while our system solely employs semantic-based recommendation techniques that utilize more types of relationships between concepts.

3. ATHENA

In this paper we propose Athena, which is an extension to the Hermes framework. Subsection 3.1 explains the Hermes framework and how it contributes to the recommendation of news items. Subsection 3.2 explains how the user profile is constructed. In Subsections 3.3 and 3.4 we discuss some existing content-based and semantic-based recommendation methods, respectively. In Subsection 3.5 we introduce the ranked recommendation method, our semantic-based recommendation method.

3.1 Hermes

Athena is an extension to the Hermes framework [7], a framework used to build a news personalization service. The system can be described by input, internal processing, and output. The input is composed of predefined RSS feeds of news items and concepts selected by the user. The internal processing is the classification of these news items using concepts from a knowledge base. The output is defined as the personalized news items based on the selected concepts.

3.1.1 The Ontology

The Hermes framework offers a semantic-based approach for retrieving news items related, directly or indirectly, to the concepts of interest from a domain ontology, which is called the knowledge base [7]. The ontology consists of classes, e.g., Company and CEO, and the relationships between these classes, like isCEOOf and its inverse hasCEO. A concept is defined as either a class or an instance of a class, e.g., Company and Microsoft. The knowledge base is constructed and maintained by a domain expert, with financial information obtained from Yahoo! Finance.

3.1.2 The Hermes News Portal

The Hermes News Portal (HNP) is a Java implementation of the Hermes framework [7].
It allows the user to query the news and view the knowledge base. It uses Jena for manipulating and reasoning with the OWL ontologies. For querying, it employs SPARQL and tSPARQL [7], which adds time functionalities to the queries. The classification of the news articles is done using GATE [4] and the WordNet [6] semantic lexicon.

3.2 User Profile Construction

Recommending news items starts with building a user profile. A user profile can be defined by keeping track of which articles the user has read so far. Those articles provide us with information about the user’s interests. How the user profile is constructed depends on the recommendation method. For concept equivalence, binary cosine, and Jaccard, the profile is a set of concepts from the articles the user has read. The semantic relatedness approach creates a vector with the distinct concepts from the user profile and assigns a weight to each concept. The ranked recommendation method also uses a vector of distinct concepts from the read articles and assigns a rank to each concept. The difference in user profile construction between the latter two approaches is the method used to compute the corresponding weights.

3.3 Content-Based Recommendation

A well-known term weighting method is TF-IDF (term frequency-inverse document frequency) [11]. A classic approach in comparing documents is the use of TF-IDF to-