Recommending Scientific Articles Using CiteULike

Toine Bogers
ILK, Tilburg University
P.O. Box 90153, 5000 LE Tilburg, The Netherlands
A.M.Bogers@uvt.nl

Antal van den Bosch
ILK, Tilburg University
P.O. Box 90153, 5000 LE Tilburg, The Netherlands
Antal.vdnBosch@uvt.nl

ABSTRACT
We describe the use of the social reference management website CiteULike for recommending scientific articles to users, based on their reference library. We test three different collaborative filtering algorithms, and find that user-based filtering performs best.
A temporal analysis of the data indexed by CiteULike shows that it takes about two years for the cold-start problem to disappear and recommendation performance to improve.

Categories and Subject Descriptors
H.3 [Information Storage and Retrieval]: H.3.4 Systems and Software; H.3.5 Online Information Services; H.3.7 Digital Libraries

General Terms
Algorithms, Measurement, Performance, Experimentation

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
RecSys’08, October 23–25, 2008, Lausanne, Switzerland.
Copyright 2008 ACM 978-1-60558-093-7/08/10 ...$5.00.

1. INTRODUCTION
One of the trends within the Web 2.0 paradigm is a shift in information access from local and solitary to global and collaborative. Instead of being stored, managed, and accessed on only one specific computer or browser, personal information is increasingly managed and accessed on the Web. Social bookmarking websites are clear cases in point: instead of keeping a local copy of pointers to favorite URLs, users can store and access their bookmarks online through a Web interface. The underlying application then makes all stored information sharable among users, allowing for improved searching and for generating recommendations between users with similar interests.

A special kind of social bookmarking service, and the focus of our paper, is the social reference manager. Services such as CiteULike, Connotea, Bibsonomy, and 2Collab¹ aid users in managing their reference collections. All of these services allow users to bookmark any Web page or reference they choose, and in addition they offer special functionality for certain academic resources, such as linking to online versions of papers and special access to metadata specific to academic resources.

¹ Available at http://www.citeulike.org, http://www.connotea.org, http://www.bibsonomy.org, and http://www.2collab.com respectively.

All four of these bibliographical reference managers encourage users to organize their references with one or more tags, or keywords. These in turn enable users to view all references, from any user, associated with a chosen tag, as well as information about the popularity of a reference. The same linking is also applied at the author level, so that users can browse other users who added references to publications written by a specific author. These features can help users to better cope with the information overload that is as overwhelming in the academic community as it is on the Web, with an ever-increasing number of journals, books, and conference proceedings being published every year. This overload makes it hard to keep up with interesting new work, or to get a complete overview of relevant literature on specific topics.

For these features to be effective, active use of the online system on the part of the user (searching, browsing) is needed. Our interest lies in using recommender systems to relieve this burden and automatically find interesting and related reading material for the user. A recommender system is a type of personalized information filtering technology used to identify sets of items that are likely to be of interest to a certain user. One particular class of recommendation algorithms is collaborative filtering (CF), which bases recommendations on the opinions or actions of other like-minded users. The motivation here is that a user will be more satisfied with recommended items that are liked by like-minded users than with items that are picked randomly or based on overall popularity.
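The idea behind collaborative filtering can be illustrated with a minimal sketch of user-based filtering over reference libraries. This is not the algorithm evaluated in the paper, whose details are not given in this section; it is a generic variant under stated assumptions: libraries are treated as binary sets of article IDs, user similarity is cosine similarity over those sets, and the names `cosine`, `recommend`, `k`, `n`, and the toy `libraries` data are all hypothetical.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two binary libraries given as sets of article IDs."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend(libraries, user, k=2, n=3):
    """Rank articles the user lacks by the summed similarity of the
    k most similar users who have them (illustrative scoring rule)."""
    own = libraries[user]
    sims = [(cosine(own, lib), other)
            for other, lib in libraries.items() if other != user]
    neighbours = sorted(sims, reverse=True)[:k]

    scores = {}
    for sim, other in neighbours:
        if sim <= 0:
            continue  # users with no overlap contribute nothing
        for article in libraries[other] - own:
            scores[article] = scores.get(article, 0.0) + sim

    # Highest score first; ties broken by article ID for determinism.
    return sorted(scores, key=lambda a: (-scores[a], a))[:n]


# Hypothetical toy data: user -> set of article IDs in their library.
libraries = {
    "alice": {"p1", "p2", "p3"},
    "bob": {"p1", "p2", "p4"},
    "carol": {"p2", "p3", "p4", "p5"},
    "dave": {"p6"},
}

print(recommend(libraries, "alice"))  # ['p4', 'p5']
```

In this toy data, "dave" shares no articles with anyone and therefore receives no recommendations, which is a small-scale instance of the cold-start problem mentioned in the abstract.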
In this paper, we focus on using one of these social reference managers, CiteULike, to generate reading lists for scientific articles based on a user’s online reference library. We describe the construction of a test collection based on the services offered by CiteULike and apply three different CF algorithms to our data. We also analyze the data across its temporal dimension: we use publicly available activity logs to determine how recommendation performance changes as the website grows over time.

The paper is structured as follows. We discuss related work in Section 2. In Section 3 we discuss CiteULike, how our test collection was created, and the issues we ran into. In Section 4 we describe our experimental setup and evaluation, followed by the results in Section 5. Section 6 contains the results of our temporal analysis of the different algorithms. We conclude in Section 7 and highlight possible future work.

2. RELATED WORK
Most of the work related to recommending interesting information with respect to the user’s current interest or task has focused on creating information management agents. Maes (1997) was among the first to signal the need for information filtering agents that can