IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 27, NO. 5, MAY 2015 1343

Relational Collaborative Topic Regression for Recommender Systems

Hao Wang and Wu-Jun Li, Member, IEEE

Abstract—Due to its successful application in recommender systems, collaborative filtering (CF) has become a hot research topic in data mining and information retrieval. In traditional CF methods, only the feedback matrix, which contains either explicit feedback (also called ratings) or implicit feedback on the items given by users, is used for training and prediction. Typically, the feedback matrix is sparse, which means that most users interact with few items. Due to this sparsity problem, traditional CF with only feedback information will suffer from unsatisfactory performance. Recently, many researchers have proposed to utilize auxiliary information, such as item content (attributes), to alleviate the data sparsity problem in CF. Collaborative topic regression (CTR) is one of these methods which has achieved promising performance by successfully integrating both feedback information and item content information. In many real applications, besides the feedback and item content information, there may exist relations (also known as networks) among the items which can be helpful for recommendation. In this paper, we develop a novel hierarchical Bayesian model called Relational Collaborative Topic Regression (RCTR), which extends CTR by seamlessly integrating the user-item feedback information, item content information, and network structure among items into the same model. Experiments on real-world datasets show that our model can achieve better prediction accuracy than the state-of-the-art methods with lower empirical training time. Moreover, RCTR can learn good interpretable latent structures which are useful for recommendation.
Index Terms—Collaborative filtering, topic models, recommender system, social network, relational learning

1 INTRODUCTION

Recommender systems (RS) play an important role in enabling us to make effective use of information. For example, Amazon [28] adopts RS for product recommendations, and Netflix [8] uses RS for movie recommendations. Existing RS methods can be roughly categorized into three classes [1], [4], [10], [41]: content-based methods, collaborative filtering (CF) based methods, and hybrid methods. Content-based methods [5], [26] adopt the profile of the users or products for recommendation. CF-based methods [7], [11], [13], [18], [27], [30], [33], [35], [37], [39], [42] use past activities or preferences, such as the ratings on items given by users, for prediction, without using any user or product profiles. Hybrid methods [2], [3], [6], [12], [15], [16], [17], [20], [32], [34], [40], [43], [47], [48] combine both content-based methods and CF-based methods by ensemble techniques. Due to privacy issues, it is harder in general to collect user profiles than past activities. Hence, CF-based methods have become more popular than content-based methods in recent years.

In most traditional CF methods, only the feedback matrix, which contains either explicit feedback (also called ratings) or implicit feedback [21] on the items given by users, is used for training and prediction. Typically, the feedback matrix is sparse, which means that most items are given feedback by few users or most users only give feedback to few items. Due to this sparsity problem, traditional CF with only feedback information will suffer from unsatisfactory performance. More specifically, it is difficult for CF methods to achieve good performance in both the item-oriented setting and the user-oriented setting when the feedback matrix is sparse.
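To make the sparsity problem concrete, the following minimal Python sketch measures how sparse a small feedback matrix is and which items have received feedback from at most two users. The toy matrix, the function names, and the cutoff of two are assumptions chosen for illustration; they do not come from the paper's datasets or method.

```python
# Toy implicit-feedback matrix (an illustrative assumption, not data from the paper):
# rows are users, columns are items, 1 means the user gave feedback on the item.
feedback = [
    [1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
]


def density(matrix):
    """Fraction of user-item pairs that carry feedback."""
    total = sum(len(row) for row in matrix)
    observed = sum(sum(row) for row in matrix)
    return observed / total


def feedback_per_user(matrix):
    """Number of items each user has given feedback on (row sums)."""
    return [sum(row) for row in matrix]


def feedback_per_item(matrix):
    """Number of users who have given feedback on each item (column sums)."""
    return [sum(col) for col in zip(*matrix)]


def cold_indices(counts, threshold=2):
    """Indices whose feedback count is at most `threshold` (hypothetical cutoff)."""
    return [i for i, count in enumerate(counts) if count <= threshold]


if __name__ == "__main__":
    print("density:", density(feedback))
    print("feedback per user:", feedback_per_user(feedback))
    print("feedback per item:", feedback_per_item(feedback))
    print("items with little feedback:", cold_indices(feedback_per_item(feedback)))
```

On real datasets the matrix would typically be stored in a sparse format rather than as nested lists; the dense form is used here only to keep the sketch self-contained.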
In an item-oriented setting where we need to recommend users to items, it is generally difficult to know which users could like an item if it has only been given feedback by one or two users. This adds to the difficulty companies face when promoting new products (items). Moreover, users' ignorance of new items will result in less feedback on the new items, which will further harm the accuracy of their recommendations. For the user-oriented setting where we recommend items to users, it is also difficult to predict what a user likes if the user has only given feedback to one or two items. However, in the real world, it is common to find that most users provide only a little feedback. Actually, providing good recommendations for new users with little feedback is more important than for frequent users, since whether new users come back to the site (service) depends on how good the recommendations are. Frequent users, by contrast, are most likely already satisfied with the site (service). If we manage to boost the recommendation accuracy for new or infrequent users, more of them will become frequent users, and then better recommendations can be expected with more training data. Therefore, improving the recommendation accuracy in an extremely sparse setting is key to getting recommender systems working in a positive cycle.

To overcome the sparsity problem of CF-based models, many researchers have proposed to integrate auxiliary

• H. Wang is with the Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong. E-mail: hwangaz@cse.ust.hk.
• W.-J. Li is with the National Key Laboratory of Novel Software Technology, Department of Computer Science and Technology, Nanjing University, Nanjing 210023, China. E-mail: liwujun@nju.edu.cn.

Manuscript received 24 July 2013; revised 29 Aug. 2014; accepted 6 Oct. 2014. Date of publication 29 Oct. 2014; date of current version 27 Mar. 2015.
Recommended for acceptance by Y. Koren.
For information on obtaining reprints of this article, please send e-mail to: reprints@ieee.org, and reference the Digital Object Identifier below.
Digital Object Identifier no. 10.1109/TKDE.2014.2365789

1041-4347 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.