A hybrid approach to item recommendation in folksonomies

Robert Wetzker
DAI Labor, Technische Universität Berlin
robert.wetzker@dai-labor.de

Winfried Umbrath
DAI Labor, Technische Universität Berlin
winfried.umbrath@dai-labor.de

Alan Said
DAI Labor, Technische Universität Berlin
alan.said@dai-labor.de

ABSTRACT

In this paper we consider the problem of item recommendation in collaborative tagging communities, so-called folksonomies, where users annotate interesting items with tags. Rather than following a collaborative filtering or annotation-based approach to recommendation, we extend the probabilistic latent semantic analysis (PLSA) approach and present a unified recommendation model which evolves from item-user and item-tag co-occurrences in parallel. The inclusion of tags reduces known collaborative filtering problems related to overfitting and allows for higher quality recommendations. Experimental results on a large snapshot of the delicious bookmarking service show the scalability of our approach and an improved recommendation quality compared to two-mode collaborative or annotation-based methods.

Categories and Subject Descriptors

H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval

Keywords

folksonomies, tagging, recommendation, PLSA, delicious

1. INTRODUCTION

Collaborative tagging has become the most common content categorization technique of the Web 2.0 age, allowing the creators or consumers of content to assign freely chosen keywords (tags) in order to simplify later retrieval. The concept of tagging has proven successful in multiple areas, enabling the success of resource sharing services such as delicious, last.fm or flickr. These social tagging communities have become known as folksonomies. The distributed, user-centric annotation of (web) content was shown to provide relevant meta-data and is expected to boost the semantic quality of labels [7].

In this paper we consider the problem of item recommendation in large folksonomies.
Our intent is to help folksonomy users discover new interesting items based on their item history. To this end, we exploit the semantic contribution of tags and extend the classical collaborative filtering approach with user-generated annotations. This allows us to improve recommendations by calculating item similarities not only based on the user-item distribution, but also in the tag space. Our approach is thus algorithmically related to previously presented work on hybrid recommender systems that combine collaborative and content-based models for better recommendation quality. However, instead of relying on the often complex, hard to extract and possibly heterogeneous content of items, we only consider an item’s annotations.

The probabilistic latent semantic analysis (PLSA), as introduced by Hofmann [9], has been shown to improve recommendation quality for various settings by assuming a latent lower dimensional topic model as the origin of observed co-occurrence distributions. Our approach extends the PLSA algorithm such that the topic model is estimated from the item-user as well as the item-tag observations in parallel. This allows us to benefit from user annotations during the recommender training and to combine collaborative and annotation-based models into a unified representation.

The evaluation of our recommender system is performed on a large snapshot of 109 million bookmarks of the delicious on-line bookmarking service.¹ The service allows its users to centrally collect and manage their bookmarks by assigning tags. Being one of the first and most researched real-world folksonomies, delicious represents a suitable evaluation object. Due to its large size, our dataset not only reflects the structure and size of a real-world social bookmarking corpus but also allows us to demonstrate the scalability of our approach.
Even though we limit our evaluation to social bookmarking, we believe our method is generally applicable to the task of item recommendation in collaborative tagging communities.

¹ http://delicious.com

1.1 Related Work

Historically, recommender systems are categorized into collaborative filtering, content-based or hybrid systems, where the latter combine, or unify, user-oriented and content-oriented approaches and have been shown to outperform their two-mode counterparts in many scenarios [3]. Even though we do not consider the actual content of items, but rather item annotations generated by users, our scenario is algorithmically similar to
the fusion of collaborative and content-based models. The authors of [2] present a multi-dimensional technique which incorporates contextual information for an optimized recommender training. The fusion of co-occurrence relationships among multiple types of objects is also proposed in [15], where the authors present a multi-type extension of the latent semantic analysis (LSA) algorithm that outperforms standard LSA in a variety of domains, such as collaborative filtering or text categorization. A more general overview of recommender systems is given in [3, 6].

Probabilistic latent semantic analysis (PLSA) has been shown to improve the quality of collaborative filtering based recommenders [9] by assuming an underlying lower dimensional latent topic model. Similar to our approach, the authors of [5] consider the problem of document clustering and extend the PLSA algorithm to combine content-based and hyperlink-based similarities into a unified model. Model fusion using PLSA was also successfully applied to the discovery of navigational patterns on the Web [12], to music recommendation combining multiple similarity measures [4], and to cross-domain knowledge transfer [17].

Until recently, research on recommender systems and folksonomies mainly focused on tag recommendation [8, 11, 13]. The authors of [14] enrich a collaborative movie recommender by incorporating tags that were assigned to each movie in external folksonomies. Finally, [1] proposes to smooth tag-item distributions based on usage patterns in order to improve resource retrieval.

The remainder of this paper is structured as follows. In section 2 we extend the PLSA model to a recommendation model which unifies annotations and usage patterns. We then present our experimental settings and the results obtained from our experiments in sections 3 and 4, and summarize our conclusions and ideas for future directions in section 5.

2. MODEL FUSION USING PLSA

According to [10], a folksonomy can be described as a tripartite graph whose vertex set is partitioned into three disjoint sets of users $U = \{u_1, \dots, u_l\}$, tags $T = \{t_1, \dots, t_n\}$ and items $I = \{i_1, \dots, i_m\}$. We simplify this model to two bipartite models, where the collaborative filtering model IU is built from the item-user co-occurrence counts $f(i, u)$ and the annotation-based model IT derives from the co-occurrence counts between items and tags $f(i, t)$. In the case of social bookmarking, IU becomes a binary matrix ($f(i, u) \in \{0, 1\}$), as users can bookmark a given web resource only once. Given our model, we want to recommend the most interesting new items from $I$ to a user $u_l$ given the user’s item history.

The aspect model of PLSA associates the co-occurrence of observations with a hidden topic variable $Z = \{z_1, \dots, z_k\}$. In the context of collaborative filtering, an observation corresponds to the bookmarking of an item by a user, and all observations are given by the co-occurrence matrix IU. Users and items are assumed independent given the topic variable $Z$. Applying the aspect model, the probability that an item was bookmarked by a given user can be computed by summing over all latent variables $Z$:

$$P(i_m \mid u_l) = \sum_{k} P(i_m \mid z_k)\, P(z_k \mid u_l). \tag{1}$$

For the annotation-based scenario we assume the same hidden topics as the origin of the item-tag co-occurrence observations given by IT. Analogous to (1), the conditional probability between tags and items can be written as:

$$P(i_m \mid t_n) = \sum_{k} P(i_m \mid z_k)\, P(z_k \mid t_n). \tag{2}$$

Following the procedure in [5], we can now combine both models based on the common factor $P(i_m \mid z_k)$ by maximizing the log-likelihood function

$$L = \sum_{m} \Big[ \alpha \sum_{l} f(i_m, u_l) \log P(i_m \mid u_l) + (1 - \alpha) \sum_{n} f(i_m, t_n) \log P(i_m \mid t_n) \Big], \tag{3}$$

where $\alpha$ is a predefined weight for the influence of each two-mode model. Using the Expectation-Maximization (EM) algorithm [5] we then perform maximum likelihood parameter estimation for the aspect model.
During the expectation (E) step we first calculate the posterior probabilities

$$P(z_k \mid u_l, i_m) = \frac{P(i_m \mid z_k)\, P(z_k \mid u_l)}{P(i_m \mid u_l)}, \qquad P(z_k \mid t_n, i_m) = \frac{P(i_m \mid z_k)\, P(z_k \mid t_n)}{P(i_m \mid t_n)},$$

and then re-estimate parameters in the maximization (M) step as follows:

$$P(z_k \mid u_l) \propto \sum_{m} f(i_m, u_l)\, P(z_k \mid u_l, i_m) \tag{4}$$

$$P(z_k \mid t_n) \propto \sum_{m} f(i_m, t_n)\, P(z_k \mid t_n, i_m) \tag{5}$$

$$P(i_m \mid z_k) \propto \alpha \sum_{l} f(i_m, u_l)\, P(z_k \mid u_l, i_m) + (1 - \alpha) \sum_{n} f(i_m, t_n)\, P(z_k \mid t_n, i_m) \tag{6}$$

Based on the iterative computation of the above E and M steps, the EM algorithm monotonically increases the likelihood of the combined model on the observed data. Using the $\alpha$ parameter, our new model can easily be reduced to a collaborative filtering or annotation-based model by setting $\alpha$ to 1.0 or 0.0, respectively.

We can now recommend items to a user $u_l$ weighted by the probability $P(i_m \mid u_l)$ from equation (1).² For items already bookmarked by the user in the training data we set this weight to 0, so that they are appended to the end of the recommended item list.

² It is also possible to recommend items with respect to a given tag $t_n$ based on equation (2).
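As an illustration, the E and M steps of section 2 can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the implementation evaluated in this paper: the function names (`normalize`, `fused_log_likelihood`, `train_fused_plsa`, `recommend`), the dense list-of-lists count matrices, the random initialization, the fixed number of iterations and the toy data are all hypothetical choices made for readability. A corpus of the size used here would require sparse data structures.

```python
import math
import random


def normalize(values):
    """Scale non-negative numbers to sum to 1 (uniform if they are all zero)."""
    total = sum(values)
    if total <= 0.0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def fused_log_likelihood(f_iu, f_it, p_i_z, p_z_u, p_z_t, alpha):
    """Weighted log-likelihood of equation (3)."""
    k = len(p_i_z)
    total = 0.0
    for counts, p_z_x, weight in ((f_iu, p_z_u, alpha), (f_it, p_z_t, 1.0 - alpha)):
        if weight == 0.0:
            continue  # this two-mode model does not contribute
        for m, row in enumerate(counts):
            for x, count in enumerate(row):
                if count > 0:
                    p = sum(p_i_z[z][m] * p_z_x[x][z] for z in range(k))
                    total += weight * count * math.log(p)
    return total


def train_fused_plsa(f_iu, f_it, k, alpha, iterations=30, seed=0):
    """EM for the fused model; f_iu[m][l] and f_it[m][n] are co-occurrence counts."""
    rng = random.Random(seed)
    n_items, n_users, n_tags = len(f_iu), len(f_iu[0]), len(f_it[0])

    def random_rows(rows, cols):
        # Assumption: random positive initialization, normalized per row.
        return [normalize([rng.random() + 0.1 for _ in range(cols)]) for _ in range(rows)]

    p_i_z = random_rows(k, n_items)  # p_i_z[z][m] = P(i_m | z)
    p_z_u = random_rows(n_users, k)  # p_z_u[l][z] = P(z | u_l)
    p_z_t = random_rows(n_tags, k)   # p_z_t[n][z] = P(z | t_n)
    history = []
    for _ in range(iterations):
        new_i_z = [[0.0] * n_items for _ in range(k)]
        new_z_u = [[0.0] * k for _ in range(n_users)]
        new_z_t = [[0.0] * k for _ in range(n_tags)]
        modes = ((f_iu, p_z_u, new_z_u, alpha), (f_it, p_z_t, new_z_t, 1.0 - alpha))
        for counts, p_z_x, new_z_x, weight in modes:
            for m in range(n_items):
                for x, count in enumerate(counts[m]):
                    if count == 0:
                        continue
                    # E-step: posterior over topics for this observed pair.
                    post = normalize([p_i_z[z][m] * p_z_x[x][z] for z in range(k)])
                    for z in range(k):
                        new_z_x[x][z] += count * post[z]           # eq. (4) / (5)
                        new_i_z[z][m] += weight * count * post[z]  # eq. (6)
        # M-step: renormalize the accumulated expected counts.
        p_i_z = [normalize(row) for row in new_i_z]
        p_z_u = [normalize(row) for row in new_z_u]
        p_z_t = [normalize(row) for row in new_z_t]
        history.append(fused_log_likelihood(f_iu, f_it, p_i_z, p_z_u, p_z_t, alpha))
    return p_i_z, p_z_u, p_z_t, history


def recommend(user, f_iu, p_i_z, p_z_u):
    """Rank items by equation (1); already bookmarked items get weight 0 and go last."""
    k = len(p_i_z)
    known = [f_iu[m][user] > 0 for m in range(len(f_iu))]
    scores = [
        0.0 if known[m] else sum(p_i_z[z][m] * p_z_u[user][z] for z in range(k))
        for m in range(len(f_iu))
    ]
    return sorted(range(len(f_iu)), key=lambda m: (known[m], -scores[m]))


# Hypothetical toy data: 4 items (rows), 3 users and 3 tags (columns).
F_IU = [[1, 0, 0], [1, 1, 0], [0, 1, 1], [0, 0, 1]]
F_IT = [[2, 0, 0], [1, 1, 0], [0, 1, 1], [0, 0, 2]]
P_I_Z, P_Z_U, P_Z_T, HISTORY = train_fused_plsa(F_IU, F_IT, k=2, alpha=0.5)
print(recommend(0, F_IU, P_I_Z, P_Z_U))
```

The `history` list records the value of equation (3) after every iteration, which makes the monotonic increase easy to inspect. Calling `train_fused_plsa` with `alpha=1.0` or `alpha=0.0` yields the two-mode collaborative or annotation-based variant of the same sketch.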
3. EXPERIMENTS

3.1 Dataset

We evaluate our approach on a corpus of originally 142 million bookmarks from the delicious bookmarking service. These bookmarks were collected between September 19, 2007 and January 22, 2008. This is the same corpus as described in [16]. A previous analysis revealed that the original corpus was highly polluted by spam [16]. In order to get meaningful results, we limit the impact of spam users on the initial corpus, as their anomalous behavior would strongly interfere with our analysis. To identify spam users we employ a common spam usage pattern. As was shown in [16], many spam users try to heighten the visibility of their web domains and consequently post a very high number of URLs to very few domains. To reduce the spam ratio within the data, we excluded the top 10 percent of users with the highest URLs-per-domain rate from our analysis. The filtered data set consists of 109 million bookmarks.

3.2 Experimental setup

Our experiments are performed on a 6-month section, July–December 2007, of the spam-filtered corpus. We remove items, users and tags occurring fewer than 10 times within these 6 months, thus generating the p-core 10 of the initial tripartite graph. We then split the remaining data into 6 monthly snapshots, each containing approximately 1.6 million bookmarks corresponding to more than 5.6 million tag assignments. For each month, the numbers of elements in each dimension, I, T and U, are roughly 200,000, 95,000 and 200,000, respectively. The corresponding co-occurrence matrices IU and IT are very sparse: only around 0.004% and 0.012% of their entries, respectively, are non-zero. All results presented in this paper are averaged over all 6 months.

For each month we randomly select 80% of all bookmarks for training, and the remaining bookmarks are saved for testing. This split is done on a per-user basis.
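The pruning and splitting steps described above can be sketched as follows. This is a hedged illustration rather than the code used for the experiments: the helper names `p_core` and `split_per_user`, the representation of tag assignments as (user, item, tag) triples, the choice to count occurrences in tag assignments, and the rounding rule for the 80% cut are all assumptions.

```python
import random
from collections import Counter, defaultdict


def p_core(assignments, p):
    """Drop (user, item, tag) triples until each user, item and tag occurs >= p times."""
    current = list(assignments)
    while True:
        # One occurrence counter per dimension: users, items, tags.
        counts = [Counter(triple[d] for triple in current) for d in range(3)]
        kept = [t for t in current if all(counts[d][t[d]] >= p for d in range(3))]
        if len(kept) == len(current):
            return kept  # fixed point reached, nothing left to prune
        current = kept   # removals may push other elements below p, so repeat


def split_per_user(bookmarks, train_fraction=0.8, seed=0):
    """Randomly split (user, item) bookmarks into train and test sets, user by user."""
    rng = random.Random(seed)
    by_user = defaultdict(set)
    for user, item in bookmarks:
        by_user[user].add(item)
    train, test = [], []
    for user in sorted(by_user):
        items = sorted(by_user[user])
        rng.shuffle(items)
        # Assumption: the training share is rounded to the nearest integer.
        cut = round(train_fraction * len(items))
        train += [(user, item) for item in items[:cut]]
        test += [(user, item) for item in items[cut:]]
    return train, test


# Hypothetical toy data: four mutually supporting triples plus three weak ones.
TRIPLES = [
    ("u1", "i1", "t1"), ("u1", "i2", "t2"), ("u2", "i1", "t2"), ("u2", "i2", "t1"),
    ("u3", "i1", "t3"), ("u4", "i4", "t1"), ("u4", "i5", "t1"),
]
CORE = p_core(TRIPLES, p=2)
BOOKMARKS = [("a", i) for i in range(10)] + [("b", i) for i in range(5)]
TRAIN, TEST = split_per_user(BOOKMARKS)
```

The pruning loop repeats until nothing changes because removing one element can push others below the threshold. Splitting per user rather than globally keeps roughly the same train/test ratio for every user.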
The bookmarks from the training period are then used to create the co-occurrence matrices IU and IT on which the recommenders are trained.

After training we select a random set of 1000 users with at least 10 test items each. For every user we recommend all items sorted by $P(i_m \mid u_l)$, where items bookmarked by the user during the training or before the evaluated month are weighted with $P(i_m \mid u_l) = 0$. The quality of the recommended item list is evaluated using performance measures commonly found in relevant literature [6], such as the area under curve (AUC) value of the receiver operating characteristic (ROC) curves or the precision measure. Results are averaged over all test users.

Multiple variables have to be taken into consideration when evaluating recommender systems. Among these is the question whether items that do not appear in the training data should be included in the evaluation. As we are only interested in the relative improvement of our approach, we remove all previously unseen items. For the same reason, we also exclude items which appear in the training but not in the test data.

Figure 1: Magnified ROC curves for the item recommendation task on the delicious dataset. The number of latent topics (k) is set to 80 for the annotation-based PLSA recommender (α = 0.0) and to 5 for the collaborative version (α = 1.0). The MP line represents the performance of a most-popular baseline classifier. (Plot: true positive rate, 0.5 to 1, against false positive rate, 0 to 0.5; curves: PLSA (α = 0.0), PLSA (α = 1.0), MP.)

All obtained results are compared with the performance of a baseline recommender (most-popular) that weights items by how often they were bookmarked during the training period. These item weights are global and do not take into account user preferences. However, as for the PLSA recommender, we set the weight of previously bookmarked items to 0.
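The most-popular baseline and the two evaluation measures can be sketched as follows. This is an illustrative sketch, not the evaluation code behind the reported numbers: the names `most_popular_ranking`, `auc` and `precision_at_k` are hypothetical, ties are broken by item name for determinism, and every ranked item that is not a test item of the user is assumed to count as a negative.

```python
from collections import Counter


def most_popular_ranking(train, user):
    """Rank items by global training popularity; the user's own items go last."""
    popularity = Counter(item for _, item in train)
    known = {item for u, item in train if u == user}
    # Known items correspond to weight 0 and are appended to the end of the list.
    return sorted(popularity, key=lambda item: (item in known, -popularity[item], item))


def auc(ranking, relevant):
    """Area under the ROC curve: share of (relevant, other) pairs ranked correctly."""
    positives_seen = 0
    negatives = 0
    correct_pairs = 0
    for item in ranking:  # best-ranked item first
        if item in relevant:
            positives_seen += 1
        else:
            negatives += 1
            correct_pairs += positives_seen  # every relevant item above this one
    if positives_seen == 0 or negatives == 0:
        raise ValueError("AUC needs at least one relevant and one other item")
    return correct_pairs / (positives_seen * negatives)


def precision_at_k(ranking, relevant, k):
    """Fraction of the k top-ranked items that are relevant (Prec@k)."""
    return sum(1 for item in ranking[:k] if item in relevant) / k


# Hypothetical toy data: (user, item) bookmarks from a training period.
TRAIN_TOY = [("u1", "a"), ("u2", "a"), ("u3", "a"), ("u1", "b"), ("u2", "b"), ("u3", "c")]
RANKING_U3 = most_popular_ranking(TRAIN_TOY, "u3")
```

Per-user values of `auc` and `precision_at_k` would then be averaged over the sampled test users, with `relevant` holding each user's test items.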
Most-popular recommenders have become a standard feature of Web 2.0 resource sharing communities.

4. RESULTS

Figure 1 presents a section of the ROC curves for the collaborative filtering (α = 1.0) and the annotation-based (α = 0.0) PLSA recommenders with the number of latent topics k set to 5 and 80, respectively. All values are averaged over the 6 evaluation months. The figure shows a significant boost in recommendation quality when using an annotation-based PLSA recommender (α = 0), which reaches an AUC value of 0.9022 compared to 0.8425 for the most-popular recommender. For the collaborative method (α = 1) with an optimal k of 5 we obtain an AUC result only slightly above the baseline performance (0.8467). However, the collaborative recommender performs better for small numbers of recommended items.

Table 1: Area under curve (AUC) for different parameter settings. Entries marked with * indicate the best AUC value for a given number of latent topics k.

α \ k    1         5         10        20        40        80
0.0      0.8402    0.8736*   0.8877    0.8936    0.9004    0.9022
0.2      0.8416    0.8491    0.8936    0.8975    0.9009*   0.9023*
0.4      0.8430    0.8419    0.8944*   0.8986*   0.8954    0.8935
0.6      0.8437    0.8423    0.8720    0.8916    0.8848    0.8722
0.8      0.8438*   0.8418    0.8727    0.8678    0.8461    0.8178
1.0      0.8435    0.8467    0.8348    0.8110    0.7766    0.7466

Table 1 compares the resulting AUC values for the PLSA recommender and different choices of α and k. Once again
we find that for an increasing number of latent topics k the collaborative filtering recommender behaves very differently from the annotation-based recommender. While the AUC values increase with k when α = 0, quality decreases in the collaborative filtering scenario. We believe this is due to the fact that there exist more item-tag than item-user relations, which in turn may lead to faster overfitting in the sparser user-item model.

Figure 2: Prec@100 for the item recommendation task on the delicious dataset. The figure shows the effect of different values of α and k on the performance of a hybrid PLSA recommender. The MP line represents the performance of a most-popular baseline classifier. (Plot: Prec@100, 0.008 to 0.015, against k, 0 to 80; curves: PLSA with α = 0.0, 0.2, 0.4, 0.6, 0.8 and 1.0, and MP.)

The results also indicate that recommendation quality can be improved using model fusion independent of the number of latent topics, although this effect lessens with higher k values. This observation is backed by the results plotted in Figure 2. The figure shows the precision when recommending the 100 items with the highest value $P(i_m \mid u_l)$ (Prec@100). We find that too few latent topics cannot cope with the complexity of the unified model, and the two-mode recommenders perform better for k values below 10. However, once the number of latent classes is able to fit the model, we see a firm improvement in recommendation quality. Furthermore, we observe that for more than 10 latent classes our hybrid recommenders consistently outperform their two-mode counterparts. We also find that models with a high α value tend to overfit earlier than more annotation-oriented models. We believe that this tendency is caused by the denser nature of the item-tag graph, although this has to be further investigated.
The most interesting observation of our evaluation is the overall poor performance of the collaborative filtering recommender and the drastically increased precision when annotations are considered during the model building process. Consistent with the AUC results, we find that any PLSA recommender with an appropriate k performs better than the most-popular baseline recommender.

5. CONCLUSIONS

In this paper we have shown that a hybrid approach to item recommendation in folksonomies that includes user-generated annotations produces better results than a standard collaborative filtering or annotation-based method. We presented an extension to the PLSA algorithm that combines usage and tagging information into a unified model. The evaluation of our approach, which was performed on a large-scale corpus of a real-world folksonomy, showed that the presented recommender not only outperforms two-mode methods but, because of its low-dimensional data representation, also decreases recommendation time. For future work, we plan to extend our investigations to tensor-based models that fully reflect the tripartite nature of collaborative tagging systems and were shown to improve recommendation quality in other settings [13].

6. REFERENCES

[1] Rabeeh Abbasi and Steffen Staab, ‘Introducing triple play for improved resource retrieval in collaborative tagging systems’, in Proc. of ECIR’08 Workshop on Exploiting Semantic Annotations in Information Retrieval (ESAIR 2008), (3 2008).
[2] Gediminas Adomavicius, Ramesh Sankaranarayanan, Shahana Sen, and Alexander Tuzhilin, ‘Incorporating contextual information in recommender systems using a multidimensional approach’, ACM Trans. Inf. Syst., 23(1), 103–145, (2005).
[3] Gediminas Adomavicius and Alexander Tuzhilin, ‘Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions’, IEEE Trans. Knowl. Data Eng., 17(6), 734–749, (2005).
[4] J. Arenas-García, A. Meng, K. B. Petersen, T. L. Schiøler, L. K. Hansen, and J. Larsen, ‘Unveiling music structure via PLSA similarity fusion’, in IEEE International Workshop on Machine Learning for Signal Processing, pp. 419–424. IEEE Press, (aug 2007).
[5] David A. Cohn and Thomas Hofmann, ‘The missing link - a probabilistic model of document content and hypertext connectivity’, in NIPS, eds., Todd K. Leen, Thomas G. Dietterich, and Volker Tresp, pp. 430–436. MIT Press, (2000).
[6] Jonathan L. Herlocker, Joseph A. Konstan, Loren G. Terveen, and John T. Riedl, ‘Evaluating collaborative filtering recommender systems’, ACM Trans. Inf. Syst., 22(1), 5–53, (2004).
[7] Paul Heymann, Georgia Koutrika, and Hector Garcia-Molina, ‘Can social bookmarking improve web search?’, in WSDM ’08: Proc. of the int. conf. on Web search and web data mining, pp. 195–206, New York, NY, USA, (2008). ACM.
[8] Paul Heymann, Daniel Ramage, and Hector Garcia-Molina, ‘Social tag prediction’, in SIGIR ’08: Proc. of the 31st ann. int. ACM SIGIR conf. on Research and development in information retrieval, pp. 531–538, New York, NY, USA, (2008). ACM.
[9] Thomas Hofmann, ‘Probabilistic latent semantic analysis’, in Proc. of Uncertainty in Artificial Intelligence, UAI’99, (1999).
[10] Andreas Hotho, Robert Jäschke, Christoph Schmitz, and Gerd Stumme, ‘Information retrieval in folksonomies: Search and ranking’, in ESWC, eds., York Sure and John Domingue, volume 4011 of Lecture Notes in Computer Science, pp. 411–426. Springer, (2006).
[11] Robert Jäschke, Leandro Marinho, Andreas Hotho, Lars Schmidt-Thieme, and Gerd Stumme, ‘Tag recommendations in folksonomies’, in Workshop Proceedings of Lernen - Wissensentdeckung - Adaptivität (LWA 2007), ed., Alexander Hinneburg, pp. 13–20, (sep 2007).
[12] Xin Jin, Yanzan Zhou, and Bamshad Mobasher, ‘Web usage mining based on probabilistic latent semantic analysis’, in KDD, eds., Won Kim, Ron Kohavi, Johannes Gehrke, and William DuMouchel, pp. 197–205. ACM, (2004).
[13] Panagiotis Symeonidis, Alexandros Nanopoulos, and Yannis Manolopoulos, ‘Tag recommendations based on tensor dimensionality reduction’, in RecSys ’08: Proc. of the 2008 ACM conf. on Recommender systems, pp. 43–50, New York, NY, USA, (2008). ACM.
[14] Martin Szomszor, Ciro Cattuto, Harith Alani, Kieron O’Hara, Andrea Baldassarri, Vittorio Loreto, and Vito D. P. Servedio, ‘Folksonomies, the semantic web, and movie recommendation’, in Bridging the Gap between Semantic Web and Web 2.0 (SemNet 2007), pp. 71–84, (2007).
[15] Xuanhui Wang, Jian-Tao Sun, Zheng Chen, and ChengXiang Zhai, ‘Latent semantic analysis for multiple-type interrelated data objects’, in SIGIR ’06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, pp. 236–243, New York, NY, USA, (2006). ACM.
[16] Robert Wetzker, Carsten Zimmermann, and Christian Bauckhage, ‘Analyzing social bookmarking systems: A del.icio.us cookbook’, in Mining Social Data (MSoDa) Workshop Proceedings, pp. 26–30. ECAI 2008, (July 2008).
[17] Gui-Rong Xue, Wenyuan Dai, Qiang Yang, and Yong Yu, ‘Topic-bridged PLSA for cross-domain text classification’, in SIGIR ’08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, pp. 627–634, New York, NY, USA, (2008). ACM.