Chapter 19
Social Tagging Recommender Systems

Leandro Balby Marinho, Alexandros Nanopoulos, Lars Schmidt-Thieme, Robert Jäschke, Andreas Hotho, Gerd Stumme and Panagiotis Symeonidis

Abstract The new generation of Web applications known as Social Tagging Systems (STS) is successfully established and poised for continued growth. STS are open and inherently social, features that have been proven to encourage participation. But while STS bring new opportunities, they revive old problems, such as information overload. Recommender Systems are well-known applications for increasing the level of relevant content over the “noise” that continuously grows as more and more content becomes available online. In STS, however, we face new challenges. Users are interested in finding not only content, but also tags and even other users. Moreover, while traditional recommender systems usually operate over 2-way data arrays, STS data is represented as a third-order tensor or a hypergraph with hyperedges denoting (user, resource, tag) triples. In this chapter, we survey the most recent and state-of-the-art work about a whole new generation of recommender systems built to serve STS. We describe (a) novel facets of recommenders for STS, such as user, resource, and tag recommenders, (b) new approaches and algorithms for dealing with the ternary nature of STS data, and (c) recommender systems deployed in real-world STS. Moreover, a concise comparison between existing works is presented, through which we identify and point out new research directions.
Leandro Balby Marinho · Alexandros Nanopoulos · Lars Schmidt-Thieme
Information Systems and Machine Learning Lab (ISMLL), University of Hildesheim, Marienburger Platz 22, 31141 Hildesheim, Germany, http://www.ismll.uni-hildesheim.de, e-mail: {marinho,nanopoulos,schmidt-thieme}@ismll.uni-hildesheim.de

Robert Jäschke · Andreas Hotho · Gerd Stumme
Knowledge & Data Engineering Group (KDE), University of Kassel, Wilhelmshöher Allee 73, 34121 Kassel, Germany, http://www.kde.cs.uni-kassel.de, e-mail: {jaeschke,hotho,stumme}@cs.uni-kassel.de

Panagiotis Symeonidis
Department of Informatics, Aristotle University, 54124 Thessaloniki, Greece, e-mail: symeon@csd.auth.gr

F. Ricci et al. (eds.), Recommender Systems Handbook, DOI 10.1007/978-0-387-85820-3_19, © Springer Science+Business Media, LLC 2011
19.1 Introduction

With the advent of affordable domestic high-speed communication facilities, inexpensive digitization devices, and the open access nature of the Web, a new and exciting family of Web applications known as Web 2.0 has been born. The underlying idea is to decentralize and cheapen content creation, thus leading the Web into a more open, connected, and democratic environment.

In this chapter we will focus on a particular family of Web 2.0 applications known as Social Tagging Systems (STS for short). STS assign a major role to the ordinary user, who is not only allowed to publish and edit resources, but also, and more importantly, to create and share lightweight metadata in the form of freely chosen keywords called tags. The exposure of users to both tags and resources creates a fundamental trigger for communication and sharing, thus lowering the barriers to cooperation and contributing to the creation of collaborative lightweight knowledge structures known as folksonomies¹.

Some notable examples of STS are sites like Delicious², BibSonomy³, and Last.fm⁴, where Delicious allows the sharing of bookmarks, BibSonomy the sharing of bookmarks and lists of literature, and Last.fm the sharing of music. These systems are characterized by being easy to use and free to anyone willing to participate. Once logged in, a user can add a resource to the system and assign arbitrary tags to it.

If on the one hand this new family of applications brings new opportunities, it revives old problems on the other, namely the problem of information overload. Millions of individual users and independent providers are flooding STS with content and tags in an uncontrolled way, thereby lowering the potential for content retrieval and information sharing.
One of the most successful approaches for increasing the level of relevant content over the “noise” that continuously grows as more and more content becomes available online lies in Recommender Systems (RS for short). In STS, however, we face several new challenges. Users are interested in finding not only content, but also tags, and even other users. Moreover, while traditional RS usually operate over 2-way data arrays, folksonomy data is represented as a third-order tensor or a hypergraph with hyperedges denoting (user, resource, tag) triples. Furthermore, while there is an extensive literature on rating prediction based on explicit user feedback, i.e., a numerical value denoting the degree of preference of a user for a given item, in folksonomies there are usually no ratings. Thus, before simply applying an old solution to a recurrent problem, we need to investigate to what extent the traditional RS paradigm and approaches apply to STS.

¹ The term folksonomy refers to a blend of the two words folk and taxonomy, i.e., a collaborative classification system created and maintained by ordinary users.
² http://delicious.com/
³ http://www.bibsonomy.org/
⁴ http://www.last.fm/

Social tagging recommender systems is a young research area that has attracted significant attention recently, which is expressed by the increasing number of publications (e.g., [15, 11, 37, 35, 31]), and is poised for continued growth. Furthermore,
real and large-scale STS, such as Delicious, BibSonomy, and Last.fm, already offer some recommender services to their users, which implies an increasing commercial interest in the area.

In this chapter we survey, in a concise manner, the most recent and state-of-the-art work about a whole new generation of RS built to serve STS. We describe: (a) novel facets of RS for STS, such as user, resource, and tag recommenders, (b) the challenges for deploying RS in real-world STS, (c) new approaches and algorithms for dealing with the inherent ternary relational data of folksonomies, and (d) approaches for tag acquisition. Emphasis is placed on presenting a concise comparison between existing works, through which we identify and point out new research directions.

The chapter is structured as follows. In Section 19.2 we characterize the data structure of folksonomies and point out some of the differences between the traditional RS paradigm and social tagging RS. In Section 19.3 we discuss the challenges of deploying RS in real-world STS and present the BibSonomy system as a case study. Section 19.4 presents several families of social tagging RS, such as graph/content-based algorithms for recommending users, resources or tags. Section 19.5 provides comparisons and discussions about the algorithms presented in Section 19.4, and finally Section 19.6 closes the chapter by pointing out new directions of research in this area.

19.2 Social Tagging Recommender Systems

Folksonomies are the underlying structures of STS and result from the practice of collaboratively creating tags to annotate and categorize content. Tags, in general, are a way of grouping content by category to make it easy to view by topic. This is a grassroots approach to organizing a site and helping users find content they are interested in.
Note that with the introduction of tags, the usual binary relation between users and resources, which is largely exploited by traditional RS, turns into a ternary relation between users, resources, and tags. Since tags are voluntarily and freely provided by ordinary users, problems such as unwillingness to tag and diverging vocabulary can easily arise. As we will see in the course of this chapter, a possible way to address these problems is through tag RS. Tags also represent additional and personalized information about resources, which, if properly exploited, can boost the performance of resource RS.

But before we delve into how RS can deal with and benefit from the additional information provided by tags, we need to formally define folksonomies and their data structures, elaborate on the differences between traditional RS and social tagging RS, and describe the challenges involved in deploying RS in real-world STS. These topics are covered in the following sections.
19.2.1 Folksonomy

Formally, a folksonomy is a tuple $F := (U, T, R, Y)$ where
• $U$, $T$, and $R$ are non-empty finite sets, whose elements are called users, tags, and resources, respectively, and
• $Y$ is a ternary relation between them, i.e., $Y \subseteq U \times T \times R$, whose elements are called tag assignments.⁵

Users are typically described by their user ID, and tags may be arbitrary strings. What is considered a resource depends on the type of system. For instance, in Delicious, the resources are URLs, in BibSonomy URLs or publication references, and in Last.fm, the resources can be artists, song tracks or albums.

Folksonomy data can be represented in different ways, and as we will see in Section 19.4, each representation can lead to different recommendation algorithms.

Folksonomies as Tensors. The set of triples in $Y$ can be represented as a third-order tensor (3-dimensional array) $A = (a_{u,t,r}) \in \mathbb{R}^{|U| \times |T| \times |R|}$. There are different ways to represent $Y$ as a tensor. Symeonidis et al. [35], for example, proposed to interpret $Y$ as a sparse tensor in which 1 indicates positive feedback and 0 missing values (see left-hand side of Figure 19.1):

\[
a_{u,t,r} =
\begin{cases}
1, & (u,t,r) \in Y \\
0, & \text{else}
\end{cases}
\]

Rendle et al. [26], on the other hand, distinguish between positive/negative examples and missing values in order to learn a personalized ranking of tags (see Section 19.4). The idea is that positive and negative examples are only generated from observed tag assignments. Observed tag assignments are interpreted as positive feedback, whereas the non-observed tag assignments of an already tagged resource are negative evidence. All other entries are assumed to be missing values (see right-hand side of Figure 19.1).

Note that in folksonomies, differently from typical RS, there are usually no numerical ratings indicating the explicit preference of a user for a given resource/tag.
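The two tensor interpretations described above can be illustrated with a small sketch. The toy tag assignments and the helper names `binary_entry` and `ranking_entry` are hypothetical, and the sketch reads "already tagged resource" as a resource already tagged by the same user, which is an assumption about the intended meaning. Tensors are stored as plain dictionaries keyed by (user, tag, resource) rather than as dense arrays.

```python
from itertools import product

# Hypothetical toy folksonomy: tag assignments Y as (user, tag, resource) triples.
Y = {
    ("u1", "python", "r1"),
    ("u1", "web", "r1"),
    ("u2", "python", "r2"),
}
users = sorted({u for u, _, _ in Y})
tags = sorted({t for _, t, _ in Y})
resources = sorted({r for _, _, r in Y})

# (user, resource) pairs for which the user assigned at least one tag.
tagged_pairs = {(u, r) for u, _, r in Y}


def binary_entry(u, t, r):
    """0/1 interpretation: 1 for an observed tag assignment, 0 otherwise."""
    return 1 if (u, t, r) in Y else 0


def ranking_entry(u, t, r):
    """Three-valued interpretation: 1 positive, 0 negative, None missing."""
    if (u, t, r) in Y:
        return 1
    if (u, r) in tagged_pairs:
        return 0      # resource tagged by u, but not with t: negative example
    return None       # u never tagged r (assumed reading): missing value


cells = list(product(users, tags, resources))
binary_tensor = {cell: binary_entry(*cell) for cell in cells}
ranking_tensor = {cell: ranking_entry(*cell) for cell in cells}
```

In the 0/1 view, every unobserved cell collapses to 0, while the three-valued view keeps the difference between "explicitly not chosen" and "unknown", which is what a ranking-based learner would need.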
Folksonomies as Hypergraphs. An equivalent, but maybe more intuitive, representation of a folksonomy is a tripartite (undirected) hypergraph $G := (V, E)$, where $V := U \,\dot\cup\, T \,\dot\cup\, R$ is the set of nodes, and $E := \{\{u,t,r\} \mid (u,t,r) \in Y\}$ is the set of hyperedges (see Figure 19.2).

⁵ The original definition [12] additionally introduces a subtag/supertag relation, which we omit here. The version used here is known in Formal Concept Analysis [7] as a triadic context [21, 34].
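As a rough illustration of the hypergraph view, the following sketch builds $V$ and $E$ from a set of tag assignments. The data, the `hyperedge` and `incident_edges` helpers, and the choice to prefix each node with its type (so that the three node sets stay disjoint even if a user ID happens to equal a tag string) are all assumptions made for this example.

```python
# Hypothetical toy tag assignments as (user, tag, resource) triples.
Y = {
    ("u1", "python", "r1"),
    ("u1", "web", "r1"),
    ("u2", "python", "r2"),
}


def hyperedge(u, t, r):
    """One hyperedge {u, t, r}; nodes carry a type prefix to keep U, T, R disjoint."""
    return frozenset({("user", u), ("tag", t), ("resource", r)})


# Node set V as the disjoint union of users, tags, and resources.
V = (
    {("user", u) for u, _, _ in Y}
    | {("tag", t) for _, t, _ in Y}
    | {("resource", r) for _, _, r in Y}
)

# Hyperedge set E: one 3-node edge per tag assignment.
E = {hyperedge(u, t, r) for u, t, r in Y}


def incident_edges(node):
    """All hyperedges that contain the given node."""
    return {e for e in E if node in e}
```

For instance, `incident_edges(("tag", "python"))` collects every tag assignment that uses the tag "python", regardless of user or resource.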
Fig. 19.1: Left [35]: 0/1 sparse tensor representation where positive feedback is interpreted as 1 and the remaining data as 0. Right [26]: Non-observed tag assignments for a given already tagged resource are negative examples. All other entries are missing values.

Fig. 19.2: Tripartite undirected hypergraph representation of a folksonomy.

19.2.2 The Traditional Recommender Systems Paradigm

Recommender systems are software applications that aim at predicting the user interest for a particular resource based on a collection of user profiles, e.g., the user's purchase history, resource ratings, click-stream data, demographic information, and so forth. Usually RS predict ratings of resources or suggest a list of new resources that the user hopefully will like the most. Traditionally, for $m$ users and $n$ resources, the user profiles are represented in a sparse user-resource matrix $X \in (\mathbb{R} \cup \{.\})^{m \times n}$, where "." denotes a missing value. The matrix can be decomposed into row vectors:

\[
X := [x_1, \ldots, x_m]^T \quad \text{with } x_u := [x_{u,1}, \ldots, x_{u,n}], \text{ for } u := 1, \ldots, m,
\]

where $x_{u,r}$ indicates that user $u$ rated resource $r$ by $x_{u,r} \in \mathbb{R}$. Each row vector $x_u$ thus corresponds to a user profile holding the resource ratings of a particular user. This decomposition usually leads to algorithms that leverage user-user similarities, such as the well-known user-based collaborative filtering (CF) [27]. The matrix can alternatively be represented by its column vectors:

\[
X := [x_1, \ldots, x_n] \quad \text{with } x_r := [x_{1,r}, \ldots, x_{m,r}]^T, \text{ for } r := 1, \ldots, n,
\]
in which each column vector $x_r$ corresponds to a specific resource's ratings by all $m$ users. This representation usually leverages item-item similarities and leads to item-based CF algorithms [3]. For a survey on neighborhood-based recommendation methods, such as CF, see Chapter 4.

Note that because of the ternary relational nature of folksonomies, traditional RS cannot be applied directly. Therefore, in order to develop RS for folksonomies, one needs to either (i) reduce the ternary relation $Y$ to a lower dimensional space (usually second-order tensors) where traditional RS can be applied, or (ii) develop new algorithms that operate over third-order tensors or tripartite undirected hypergraphs. Note that if one follows (i), care must be taken during the dimensionality reduction, since important information can be discarded, which can lower the overall accuracy of the recommendations. In Section 19.4 we present and discuss both families of algorithms.

19.2.3 Multi-mode Recommendations

Differently from the traditional RS paradigm, where one is usually concerned only with rating prediction or resource recommendations, STS users may be interested in finding resources, tags, or even other users, and therefore recommendations can be provided for any of these entity types.

The recommendation of tags is used in several systems, such as Delicious and BibSonomy. It usually involves the recommendation of tags to users, based on the tags other users have provided for the same resources. Tag recommendations can expose different facets of an information item and relieve users from the tedious task of coming up with a good set of tags. Moreover, tag recommendation can reduce the problem of tag sparsity, which results from the unwillingness of users to tag. Figure 19.5 illustrates tag recommendations in BibSonomy.
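The idea of recommending tags based on what other users have assigned to the same resource can be sketched with a simple popularity baseline. This is only an illustration under stated assumptions, not necessarily one of the algorithms discussed in Section 19.4: the toy data and the function name `recommend_tags` are hypothetical. It is also an instance of option (i) above, since the ternary relation $Y$ is collapsed into 2-way resource-tag counts and the identity of the tagging users is discarded.

```python
from collections import Counter

# Hypothetical toy tag assignments as (user, tag, resource) triples.
Y = {
    ("u1", "python", "r1"),
    ("u2", "python", "r1"),
    ("u2", "code", "r1"),
    ("u3", "web", "r1"),
    ("u3", "python", "r2"),
}


def recommend_tags(user, resource, k=2):
    """Rank tags by how many *other* users assigned them to `resource`."""
    counts = Counter(t for u, t, r in Y if r == resource and u != user)
    # Sort by descending count, then alphabetically so that ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:k]]
```

Because the user dimension is dropped, every user receives essentially the same suggestions for a given resource; this is the kind of information loss that the text warns about for dimensionality reduction.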
It is important to note that differently from traditional RS, where there is usually no repeat-buying, i.e., the user usually does not buy the same book, movie, CD, etc. twice, re-occurring tags are a common feature of STS. A tag that has already been used to annotate a resource can be reused to annotate other resources. This means that while traditional RS usually only recommend items that the user has not yet bought or rated, tag recommenders may also recommend tags that the user has already used for other resources.

The recommendation of resources is largely used in e-commerce and advertising, as in Amazon, for example. With the current trend towards STS, resource recommendation services will also be able to exploit tags to boost the recommendation quality, for example, by recommending resources to users based on the tags they have in common with other similar users. The movie recommendation website MovieLens⁶, where users rate the movies they like and receive recommendations about other movies in which they might be interested, is a notable example.

⁶ http://www.movielens.org
It started as a traditional recommender service operating over the typical user-rating binary matrix, and only recently added social tagging features, for which new tag-aware algorithms are being developed and deployed [30].

A third type of recommendation concerns recommending interesting users to a target user, which can help to connect people with common interests and encourage them to contribute and share more content. By interesting users we mean users whose profile is similar to that of the target user. If a set of tags is frequently used by many users, for example, then these users implicitly form a group with common interests, even though they may not have any physical or online connections. The tags represent the common interests of this user group.

Each mode of recommendation (tag, resource, or user) is useful, depending on the context of the particular application. Algorithms that are able to provide integrated multi-mode recommendations are very appealing, as they spare the effort of implementing and maintaining several mode-specific recommender systems.

19.3 Real World Social Tagging Recommender Systems

19.3.1 What are the Challenges?

For a recommender system to be successful in a real world application, it must address several challenges. First, the provided recommendations must match the situation, i.e., tags should describe the annotated resource, products should arouse the interest of the user, and suggested resources should be interesting and relevant. Second, the suggestions should be traceable, so that users can easily understand why particular items were suggested to them. Third, they must be delivered without delay and must be easy to access (e.g., by allowing the user to click on them or to use tab-completion when entering tags). Furthermore, the system must ensure that recommendations do not impede the normal usage of the system.

In this section we focus on tag recommendations as an example of recommenders in STS. Most STS contain a tag recommender which suggests tags to the user when she is annotating a resource. Recommending tags can serve various purposes, such as increasing the chances of getting a resource annotated, reminding a user what a resource is about, and consolidating the vocabulary across users. Furthermore, as Sood et al. [33] point out, tag recommendations “fundamentally change the tagging process from generation to recognition”, which requires less cognitive effort and time.

More formally, given a user u and a resource r, the task of a tag recommender is to predict the tags tags(u,r) the user will assign to the resource. We denote the ordered set of recommended tags by $\hat{T}(u,r)$. Although we do not take into account the order in which the user entered the tags, the order of the tags as given by the recommender plays an important role for the evaluation.
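
The task definition above can be made concrete with a small sketch. The Python below is a hypothetical baseline, not a method described in this chapter: it ranks the tags that other users assigned to the same resource by frequency and returns the top n as an ordered list. The toy triples, the function names `recommend_tags` and `precision_recall`, and the choice of precision and recall at a cutoff as evaluation measures are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy folksonomy: (user, resource, tag) triples.
POSTS = [
    ("alice", "r1", "web"),
    ("alice", "r1", "rest"),
    ("bob", "r1", "web"),
    ("bob", "r1", "api"),
    ("carol", "r1", "web"),
    ("carol", "r1", "api"),
    ("dave", "r1", "web"),
    ("carol", "r2", "music"),
]


def recommend_tags(user, resource, posts, n=5):
    """Return an ordered list of at most n recommended tags for (user, resource).

    Assumed baseline: count how often *other* users assigned each tag to the
    resource, then sort by descending count (ties broken alphabetically).
    """
    counts = Counter(t for (u, r, t) in posts if r == resource and u != user)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:n]]


def precision_recall(recommended, actual):
    """Compare a recommended tag list with the tags the user really assigned."""
    hits = len(set(recommended) & set(actual))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(actual) if actual else 0.0
    return precision, recall


# Tags that alice actually assigned to r1, i.e. tags(u, r) in the text.
actual = {t for (u, r, t) in POSTS if u == "alice" and r == "r1"}

top1 = recommend_tags("alice", "r1", POSTS, n=1)
top2 = recommend_tags("alice", "r1", POSTS, n=2)
print(top1, precision_recall(top1, actual))
print(top2, precision_recall(top2, actual))
```

Because only the first n entries of the list are shown or evaluated, the ranking decides which tags survive the cutoff. This is why the recommender's ordering matters even though the order in which the user typed her tags does not.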

Fig. 19.3: Detail showing a single bookmark post

19.3.2 BibSonomy as Study Case

19.3.2.1 System Description

BibSonomy started as a student project at the Knowledge and Data Engineering Group of the University of Kassel (http://www.kde.cs.uni-kassel.de/) in spring 2005. The goal was to implement a system for organizing BibTeX [25] entries in a way similar to bookmarks in Delicious, which was becoming more and more popular at that time. BibTeX is a popular literature management system for LaTeX [20], which many researchers use for writing scientific papers. After bookmarks were integrated as a second type of resource and the system had matured, BibSonomy was opened for public access at the end of 2005, first announced to colleagues only and later, in 2006, to the public.

A detailed view of one bookmark post in BibSonomy can be seen in Figure 19.3. The first line shows in bold the title of the bookmark, which has the URL of the bookmark as its underlying hyperlink. The second line shows an optional description the user can assign to every post. The last two lines belong together and show detailed information: first, all the tags the user has assigned to this post (web, service, tutorial, guidelines and api); second, the user name of that user (hotho), followed by a note indicating how many other users tagged that resource. These parts have underlying hyperlinks leading to the corresponding tag pages of the user, the user's page, and a page showing all four posts of this resource (i.e., the one of user hotho and those of the three other people). The structure of a publication post is very similar, as seen in Figure 19.4.

19.3.2.2 Recommendations in BibSonomy

To support the user during the tagging process and to facilitate the tagging, BibSonomy includes a tag recommender (see Figure 19.5). When a user finds an interesting web page (or publication) and posts it to BibSonomy, the system offers up to ten recommended tags on the posting page.
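
The post layout described in Section 19.3.2.1 maps naturally onto a small record type. The sketch below is illustrative only: the class `BookmarkPost`, its field names, and the helper `limit_recommendations` are hypothetical and do not reflect BibSonomy's actual data model, and the URL is a placeholder because the figure does not show it. The remaining values are taken from the post in Figure 19.3, and the cap of ten mirrors the number of recommended tags offered on the posting page.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

# The text states that up to ten recommended tags are offered.
MAX_RECOMMENDED_TAGS = 10


@dataclass
class BookmarkPost:
    """Hypothetical record for one bookmark post as displayed in the UI."""
    title: str              # first line, shown in bold and linked to the URL
    url: str
    user: str
    tags: List[str]
    posted_at: datetime
    description: str = ""   # optional second line
    other_users: int = 0    # how many other users posted the same resource

    @property
    def total_posts(self) -> int:
        """Number of posts for this resource, including this user's own."""
        return self.other_users + 1

    def byline(self) -> str:
        """Render the last line of the post."""
        stamp = self.posted_at.strftime("%Y-%m-%d %H:%M:%S")
        return f"by {self.user} and {self.other_users} other people on {stamp}"


def limit_recommendations(ranked_tags: List[str],
                          limit: int = MAX_RECOMMENDED_TAGS) -> List[str]:
    """Keep only the first `limit` entries of an already ranked tag list."""
    return list(ranked_tags[:limit])


post = BookmarkPost(
    title="REST web services",
    url="http://example.org/rest-intro",  # placeholder, not from the source
    user="hotho",
    tags=["web", "service", "tutorial", "guidelines", "api"],
    posted_at=datetime(2006, 4, 4, 16, 11, 47),
    description="Good intro to the REST architecture",
    other_users=3,
)
print(post.byline())
print(post.total_posts)
```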

Fig. 19.4: Detail showing a single publication post

Fig. 19.5: Tag recommendations in BibSonomy during annotation of a bookmark

19.3.2.3 Technological and Infrastructure Requirements

Implementing a recommendation service for BibSonomy required tackling several problems, some of which we describe here.

First, having enough data available for recommendation algorithms to produce helpful recommendations is an important requirement that must be addressed as early as the design phase. The recommender needs access to the system's database and to what the user is currently posting (which could be accomplished, e.g., by (re)loading recommendations using techniques like AJAX). Further data, such as the full text of documents, could be supplied to tackle the cold-start problem (e.g., for content-based recommenders). The system must be able to handle large amounts of data, to quickly select relevant subsets, and to provide methods for preprocessing.

The available hardware and the expected amount of data limit the choice of recommendation algorithms that can be used. Although some methods allow (partial) precomputation of recommendations, this requires extra memory and might not yield results as good as those of online computation. Both hardware and network infrastructure must ensure short response times, so that recommendations reach the user without too much delay. Together with a simple and non-intrusive user interface, this ensures usability.

Further aspects that should be taken into account include logging of user events (e.g., clicks and key presses) to allow for efficient evaluation of the recommendation methods in an online setting. Together with a live evaluation, this also makes it possible to tune the result selection strategies so that the (currently) best recommendation algorithm is chosen dynamically for the user or resource at hand. The multiplexing of several available algorithms, together with the simple inclusion of external recommendation services (by providing an open recommendation interface), is one of the recent developments in BibSonomy.

19.3.3 Tag Acquisition

The quality of tags can directly affect the recommendation performance of social tagging RS. Although folksonomies represent the “wisdom of crowds”, social tagging can present problems, such as tag sparsity (users tend to provide only a small number of tags), polysemy (tags are subject to multiple interpretations), or tag idiosyncrasy (tags used for personal organization, like “to read”). All these problems can harm the quality of recommendations. For this reason, we consider alternative ways of acquiring tags. This will help us to better characterize the advantages and disadvantages of the social tagging process.
We then examine the following tag acquisition methods:

• Expert Tagging: This approach usually relies on a small number of domain experts, who annotate resources using mainly structured vocabularies. Experts provide tags that are objective and cover multiple aspects. Pandora (http://www.pandora.com/) is a notable example of a system that uses experts for tagging music resources. The main advantage of using experts is the resulting well-agreed-upon tag vocabulary. This comes, of course, at the cost of manual work, which is both time consuming and expensive.

• Tagging based on annotation games: Games with a purpose (GWAP) [39], like the ESPGame (http://www.gwap.com/gwap/gamesPreview/), build on the breakthrough idea of using a game to employ humans for the purpose of annotation. Two players simultaneously observe the same image and are asked to enter tags until they both enter the same tag. Following the success of ESPGame, several others appeared (e.g., ListenGame, http://www.listengame.org/) in the