STaR: a Social Tag Recommender System Cataldo Musto. Fedelucio Narducci Marco de gemmis Pasquale Lops, and Giovanni Semeraro Department of Computer Science, University of Bari "Aldo Moro", Italy imusto, narducci, degemmis, lops, semeraroj@di uniba.it Abstract. The continuous growth of collaborative platforms we are re- cently witnessing made possible the passage from an elitary' Web, writ- ten by few and read by many, towards the so-called Web 2.0, a more user-centric'vision, where users become active contributors in Web dy- namics. In this context, collaborative tagging systems are rapidly emerg- ing: in these platforms users can annotate resources they like with freel chosen keyword(called tags) in order to make retrieval of information and serendipitous browsing more and more easier. However, as tags are andled in a simply syntactical way, collaborative tagging systems suffer of typical Information Retrieval(IR)problems like polysemy and onymy: so, in order to reduce the impact of these drawbacks and to aid at the same time the so-called tag convergence, systems that assist the user in the task of tagging are required. The goal of these systems(called tag recommenders)is to suggest a set of relevant key words for the re- ources to be annotated by exploiting different approaches In this paper we present a tag recommender developed for the ECML-PKDD 2009 Discovery Challenge Our approach is based on two assumptions: firstly if two or more resources share some common patterns(e.g. the same fea- tures in the textual description), we can exploit this information suppos- that they could be annotated with similar tags. Furthermore, since each user has a typical manner to label resources, a tag recommender might exploit this information to weigh more the tags she already used Key words: Recommender Systems, Web 2.0, Collaborative Tagging Systems, Folksonomic 1 Introduction The coming of Web 2.0 has changed the role of Internet users and the shape of services offered by the World wide Web. Since web sites tend to be more interac- tive and user-centric than in the past, users are shifting from passive consumers of information to active produce Web 2.0 applications, users are ab to easily publish content such as photos, videos, political opinions, reviews, so they are identified as Web prosumers: producers consumers of knowledge One of the forms of user-generated content (UGC) that has drawn more at- tention from the research community is tagging, which is the act of annotating
STaR: a Social Tag Recommender System Cataldo Musto, Fedelucio Narducci, Marco de Gemmis, Pasquale Lops, and Giovanni Semeraro Department of Computer Science, University of Bari “Aldo Moro”, Italy {musto,narducci,degemmis,lops,semeraro}@di.uniba.it Abstract. The continuous growth of collaborative platforms we are recently witnessing made possible the passage from an ‘elitary’ Web, written by few and read by many, towards the so-called Web 2.0, a more ‘user-centric’ vision, where users become active contributors in Web dynamics. In this context, collaborative tagging systems are rapidly emerging: in these platforms users can annotate resources they like with freely chosen keyword (called tags) in order to make retrieval of information and serendipitous browsing more and more easier. However, as tags are handled in a simply syntactical way, collaborative tagging systems suffer of typical Information Retrieval (IR) problems like polysemy and synonymy: so, in order to reduce the impact of these drawbacks and to aid at the same time the so-called tag convergence, systems that assist the user in the task of tagging are required. The goal of these systems (called tag recommenders) is to suggest a set of relevant keywords for the resources to be annotated by exploiting different approaches. In this paper we present a tag recommender developed for the ECML-PKDD 2009 Discovery Challenge. Our approach is based on two assumptions: firstly, if two or more resources share some common patterns (e.g. the same features in the textual description), we can exploit this information supposing that they could be annotated with similar tags. Furthermore, since each user has a typical manner to label resources, a tag recommender might exploit this information to weigh more the tags she already used to annotate similar resources. Key words: Recommender Systems, Web 2.0, Collaborative Tagging Systems, Folksonomies 1 Introduction The coming of Web 2.0 has changed the role of Internet users and the shape of services offered by the World Wide Web. Since web sites tend to be more interactive and user-centric than in the past, users are shifting from passive consumers of information to active producers. By using Web 2.0 applications, users are able to easily publish content such as photos, videos, political opinions, reviews, so they are identified as Web prosumers: producers + consumers of knowledge. One of the forms of user-generated content (UGC) that has drawn more attention from the research community is tagging, which is the act of annotating
resources of interests with free keywords, called tags, in order to help users in organizing, browsing and searching resources through the building of a socially onstructed classification schema, called folksonomy [18. In contrast to systems here information about resources is only provided by a small set of experts collaborative tagging systems take into account the way individuals conceive the information contained in a resource [ 19. Well-known example of platforms that embed tagging activity are Flickr to share photos, YouTube to share videos Del iciousto share bookmarks, Last. m to share music listening habits and Bibsonomy to share bookmarks and lists of literature. Although these systems provide heterogeneous contents, they have a common core: once a user is logged in, she can post a new resource and choose some significant keywords to identify it. Besides, users can label resources previously posted from other users. This phenomenon represents a very important opportunity to categorize the resources on the web, otherwise hardly feasible. The act of tagging resources from different users is the social aspect of this activity; in this way tags create a connection among users and items. Users that label the same resource by using the same tags could have similar tastes and items labeled with the same tags could have common characteristics Many would argue that the power of tagging lies in the ability for people to freely determine the appropriate tags for a resource without having to rely on a predefined lexicon or hierarchy [11]. Indeed, folksonomies are fully free and reflect the user mind, but they suffer of the same problems of unchecked vocabulary. Golder et. al. 5 identified three major problems with current tagging systems polysemy, synonymy, and level variation. Polysemy refers to situations where tags can have multiple meanings: for example a resource tagged with the term turkey could indicate a news taken from an online newspaper about politics or a recipe for Thanksgiving Day. When multiple tags share a single meaning we refer to it as synonymy. In collaborative tagging systems we can have simple morphological variations(for example we can find 'blog', ',,web log, to identify a common blog) but also semantic similarity(like resources tagged with arts'versus'cultural heritage ) The third problem, called level variations, refers to the phenomenon of tagging at different level of abstraction. Some people can annotate a web page containing a recipe for roast turkey with the tag roast turkeybut also with a simple recipe In order to avoid these problems, in the last years many tools have been developed to facilitate the user in the task of tagging and to aid the tag con- vergence [4: these systems are know as tag recommenders. When a user posts a resource in a Web 2.0 platform, a tag recommender suggests some significant keywords to label the item following some criteria to filter out the noise from the complete tag space http://www.flickr.com 2http://www.youtube.com http://delicious.com/ http://www.last.fm/ http://www.bibsonomy.or
resources of interests with free keywords, called tags, in order to help users in organizing, browsing and searching resources through the building of a sociallyconstructed classification schema, called folksonomy [18]. In contrast to systems where information about resources is only provided by a small set of experts, collaborative tagging systems take into account the way individuals conceive the information contained in a resource [19]. Well-known example of platforms that embed tagging activity are Flickr1 to share photos, YouTube2 to share videos, Del.icio.us3 to share bookmarks, Last.fm4 to share music listening habits and Bibsonomy5 to share bookmarks and lists of literature. Although these systems provide heterogeneous contents, they have a common core: once a user is logged in, she can post a new resource and choose some significant keywords to identify it. Besides, users can label resources previously posted from other users. This phenomenon represents a very important opportunity to categorize the resources on the web, otherwise hardly feasible. The act of tagging resources from different users is the social aspect of this activity; in this way tags create a connection among users and items. Users that label the same resource by using the same tags could have similar tastes and items labeled with the same tags could have common characteristics. Many would argue that the power of tagging lies in the ability for people to freely determine the appropriate tags for a resource without having to rely on a predefined lexicon or hierarchy [11]. Indeed, folksonomies are fully free and reflect the user mind, but they suffer of the same problems of unchecked vocabulary. Golder et. al. [5] identified three major problems with current tagging systems: polysemy, synonymy, and level variation. Polysemy refers to situations where tags can have multiple meanings: for example a resource tagged with the term turkey could indicate a news taken from an online newspaper about politics or a recipe for Thanksgiving’ Day. When multiple tags share a single meaning we refer to it as synonymy. In collaborative tagging systems we can have simple morphological variations (for example we can find ‘blog’, ‘blogs’, ‘web log’, to identify a common blog) but also semantic similarity (like resources tagged with ‘arts’ versus ‘cultural heritage’). The third problem, called level variations, refers to the phenomenon of tagging at different level of abstraction. Some people can annotate a web page containing a recipe for roast turkey with the tag ‘roastturkey’ but also with a simple ‘recipe’. In order to avoid these problems, in the last years many tools have been developed to facilitate the user in the task of tagging and to aid the tag convergence [4]: these systems are know as tag recommenders. When a user posts a resource in a Web 2.0 platform, a tag recommender suggests some significant keywords to label the item following some criteria to filter out the noise from the complete tag space. 1 http://www.flickr.com 2 http://www.youtube.com 3 http://delicious.com/ 4 http://www.last.fm/ 5 http://www.bibsonomy.org/
This paper presents STaR(Social Tag Recommender system), a tag recom- mender system developed for the ECML-PKDD 2009 Discovery Challenge. The idea behind our work is that folksonomies create connections among users and items, so we tried to point out two concepts: Resources with similar content could be annotated with similar tags a tag recommender needs to take into account the previous tagging activity of users, by weighting more tags already used to annotate similar resources In this work we identify two main aspects in the tag recommendation task firstly, each user has a typical manner to label resources (for example using personal tags such as beautiful, ugly,,pleasant, etc. which are not connecte to the content of the item, or simply tagging using general tags like 'politic sport, etc. ); next, similar resources usually share common tags: when a user posts a resource r on the platform, our system takes into account how she(if she is already stored in the system)and the entire community previously tagged resources similar to r in order to suggest relevant tags. Next, we develop this model and we tested it on a dataset extracted from BibSonomy. The paper is organized as follows. Section 2 analyzes related work. The gen- eral problem of tag recommendation is introduced in Section 3. Section 4 explains the architecture of the system and how the recommendation approach is imple- mented. The experimental section carried out is described in Section 5.1, while conclusions and future works are drawn in last section 2 Related Work Previous work in the tag recommendation area can be broadly divided into three classes: content-based, collaborative and graph-based approaches. In the content-based approach, a system exploits some textual source with Information Retrieval-related techniques 1 in order to extract relevant unigrams or bigrams from the text. Brooks et. al 3, for example, develop a tag recom- mender system that automatically suggests tags for a blog post extracting the top three terms exploiting TF/IDF scoring [14. The system presented by Lee nd Chun 8 recommends tags retrieved from the content of a blog using artificial neural networks The network is trained based on statistical information about word frequencies and lexical information about word semantics extracted from WordNet. The collaborative approach for tag recommendation, instead, presents by Mishne and implemented in Auto Tag [12], the system suggests tags basedo some analogies with collaborative filtering methods 2. In the model proposed the other tags associated with similar posts in a given collection. The recommen- dation process is performed in three steps: first, the tool finds similar posts and extracts their tags. All the tags are then merged, building a general folksonomy that is filtered and reranked. The top-ranked tags are suggested to the user, who selects the most appropriate ones to attach to the post. Tag Assist [16 improves the AutoTags'approach performing a lossless compression over existing tag data It finds similar blog posts and suggests a subset of the associated tag through a
This paper presents STaR (Social Tag Recommender system), a tag recommender system developed for the ECML-PKDD 2009 Discovery Challenge. The idea behind our work is that folksonomies create connections among users and items, so we tried to point out two concepts: – Resources with similar content could be annotated with similar tags; – A tag recommender needs to take into account the previous tagging activity of users, by weighting more tags already used to annotate similar resources. In this work we identify two main aspects in the tag recommendation task: firstly, each user has a typical manner to label resources (for example using personal tags such as ‘beautiful’, ‘ugly’, ‘pleasant’, etc. which are not connected to the content of the item, or simply tagging using general tags like ‘politics’, ‘sport’, etc.); next, similar resources usually share common tags: when a user posts a resource r on the platform, our system takes into account how she (if she is already stored in the system) and the entire community previously tagged resources similar to r in order to suggest relevant tags. Next, we develop this model and we tested it on a dataset extracted from BibSonomy. The paper is organized as follows. Section 2 analyzes related work. The general problem of tag recommendation is introduced in Section 3. Section 4 explains the architecture of the system and how the recommendation approach is implemented. The experimental section carried out is described in Section 5.1, while conclusions and future works are drawn in last section. 2 Related Work Previous work in the tag recommendation area can be broadly divided into three classes: content-based, collaborative and graph-based approaches. In the content-based approach, a system exploits some textual source with Information Retrieval-related techniques [1] in order to extract relevant unigrams or bigrams from the text. Brooks et. al [3], for example, develop a tag recommender system that automatically suggests tags for a blog post extracting the top three terms exploiting TF/IDF scoring [14]. The system presented by Lee and Chun [8] recommends tags retrieved from the content of a blog using artificial neural networks. The network is trained based on statistical information about word frequencies and lexical information about word semantics extracted from WordNet. The collaborative approach for tag recommendation, instead, presents some analogies with collaborative filtering methods [2]. In the model proposed by Mishne and implemented in AutoTag [12], the system suggests tags based on the other tags associated with similar posts in a given collection. The recommendation process is performed in three steps: first, the tool finds similar posts and extracts their tags. All the tags are then merged, building a general folksonomy that is filtered and reranked. The top-ranked tags are suggested to the user, who selects the most appropriate ones to attach to the post. TagAssist [16] improves the AutoTags’ approach performing a lossless compression over existing tag data. It finds similar blog posts and suggests a subset of the associated tag through a
Tag Suggestion Engine(TSE)which leverages previously tagged posts providing appropriate suggestions for new content. In [10 the tag recommendations task is performed through a user-based collaborative filtering approach. The method seems to produce good results when applied on the user-tag matrix, so they show hat users with a similar tag vocabulary tend to tag alike The problem of tag recommendation through graph-based approaches has been firstly addressed by Jaschke et al in 7. They compared some recommendation techniques including collaborative filtering, PageRank and FolkRank. The key idea behind Folk Rank algorithm is that a resource which is tagged by important tags from impor tant users becomes important itself. The same concept holds for tags and users, thus the approach uses a graph whose vertices mutually reinforce themselves by spreading their weights. The evaluation showed that Folk Rank outperforms Schmitz et al. [15 proposed association rule 1 mining as a nique that might be useful in the tag recommendation process. In literature we can find also some hybrid methods integrating two or more approaches(mainly content and collaborative ones) in order to reduce their typical drawbacks an point out their qualities. Heymann et. al [6 present a tag recommender that ex- ploits at the same time social knowledge and textual sources. They suggest tags based on page text, anchor text, surrounding hosts, adding tags used by others users to label the URL. The effectiveness of this approach is also confirmed by the use of a large dataset crawled from del icio us for the experimental evalua- tion. A hybrid approach is also proposed by Lipczak in 9. Firstly, the system extracts tags from the title of the resource. Afterwards, based on an analysis usually co-occur with terms in the title. Finally, tags are filtered and reranked exploiting the information stored in a so-called"personomy'", the set of the tags previously used by the user. t Finaly, in (17]the authors proposed a model based on both textual content nd tags associated with the resource. They introduce the concept of conflate igs to indicate a set of related tag(like blog, blogs, ecc. )used to annotate a resource. Modeling in this way the existing tag space they are able to suggest various tags for a given bookmark exploiting both user and document models They win the previous edition of the Tag Recommendation Challenge 3 Description of the Task STar has been designed to participate at the ECML-PKDD 2009 Discovery Challenge. In this section we will firstly introduce a formal model for recom- mendation in folksonomies, then we will analyze the specific requirements of the task proposed for the Challenge http://www.kde.cs.uni-kassel.de/ws/dc09
Tag Suggestion Engine (TSE) which leverages previously tagged posts providing appropriate suggestions for new content. In [10] the tag recommendations task is performed through a user-based collaborative filtering approach. The method seems to produce good results when applied on the user-tag matrix, so they show that users with a similar tag vocabulary tend to tag alike. The problem of tag recommendation through graph-based approaches has been firstly addressed by J¨aschke et al. in [7]. They compared some recommendation techniques including collaborative filtering, PageRank and FolkRank. The key idea behind FolkRank algorithm is that a resource which is tagged by important tags from important users becomes important itself. The same concept holds for tags and users, thus the approach uses a graph whose vertices mutually reinforce themselves by spreading their weights. The evaluation showed that FolkRank outperforms other approaches. Schmitz et al. [15] proposed association rule mining as a technique that might be useful in the tag recommendation process. In literature we can find also some hybrid methods integrating two or more approaches (mainly, content and collaborative ones) in order to reduce their typical drawbacks and point out their qualities. Heymann et. al [6] present a tag recommender that exploits at the same time social knowledge and textual sources. They suggest tags based on page text, anchor text, surrounding hosts, adding tags used by others users to label the URL. The effectiveness of this approach is also confirmed by the use of a large dataset crawled from del.icio.us for the experimental evaluation. A hybrid approach is also proposed by Lipczak in [9]. Firstly, the system extracts tags from the title of the resource. Afterwards, based on an analysis of co-occurrences, the set of candidate tags is expanded adding also tags that usually co-occur with terms in the title. Finally, tags are filtered and reranked exploiting the information stored in a so-called ”personomy”, the set of the tags previously used by the user. Finally, in [17] the authors proposed a model based on both textual content and tags associated with the resource. They introduce the concept of conflated tags to indicate a set of related tag (like blog, blogs, ecc.) used to annotate a resource. Modeling in this way the existing tag space they are able to suggest various tags for a given bookmark exploiting both user and document models. They win the previous edition of the Tag Recommendation Challenge. 3 Description of the Task STaR has been designed to participate at the ECML-PKDD 2009 Discovery Challenge6 . In this section we will firstly introduce a formal model for recommendation in folksonomies, then we will analyze the specific requirements of the task proposed for the Challenge. 6 http://www.kde.cs.uni-kassel.de/ws/dc09
3.1 Recommendation in Folksonomies A collaborative tagging system is a platform composed of users, resources and tags that allows users to freely assign tags to resources. Following the definition introduced in 7, a folksonomy can be described as a triple (U,R, T)where -U is a set of users: R is a set of resource. T is a set of tags We can also define a tag assignment function TAS: UX R-T. The tag recommendation task for a given user uE U and a resource r E R can be finally described as the generation of a set of tags TAS(u, r)cT according to some relevance model In our approach these tags are generated from a ranked set of candidate tags from which the top n elements are suggested to the user 3.2 Description of the ECML-PKDD 2009 Discovery Challenge The 2009 edition of the Discovery Challenge consists of three recommendation tasks in the area of social bookmarking. We compete for the first task, content- based tag recommendation, whose goal is to exploit content-based recommenda- tion approaches in order to provide a relevant set of tags to the user when she submits a new item(Bookmark or BibTeX entry) into Bibsonomy tag as- signment: the dataset contains 263, 004 bookmark posts and 158, 924 Bib TeX en- tries submitted by 3, 617 different users. For each of the 235, 328 different URLs and the 143, 050 different BibTeX entries were also provided some textual meta- data(such as the title of the resource, the description, the abstract and so on) Each candidate recommender is evaluated by comparing the real tags(namely the tags a user adopts to annotate an unseen resource) with the suggested ones The accuracy is finally computed using classical IR metrics, such as Precision Recall and F1-Measure( Section 5.1) By analyzing the aforementioned requirements, we designed STaR thinking at a prediction task rather than a recommendation one. Consequently, we will try to emphasize the previous tagging activity of the user, also looking for connections and patterns among resources. All these decisions will be thoroughly analyzed in the next section describing the architecture of STaR 4 STaR: a Social Tag Recommender System STaR (Social Tag Recommender) is a content-based tag recommender system developed at the University of Bari. The inceptive idea behind STaR is to im- prove the model implemented in systems like Tag Assist [16 or AutoTag [12 Although we agree with the idea that resources with similar content could be annotated with similar tags, in our opinion Mishne's approach presents t portant drawbacks
3.1 Recommendation in Folksonomies A collaborative tagging system is a platform composed of users, resources and tags that allows users to freely assign tags to resources. Following the definition introduced in [7], a folksonomy can be described as a triple (U, R, T) where: – U is a set of users; – R is a set of resources; – T is a set of tags. We can also define a tag assignment function tas: U × R → T. The tag recommendation task for a given user u ∈ U and a resource r ∈ R can be finally described as the generation of a set of tags tas(u, r) ⊆ T according to some relevance model. In our approach these tags are generated from a ranked set of candidate tags from which the top n elements are suggested to the user. 3.2 Description of the ECML-PKDD 2009 Discovery Challenge The 2009 edition of the Discovery Challenge consists of three recommendation tasks in the area of social bookmarking. We compete for the first task, contentbased tag recommendation, whose goal is to exploit content-based recommendation approaches in order to provide a relevant set of tags to the user when she submits a new item (Bookmark or BibTeX entry) into Bibsonomy. The organizers make available a training set with some examples of tag assignment: the dataset contains 263,004 bookmark posts and 158,924 BibTeX entries submitted by 3,617 different users. For each of the 235,328 different URLs and the 143,050 different BibTeX entries were also provided some textual metadata (such as the title of the resource, the description, the abstract and so on). Each candidate recommender is evaluated by comparing the real tags (namely, the tags a user adopts to annotate an unseen resource) with the suggested ones. The accuracy is finally computed using classical IR metrics, such as Precision, Recall and F1-Measure (Section 5.1). By analyzing the aforementioned requirements, we designed STaR thinking at a prediction task rather than a recommendation one. Consequently, we will try to emphasize the previous tagging activity of the user, also looking for connections and patterns among resources. All these decisions will be thoroughly analyzed in the next section describing the architecture of STaR. 4 STaR: a Social Tag Recommender System STaR (Social Tag Recommender) is a content-based tag recommender system, developed at the University of Bari. The inceptive idea behind STaR is to improve the model implemented in systems like TagAssist [16] or AutoTag [12]. Although we agree with the idea that resources with similar content could be annotated with similar tags, in our opinion Mishne’s approach presents two important drawbacks:
Resource Candidat Recommendation Fig. 1. Architecture of STar 1. The tag reranking formula simply performs a sum of the occurrences of each tag among all the folksonomies, without considering the similarity with the esource to be tagged. In this way tags often used to annotate resources with a low similarity level could be ranked first 2. The proposed model does not take into account the previous tagging activ- ty performed by users. If two users bookmarked the same resource, the will receive the same suggestions since the folksonomies built from similar resources are the same We will try to overcome these draw backs, by proposing an approach based on the analysis of similar resources capable also of weighting more the tags alread selected by the user during her previous tagging activity. Figure 1 shows the general architecture of STaR. The recommendation process is performed in four steps, each of which is handled by a separate component 4.1 Indexing of Resources Given a collection of resources(corpus), a preprocessing step is performed by the Indexer module, which exploits Apache Lucene?' to perform the indexing step As regards bookmarks we indexed the title of the web page and the extende description provided by users. For the Bibtex entries we indexed the title of the publication and the abstract. Let u be the set of users and n the cardi nality of this set, the indexing procedure is repeated N+I times: we build an index for each user(Personal Inder) storing the information on her previously tagged resources and an index for the whole community( Social Inder) storing the information about all the resources previously tagged by the 7http://lucene.apache.org
Fig. 1. Architecture of STaR 1. The tag reranking formula simply performs a sum of the occurrences of each tag among all the folksonomies, without considering the similarity with the resource to be tagged. In this way tags often used to annotate resources with a low similarity level could be ranked first. 2. The proposed model does not take into account the previous tagging activity performed by users. If two users bookmarked the same resource, they will receive the same suggestions since the folksonomies built from similar resources are the same. We will try to overcome these drawbacks, by proposing an approach based on the analysis of similar resources capable also of weighting more the tags already selected by the user during her previous tagging activity. Figure 1 shows the general architecture of STaR. The recommendation process is performed in four steps, each of which is handled by a separate component. 4.1 Indexing of Resources Given a collection of resources (corpus), a preprocessing step is performed by the Indexer module, which exploits Apache Lucene7 to perform the indexing step. As regards bookmarks we indexed the title of the web page and the extended description provided by users. For the BibteX entries we indexed the title of the publication and the abstract. Let U be the set of users and N the cardinality of this set, the indexing procedure is repeated N + 1 times: we build an index for each user (Personal Index ) storing the information on her previously tagged resources and an index for the whole community (Social Index ) storing the information about all the resources previously tagged by the community. 7 http://lucene.apache.org
des following the definitions presented in Section 3. 1, given a useruEUwe U×R→T to a resource annotated by a given user. SocialInder represents the union of all the user personal indexes 4.2 Retrieving of Similar Resources At the end of the preprocessing step STar is able to take into account users requests. Every user interacts with STaR by providing information about a re- source to be tagged. In the Query Processing step the system acquires data about the user(her language, the tags she uses more, the number of tags she usually uses to annotate resources, etc. before processing(through the elimination of not useful characters and punctuation) and submitting the query against the SocialInder stored in Lucene. If the user is recognized by the system since it has previously tagged some other resources, the same query is submitted against her own PersonalInder, as well. We used as query the title of the web page (for bookmarks)or the title of the publication(for Bib TeX entries ). In order to improve the performances of the Lucene Querying Engine we replaced the original Lucene Scoring function with an Okapi BM25 implementation. BM25 is nowadays considered as one of the state-of-the art retrieval models by the Ir community [13] Let D be a corpus of documents, d E D, BM25 returns the top-k resources with the highest similarity value given a resource r(tokenized as a set of terms t1.tm), and is defined as follows: im(r ,d)=> idf(ti) k(1-b) where nf, represents the occurrences of the term ti in the document d, length is the length of the resource r and aug length is the average length of resources in the corpus. Finally, ki and b are two parameters typically set to 2.0 and 0.75 respectively, and idf (ti) represents the inverse document frequency of the term t if(4)=loN+可(ta)+0.5 djf(t1)+0.5 (4) ht
Following the definitions presented in Section 3.1, given a user u ∈ U we define P ersonalIndex(u) as: P ersonalIndex(u) = {r ∈ R|∃t ∈ T : tas(u, r) = t} (1) where tas is the tag assignment function tas: U × R → T which assigns tags to a resource annotated by a given user. SocialIndex represents the union of all the user personal indexes: SocialIndex = [ N i=1 P ersonalIndex(ui) (2) 4.2 Retrieving of Similar Resources At the end of the preprocessing step STaR is able to take into account users requests. Every user interacts with STaR by providing information about a resource to be tagged. In the Query Processing step the system acquires data about the user (her language, the tags she uses more, the number of tags she usually uses to annotate resources, etc.) before processing (through the elimination of not useful characters and punctuation) and submitting the query against the SocialIndex stored in Lucene. If the user is recognized by the system since it has previously tagged some other resources, the same query is submitted against her own PersonalIndex, as well. We used as query the title of the web page (for bookmarks) or the title of the publication (for BibTeX entries). In order to improve the performances of the Lucene Querying Engine we replaced the original Lucene Scoring function with an Okapi BM25 implementation8 . BM25 is nowadays considered as one of the state-of-the art retrieval models by the IR community [13]. Let D be a corpus of documents, d ∈ D, BM25 returns the top-k resources with the highest similarity value given a resource r (tokenized as a set of terms t1 . . . tm), and is defined as follows: sim(r, d) = Xm i=1 n r ti k1((1 − b) + b ∗ lengthr avgLengthr ) + n r ti ∗ idf(ti) (3) where n r ti represents the occurrences of the term ti in the document d, lengthr is the length of the resource r and avgLengthr is the average length of resources in the corpus. Finally, k1 and b are two parameters typically set to 2.0 and 0.75 respectively, and idf(ti) represents the inverse document frequency of the term ti defined as follows: idf(ti) = log N + df(ti) + 0.5 df(ti) + 0.5 (4) 8 http://nlp.uned.es/ jperezi/Lucene-BM25/
0.8 0.4 Similar Resources 0.7 Fig. 2. Retrieving of Similar Resources where N is the number of resources in the collection and df(ti) is the number of resources in which the term t: occurs. Given user u e U and a resource r. Lucene returns the resources whose similarity with r is greater or equal than a threshold B. To perform this task Lucene uses both the personallnder of the user u and the sociallnder, more Personal Res(u, q)=re PersonalInder(u)lsim(q, r)28(5) Sociales(q)={r∈ SocialIndea sim(q,r)≥B} Figure 2 depicts an example of the retrieving step. In this case the target resource is represented by Gazzetta. t, one of the most famous Italian sport newspaper. Lucene queries the SocialInder and returns as the most similar re- ources an online newspaper(Corrieredellosport it)and the official web site of an Italian Football Club(Inter. it ). The PersonalInder, instead, returns another online newspaper (Tuttosport. com). The similarity score returned by Lucene has been normalized 4.3 Extraction of Candidate Tags In the next step the Tag Extractor gets the most similar resources returned by the apache Lucene engine and produces the set of candidate tags to be sug- gested, by computing for each tag a score obtained by weighting the similarity
Fig. 2. Retrieving of Similar Resources where N is the number of resources in the collection and df(ti) is the number of resources in which the term ti occurs. Given user u ∈ U and a resource r, Lucene returns the resources whose similarity with r is greater or equal than a threshold β. To perform this task Lucene uses both the PersonalIndex of the user u and the SocialIndex. More formally: P ersonalRes(u, q) = {r ∈ P ersonalIndex(u)|sim(q, r) ≥ β} (5) SocialRes(q) = {r ∈ SocialIndex|sim(q, r) ≥ β} (6) Figure 2 depicts an example of the retrieving step. In this case the target resource is represented by Gazzetta.it, one of the most famous Italian sport newspaper. Lucene queries the SocialIndex and returns as the most similar resources an online newspaper (Corrieredellosport.it) and the official web site of an Italian Football Club (Inter.it). The PersonalIndex, instead, returns another online newspaper (Tuttosport.com). The similarity score returned by Lucene has been normalized. 4.3 Extraction of Candidate Tags In the next step the Tag Extractor gets the most similar resources returned by the Apache Lucene engine and produces the set of candidate tags to be suggested, by computing for each tag a score obtained by weighting the similarity
score returned by Lucene with the normalized occurrence of the tag. If the Tag Extractor also gets the list of the most similar resources from the user Person- weight to each folksonomy in order to boost users' previously used tag: Sning a Inder, it will produce two partial folksonomies that are merged, assig Formally, for each query q(namely, the resource to be tagged ), we can define a set of tags to recommend by building two sets: candTagsp and candTagsa These sets are defined as follows: candTagsp(u, q)=tETt= TAS(u, r)ArE Personal Res(u, 9) (7) candTags(q)={t∈Tt=TAS(u,r)Ar∈ Sociales(q)∧a∈U}(8) In the same way we can compute the relevance of each tag with respect to the query q as: Personal Res(u, q) nf sim(r, q) relp(t, u, q) rels(t, q)=<rESocialRes(a)nr sim(r, q) (10) where nf is the number of occurrences of the tag t in the annotation for resource r and n' is the sum of the occurrences of tag t among all similar resources Finally, the set of Candidate Tags can be defined as candTags(u, q)=candTagsp(u, q)UcandTagss(a where for each tag t the global relevance can be defined as rel(t, q)=a*relp(t, q)+(1-a)*rels (t, q) where a(PersonalTag Weight )and(1-a)(SocialTag Weight)are the weights of the personal and social tags respectively Figure 3 depicts the procedure performed by the Tag Extractor: in this case we have a set of 4 Social Tags(Newspaper, Online, Football and Inter) and 3 Personal Tags(Sport, Newspaper and Tuttosport ) These sets are then merged building the set of Candidate Tags. This set contains 6 tags since the tag new paper appears both in social and personal tags. The system associates a score to each tag that indicates its effectiveness for the target resource. Besides, the scores for the Candidate Tags are weighted again according to SocialTag Weight (a) and PersonalTag Weight(1-a) values(in the example, 0. 3 and 0.7 respec tively), in order to boost the tags already used by the user in the final tag rank Indeed, we can point out that the social tag 'football gets the same score of the personal tag 'tuttosport', although its original weight was twice
score returned by Lucene with the normalized occurrence of the tag. If the Tag Extractor also gets the list of the most similar resources from the user PersonalIndex, it will produce two partial folksonomies that are merged, assigning a weight to each folksonomy in order to boost users’ previously used tags. Formally, for each query q (namely, the resource to be tagged), we can define a set of tags to recommend by building two sets: candT agsp and candT agss. These sets are defined as follows: candT agsp(u, q) = {t ∈ T|t = T AS(u, r) ∧ r ∈ P ersonalRes(u, q)} (7) candT agss(q) = {t ∈ T|t = T AS(u, r) ∧ r ∈ SocialRes(q) ∧ u ∈ U} (8) In the same way we can compute the relevance of each tag with respect to the query q as: relp(t, u, q) = P r∈P ersonalRes(u,q) n t r ∗ sim(r, q) nt (9) rels(t, q) = P r∈SocialRes(q) n t r ∗ sim(r, q) nt (10) where n t r is the number of occurrences of the tag t in the annotation for resource r and n t is the sum of the occurrences of tag t among all similar resources. Finally, the set of Candidate Tags can be defined as: candT ags(u, q) = candT agsp(u, q) ∪ candT agss(q) (11) where for each tag t the global relevance can be defined as: rel(t, q) = α ∗ relp(t, q) + (1 − α) ∗ rels(t, q) (12) where α (PersonalTagWeight) and (1 − α) (SocialTagWeight) are the weights of the personal and social tags respectively. Figure 3 depicts the procedure performed by the Tag Extractor : in this case we have a set of 4 Social Tags (Newspaper, Online, Football and Inter) and 3 Personal Tags (Sport, Newspaper and Tuttosport). These sets are then merged, building the set of Candidate Tags. This set contains 6 tags since the tag newspaper appears both in social and personal tags. The system associates a score to each tag that indicates its effectiveness for the target resource. Besides, the scores for the Candidate Tags are weighted again according to SocialTagWeight (α) and PersonalTagWeight (1 − α) values (in the example, 0.3 and 0.7 respectively), in order to boost the tags already used by the user in the final tag rank. Indeed, we can point out that the social tag ‘football’ gets the same score of the personal tag ‘tuttosport’, although its original weight was twice
:0.8 °H+04 7 Fig 3. Description of the process performed by the Tag Extractor 4. 4 Tag re undation The Tag Extractor produces the set of the Candidate Tags, a ranked set of tags with their relevance scores. This set is exploited by the Filter, a component which performs the last step of the recommendation task, that is removing those tags not matching specific conditions: we fix a threshold for the relevance score between 0.20 to 0.25 and we return at most 5 tags. These parameters are strictly dependent from the training data Formally, given a useruEU, a query q and a threshold value y, the goal of the filtering component is to build recommendation(u, q) defined as follows recommendation(u, ) =t e candTags(u, grel(t, q)>7( In the example in Figure 3, setting a threshold y=0.20, the system would suggest the tags sport and newspaper. 5 Experimental Evaluations 5.1 Experimental Session In this experiment we measure the performance of STaR in the Task 1 of the ECML-PKDD 2009 Discovery Challenge. This experimental evaluation was car ried out according to the instructions provided from the organizers of the Chal- lenge 2009. The test set was released 48 hours before the end of the competition Every participant uploaded a file containing the tag predictions, and for each post only five tags were considered. F1-Measure was used to evaluate the accu- racy of recommendations, thus for each post Precision and Recall were computed by comparing the recommended tags with the true tags assigned by the users The case of tags was ignored and all characters which are neither numbers nor letters were removed. Results are presented in Table 1
Fig. 3. Description of the process performed by the Tag Extractor 4.4 Tag Recommendation The Tag Extractor produces the set of the Candidate Tags, a ranked set of tags with their relevance scores. This set is exploited by the Filter, a component which performs the last step of the recommendation task, that is removing those tags not matching specific conditions: we fix a threshold for the relevance score between 0.20 to 0.25 and we return at most 5 tags. These parameters are strictly dependent from the training data. Formally, given a user u ∈ U, a query q and a threshold value γ, the goal of the filtering component is to build recommendation(u, q) defined as follows: recommendation(u, q) = {t ∈ candT ags(u, q)|rel(t, q) > γ} (13) In the example in Figure 3, setting a threshold γ = 0.20, the system would suggest the tags sport and newspaper. 5 Experimental Evaluations 5.1 Experimental Session In this experiment we measure the performance of STaR in the Task 1 of the ECML-PKDD 2009 Discovery Challenge. This experimental evaluation was carried out according to the instructions provided from the organizers of the Challenge 2009. The test set was released 48 hours before the end of the competition. Every participant uploaded a file containing the tag predictions, and for each post only five tags were considered. F1-Measure was used to evaluate the accuracy of recommendations, thus for each post Precision and Recall were computed by comparing the recommended tags with the true tags assigned by the users. The case of tags was ignored and all characters which are neither numbers nor letters were removed. Results are presented in Table 1