add this paper as a favorite, users click on a link (“post a copy to your library”) that takes them to a different tagging page (Figure 2b). On this page, users can optionally tag the paper to add it to their personal collection. Users can create new tags (by typing them in a textbox), which might overlap with existing tags others have used before, or they can select existing tags (clicking on a tag automatically adds it to the textbox) but only ones from their personal collections. Note that users don’t have the option to select a tag from everyone’s tag collection; if they want to do this, they have to remember the tag that others used (from when they first viewed the paper’s link) and manually type it in, which we’ll discuss in more depth later.

General User Activity

The analysis we describe here is based on data collected between 15 November 2004 and 13 February 2007. Although it would be interesting and useful to run our analysis on the whole CiteULike data set, because we are part of the CiteSeer research group, the underlying data set we had access to comprised only tag applications for papers in CiteSeer that CiteULike indexes.

Our data set contained a total of 32,242 tag applications, 2,011 distinct users, 9,623 distinct papers, and 6,527 distinct tags. The two most prolific users had 3,883 and 634 tag applications, while 42 users had 100 or more tag applications. The two most tagged papers were both coauthored by Larry Page [7, 8] and were tagged 135 and 94 times, respectively. The five most frequently used tags were clustering (245), p2p (220), logic (185), learning (175), and network (175).
The average number of tag applications per paper was 3.35 (the total tag applications divided by the total number of papers). The median and modal number of tag applications per paper were 2 and 1, respectively. The average number of tag applications per user was 16.03 (the total tag applications divided by the total users). However, the median and modal number of tag applications per user were 4 and 1, respectively. These figures are close to the ones for the MovieLens [3] analysis, which reported an average of 18 tag applications per user with a median of 3.

In MovieLens, relatively few users generated most of the tag applications, approximating a power-law distribution. CiteULike’s data set is similar, with $y = 790.02\,x^{-1.3484}$, $R^2 = 0.9225$ (the data set included 1,921 users for a range of 1 to 55 tag applications). Figure 3 shows the relationship between the number of users and the number of tag applications.

We also computed the correlation between the number of papers each user tagged and the number of distinct tags each user generated. The correlation is high (0.944), and is thus starkly different from those of other social bookmarking services. For example, in Dogear [6], the correlation between the number of tags used and the number of bookmarks created was 0.56, although it was higher for users with bookmark collections smaller than 10 (0.74). For Flickr [5], the correlation between distinct tags and photos was 0.518, and for del.icio.us [4], no strong association existed between the number of bookmarks users had created and the number of tags they used in those bookmarks.

The high correlation for CiteULike suggests a strong linear relationship between the number of papers and the number of distinct tags for each user. This relationship could be due to the fact that as users tag more papers, the number of tags in their personal tag vocabulary increases.

Tag Growth

Social bookmarking services’ premise is that users collaboratively generate and reuse tags.
One way to index collaboration in social bookmarking services is to look at how users create new tags over time. We categorized the number of new tags per month, choosing months as the unit of temporal analysis (a finer-grained denomination, such as days or weeks, would have resulted in too many

18 www.computer.org/internet/ IEEE INTERNET COMPUTING Social Search

Figure 3. Number of users vs. number of tag applications. Relatively few users generated most of the tag applications. [Plot: x-axis, tag applications (1 to 55); y-axis, users (0 to 450).]
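
The power-law relationship reported for Figure 3, and the per-paper and per-user averages quoted under General User Activity, can be checked with a few lines of code. The sketch below is only illustrative: the article does not say how the fit was computed, so ordinary least squares on log-transformed values is an assumption, the function name `fit_power_law` is hypothetical, and the input is synthetic data generated from the reported curve rather than the CiteULike data itself.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by least squares on (ln x, ln y).

    Returns (a, b, r_squared), where r_squared is measured in log space.
    Assumes every x and y is strictly positive.
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    syy = sum((v - mean_y) ** 2 for v in log_y)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx                      # slope in log space = exponent
    a = math.exp(mean_y - b * mean_x)  # intercept in log space = ln(a)
    r_squared = (sxy * sxy) / (sxx * syy)
    return a, b, r_squared


# Synthetic illustration: points lying exactly on the reported curve,
# for 1 to 55 tag applications. Real counts would scatter around it.
xs = list(range(1, 56))
ys = [790.02 * x ** -1.3484 for x in xs]
a, b, r2 = fit_power_law(xs, ys)
print(f"a = {a:.2f}, b = {b:.4f}, R^2 = {r2:.4f}")

# Averages quoted in the text: total tag applications divided by
# the number of papers and by the number of users.
per_paper = 32242 / 9623
per_user = 32242 / 2011
print(f"per paper = {per_paper:.2f}, per user = {per_user:.2f}")
```

Because the synthetic points lie exactly on the curve, the fit recovers the reported coefficients almost perfectly; with the real user counts the fit would be looser, consistent with the reported $R^2$ of 0.9225.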