[Figure: log-log plot; axes: number of publications, number of tags]

Fig. 2. Distribution of tags on publications in CiteULike dataset.

Table 3. Attributes of the resulting hierarchies.

                              deli[1]  deli[2]  deli[3]  cite[1]  cite[2]  cite[3]
#popular tags:                1085     1085     1085     67       505      505
#duplicated tags:             1        4        94       0        5        25
#first-level tags:            357      172      277      54       83       125
#child-less first-lev. tags:  245      84       92       47       38       51
average depth:                1.8194   2.2074   2.0628   1.194    2.1743   1.9539
maximum depth:                5        5        5        2        5        5

when users are adding a particular page already present in the system, while CiteULike does not provide such a feature yet. Therefore, we decreased the required popularity to 0.5% for the CiteULike dataset, which resulted in 505 organized tags.

The resulting taxonomies pointed out yet another interesting difference between the delicious and CiteULike folksonomies: the CiteULike folksonomy contains words considered English stop-words (and, on, the, for, etc.) and, moreover, these stop-words are popular. We did not expect at all that somebody would use stop-words to organize content. A possible explanation is that CiteULike users tend to post a short sentence as one tag (e.g., "example of a graph analysis"), but the CiteULike system considers each word of the sentence as a separate tag.

Another phenomenon of the CiteULike dataset is a popular no-tag tag, which is assigned automatically by the system if a user enters a publication without any tags. Evidently, many people use CiteULike without taking advantage of its built-in tagging system, which was a rather surprising finding.
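The stop-word explanation above can be illustrated with a minimal sketch. It assumes that the system splits the entered text on whitespace and that an empty entry receives the no-tag placeholder; the real CiteULike behaviour is not specified here. The function names, the sample posts and the stop-word list (the text names only and, on, the, for) are hypothetical.

```python
from collections import Counter

# Hypothetical stop-word list; the text names only "and", "on", "the", "for".
STOP_WORDS = {"and", "on", "the", "for", "of", "a"}

# Placeholder tag that, per the text, is assigned when no tags are entered.
NO_TAG = "no-tag"


def split_into_tags(user_input: str) -> list[str]:
    """Treat every whitespace-separated word as its own tag (assumed behaviour)."""
    words = user_input.lower().split()
    return words if words else [NO_TAG]


def tag_frequencies(posts: list[str]) -> Counter:
    """Count the number of posts in which each tag occurs."""
    counts = Counter()
    for post in posts:
        # set() so a word repeated within one post is counted once.
        counts.update(set(split_into_tags(post)))
    return counts


# Made-up sample posts: two sentence-like tags, one empty entry, one more phrase.
posts = [
    "example of a graph analysis",
    "analysis of the web",
    "",
    "clustering for the web",
]

counts = tag_frequencies(posts)
popular_stop_words = {tag: n for tag, n in counts.items() if tag in STOP_WORDS}
```

In this toy input the stop-words "of" and "the" occur as often as the content words "analysis" and "web", which is the effect described above: sentence-like tags, once split, push stop-words into the set of popular tags.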