[Figure 1: plot with axes NUMBER OF PAGES and NUMBER OF TAGS]
Fig. 1. Distribution of tags on pages in delicious dataset.

Table 2. Overview of algorithm setup.
                                      deli[1]  deli[2]  deli[3]   cite[1]  cite[2]  cite[3]
popularity threshold                  5%       5%       5%        1.5%     0.5%     0.5%
overlap threshold                     10%      5%       5%        10%      5%       5%
average overlap (context) threshold   6.6%     3.3%     floating  6.6%     3.3%     floating

at www.fiit.stuba.sk/~barla/iccci09). Our taxonomies conform roughly to the criteria for estimating the quality of a taxonomy defined in [9], where a high-quality taxonomy should have an average depth of 3 with a maximum depth of 5.

On the delicious[1] and delicious[2] datasets, we can observe the impact of the overlap threshold. When it is set to 10% (delicious[1]), the tagsonomy tends to be flatter, with highly consistent subtrees. The 5% setup of delicious[2] organized more tags into subtrees (i.e., fewer tags on the first level), but showed that the contextual threshold of 3.3% was set too weakly (only four tags were duplicated in order to keep up with context). The best results were achieved when the contextual threshold was set as floating, according to the current parent-child overlap. For instance, the tag currency in Fig. 3, which was wrongly assigned to the branch audio/conversion, was moved into a separate subtree in order to keep the average overlap of tags in the audio branch at a higher level.

We were surprised by the results coming from the CiteULike folksonomy. The first setup, with a 10% overlap threshold, produced a very small hierarchy with only 67 popular tags. It seems that in CiteULike, the crowd did not reach agreement on the most appropriate tags for a particular publication (i.e., everybody uses his or her own specific tags). One reason for such a difference could be that delicious proactively supports the emergence of such an agreement by recommending tags
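The depth criteria quoted from [9] above (average depth of 3, maximum depth of 5) can be checked mechanically on a derived hierarchy. The following is a minimal sketch, not the authors' evaluation code. It assumes the hierarchy is stored as nested dictionaries, that top-level tags sit at depth 1, and that the average is taken over leaf tags only. The function names, the tolerance used for "roughly", and the toy tags other than audio, conversion and currency are all hypothetical.

```python
from statistics import mean


def leaf_depths(tree, depth=1):
    """Yield the depth of every leaf tag (assumption: top-level tags are depth 1)."""
    for children in tree.values():
        if children:
            yield from leaf_depths(children, depth + 1)
        else:
            yield depth


def depth_profile(tree):
    """Return (average leaf depth, maximum depth) of a non-empty hierarchy."""
    depths = list(leaf_depths(tree))
    return mean(depths), max(depths)


def roughly_conforms(tree, target_avg=3.0, max_allowed=5, tolerance=1.0):
    """Check the depth criteria; `tolerance` is an assumed reading of 'roughly'."""
    avg, deepest = depth_profile(tree)
    return abs(avg - target_avg) <= tolerance and deepest <= max_allowed


# Hypothetical toy hierarchy; only audio, conversion and currency appear in the text.
toy = {
    "audio": {"conversion": {"mp3": {}, "ogg": {}}, "podcast": {}},
    "currency": {},
}

avg, deepest = depth_profile(toy)
print(avg, deepest, roughly_conforms(toy))
```

Averaging over leaves rather than over all tags is a design choice made here for simplicity. If [9] defines the average differently, only `leaf_depths` would need to change.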