Note that traditionally much smaller data sets are used to assess consistency of human indexers, because such sets need to be created specifically for the experiment.
Collaborative tagging platforms like CiteULike can be mined for large collections of this kind in natural settings. Most documents in the extracted set relate to the area of bioinformatics. To give an example, a document entitled "Initial sequencing and comparative analysis of the mouse genome" was tagged by eight users with a total of 22 tags. Four of them agreed on the tag mouse, but one used the broader term rodents. Three agreed on the tag genome, but one added genome paper, and another used the more specific comparative genomics. There are also cases when tags are written together, e.g. genomepaper, or with a prefix, as in key genome, or in a different grammatical form: sequence vs. sequencing. This example shows that many inconsistencies in tags are not caused by personalized tag choices, as Chirita et al. (2007) suggest, but rather stem from the lack of guidelines and uniform tag suggestions that a bookmarking service could provide.

2.2 Measuring tagging consistency

Traditional indexers aim for consistency, on the basis that this will enhance document retrieval (Leonard, 1975). Consistency is measured using experiments in which several people index the same documents, usually a small set of a few dozen documents. It is computed for pairs of indexers, by formulae such as Rolling's (1981):

Consistency(I1, I2) = 2C / (A + B),

where C is the number of tags (index terms) that indexers I1 and I2 have in common, and A and B are the sizes of their respective tag sets.

In our experiments, before computing the number of terms in common, we stem each tag with the Porter (1980) stemmer. For example, the overlap C between the tag sets {complex systems, network, small world} and {theoretical, small world, networks, dynamics} consists of the two tags {network, small world}, and the consistency is 2×2/(3+4) = 0.57 (a short computational sketch of this measure follows Table 1 below).

To compute the overall consistency of a particular indexer, this figure is averaged over all documents and co-indexers. There were no cases where the same user reassigned tags to the same articles, so computing intra-tagger consistency, although interesting, was not possible.

To our knowledge, traditional indexing consistency metrics have not yet been applied to collaboratively tagged data. However, experiments on determining tagging quality do follow the same idea. For example, Xu et al. (2006) define an authority metric that assigns high scores to those users who match other users' choices on the same documents, in order to eliminate spammers.

2.3 Consistency of CiteULike taggers

In the collection of 180 documents tagged by 332 users described in Section 3.1, each tagger has 18 co-taggers on average, ranging from 2 to 129, and has indexed 1 to 25 documents. For each user we compute the consistency with all other users who tagged the same document. Consistency is then averaged across documents. We found that the distribution of per-user consistency resembles a power law, with a few users achieving high consistency values and a long tail of inconsistent taggers. The maximum consis-

tagger   co-taggers   documents   consistency (%)
1        1            5           71.4
2        1            5           71.4
3        6            5           57.9
4        6            6           51.0
5        11           12          50.4
6        2            5           50.1
7        4            6           48.3
8        8            8           47.1
9        13           16          45.4
10       12           8           44.4
11       7            6           43.5
12       7            6           41.7
13       8            5           40.9
14       7            6           39.7
15       9            13          38.8
16       4            5           38.4
17       12           9           37.3
18       4            14          36.1
19       9            8           35.9
20       10           11          33.7
21       7            6           33.1
22       6            5           33.0
23       7            10          32.1
24       11           16          31.7
25       8            13          30.6
26       6            8           30.6
27       9            6           29.8
28       10           12          29.0
29       8            6           28.8
30       9            10          27.9
31       10           8           26.7
32       8            7           26.3
33       10           5           25.6
34       8            7           21.0
35       9            9           18.3
36       3            6           7.9
average  7.5          8.1         37.7

Table 1. Consistency of the most prolific and most consistent taggers.
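To make the consistency measure of Section 2.2 and the per-user averaging behind Table 1 concrete, the following is a minimal sketch rather than the authors' code. It assumes NLTK's PorterStemmer as a stand-in for the Porter (1980) stemmer, a hypothetical assignments layout (document id -> {user id -> tag set}) for the CiteULike data, and a pooled mean over all (document, co-tagger) pairs as one reasonable reading of the averaging described above.

# A minimal sketch (not the authors' code) of the consistency computations in
# Sections 2.2 and 2.3. Assumptions: NLTK's PorterStemmer stands in for the
# Porter (1980) stemmer, and the CiteULike data is given as a hypothetical
# `assignments` mapping: document id -> {user id -> set of tags}.
from collections import defaultdict
from itertools import combinations
from nltk.stem import PorterStemmer

_stemmer = PorterStemmer()

def normalize(tag):
    # Stem every word of a (possibly multi-word) tag, e.g. "networks" -> "network".
    return " ".join(_stemmer.stem(w) for w in tag.lower().split())

def consistency(tags_a, tags_b):
    # Rolling's (1981) inter-indexer consistency 2C / (A + B); the overlap C and
    # the set sizes A and B are counted here on the stemmed tag sets.
    a = {normalize(t) for t in tags_a}
    b = {normalize(t) for t in tags_b}
    c = len(a & b)
    return 2 * c / (len(a) + len(b)) if (a or b) else 0.0

def per_user_consistency(assignments):
    # Average each user's pairwise consistency over all co-taggers on all
    # documents (a pooled mean, one reasonable reading of the averaging above).
    scores = defaultdict(list)
    for taggers in assignments.values():        # taggers: {user id -> tag set}
        for u, v in combinations(taggers, 2):   # every pair of co-taggers
            s = consistency(taggers[u], taggers[v])
            scores[u].append(s)
            scores[v].append(s)
    return {user: sum(vals) / len(vals) for user, vals in scores.items()}

# The worked example from Section 2.2: two shared stems out of 3 + 4 tags.
t1 = {"complex systems", "network", "small world"}
t2 = {"theoretical", "small world", "networks", "dynamics"}
print(round(consistency(t1, t2), 2))            # -> 0.57

Running the last lines reproduces the 2×2/(3+4) = 0.57 value from the worked example; per_user_consistency would yield per-user figures of the kind summarized in Table 1, up to the exact averaging scheme the authors used.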