chose the latent topic of emails. Then, the correlations between an email and the associated users (i.e. the author and the recipients) were taken into account. In particular, they proposed two models to extract communities: one centered on each user’s contacts and another centered on topics. When they compared the resultant communities with the group formation from another study as a ground truth, they found that their approach succeeded in generating appropriate groups with high similarity in shared messages (Zhou, Manavoglu et al. 2006). The weakness of this study is that the groups were inferred by machine learning techniques based on content similarities.

Existing studies of group dynamics have largely been concerned either with the interactions between or among group members, or with derived groups inferred by various machine learning technologies (O'Hara, Alani et al. 2002; Backstrom, Huttenlocher et al. 2006). In what follows, we focus on users’ self-defined group activities and explore not only the information sharing dynamics among group members, but also the interactions between a group and its members.

3. THE DATA SET

3.1 The Data Source and the Relationship

As the source of data for our study we selected a collaborative tagging system, Citeulike. Along with Bibsonomy (Hotho, Jäschke et al. 2006) and Connotea (Lund, Hammond et al. 2005), Citeulike is one of the leading systems for managing and sharing bibliographic references. Like many other collaborative tagging systems, Citeulike supports group activity. Users can create a group, join existing groups, or be invited to join a group. When group members find interesting references, they can use the Citeulike interface to add them, with tags, not only to their personal repositories but also to the group space at the same time. The updated list of references is shown to all other group members.
The group members are also able to copy references from the group collection to their personal repositories.

3.2 Data Collection

We collected the group data from Citeulike. As the first step, we visited the site in October and November of 2008. At the time of our visit, there was a page showing the list of groups. We chose all groups that were displayed on the page at that time and collected the groups’ collections, the group members and the members’ personal collections. The information for each group’s collection included the bibliography (article title, list of authors, journal/conference names, publication years, etc.), the tags, and the posted date and time. We collected the same kind of information from each individual group member’s collection. Out of more than 700 groups, we filtered out single-member groups, groups having insufficient references (n < 5), and members who did not have any references in their personal collections (n = 0). After filtering, the total number of groups was 619, and these groups have 337,987 distinct items (i.e. research papers). The data set contains 2643 users, who account for 3528 group memberships in total. On average, each user is a member of 1.34 groups and each group has 5.7 members. Table 1 and Figures 1 and 2 summarize the data set and show the distribution of items in the groups’ and group members’ collections. Both figures suggest that the users and groups in our data set have enough items to compare their information sharing patterns. Figure 3 shows the number of groups that each user participates in and indicates that most users are members of a single group.

Table 1. Data Summary of Citeulike

Total No. of Groups                    619
Total No. of Users                     2643
Total No. of Group Memberships         3528
Average No. of Groups per User         1.34
Average No. of Members per Group       5.70
Total No. of Unique Items              337987
Average No. of Items per Group         445.89
Average No. of Items per User          188.24
Average No. of Tags per Group          1039.31
Average No. of Tags per User           464.40

Figure 1. Distribution of Group Members’ Information Collection (axes: No. of Items, No. of Users)

Figure 2. Distribution of Groups’ Information Collection (axes: No. of Items, No. of Groups)

Figure 3. Distribution of Group Memberships per User (axes: No. of Groups, No. of Members)
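The filtering described in Section 3.2 and the per-user and per-group averages reported in Table 1 can be illustrated with a minimal sketch on toy data. This is not the authors' code: the function names `filter_groups` and `summarize`, the dictionary layout, and the toy groups are all hypothetical. The order in which the three filters are applied is also an assumption, since the text does not state it; here members with empty personal collections are removed first, and a group that is then left with fewer than two members is treated as a single-member group.

```python
MIN_GROUP_ITEMS = 5  # groups with fewer references are dropped (n < 5)


def filter_groups(memberships, group_items, user_items):
    """Apply the three filters from Section 3.2 (filter order is an assumption).

    memberships: dict mapping group id -> set of user ids
    group_items: dict mapping group id -> set of item ids in the group collection
    user_items:  dict mapping user id -> set of item ids in the personal collection
    """
    kept = {}
    for group, members in memberships.items():
        # Drop members whose personal collection is empty (n = 0).
        active = {u for u in members if len(user_items.get(u, ())) > 0}
        # Drop single-member (or empty) groups.
        if len(active) < 2:
            continue
        # Drop groups with too few references in the group collection.
        if len(group_items.get(group, ())) < MIN_GROUP_ITEMS:
            continue
        kept[group] = active
    return kept


def summarize(kept):
    """Compute the counts and averages of the kind reported in Table 1."""
    users = set().union(*kept.values()) if kept else set()
    n_memberships = sum(len(members) for members in kept.values())
    return {
        "groups": len(kept),
        "users": len(users),
        "memberships": n_memberships,
        "groups_per_user": n_memberships / len(users) if users else 0.0,
        "members_per_group": n_memberships / len(kept) if kept else 0.0,
    }


# Hypothetical toy data, not taken from the Citeulike data set.
memberships = {
    "g1": {"a", "b", "c"},
    "g2": {"a"},             # single-member group
    "g3": {"b", "d"},        # too few group references
    "g4": {"a", "c", "e"},   # user "e" has an empty personal collection
}
group_items = {
    "g1": set(range(10)),
    "g2": set(range(10)),
    "g3": set(range(3)),
    "g4": set(range(6)),
}
user_items = {"a": {1}, "b": {2}, "c": {3}, "d": {4}, "e": set()}

kept = filter_groups(memberships, group_items, user_items)
stats = summarize(kept)
print(sorted(kept))
print(stats)
```

In this sketch both averages are ratios of the membership count to the number of users or groups; for the reported totals, 3528 memberships over 619 groups gives about 5.70 members per group.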