[Figure 2 panels (log-log axes): (a) count versus number of users per session, with a trend $\propto x^{-3.5}$ and a marker at 20 users; (b) count versus conversation duration, with a trend $\propto x^{-3.67}$.]

Figure 2: (a) Distribution of the number of people participating in a conversation. (b) Distribution of the durations of conversations. The spread of durations can be described by a power-law distribution.

[Figure 3 panels (log-log axes): (a) count versus login duration, fit $9.7 \times 10^{5}\, x^{-1.77}$, $R^2 = 1.00$; (b) count versus logout duration, fit $6.9 \times 10^{5}\, x^{-1.34}$, $R^2 = 1.00$.]

Figure 3: (a) Distribution of login duration. (b) Duration of times when people are not logged into the system (times between logout and login).

We consider event distributions on a per-user basis in Figure 1. The number of logins per user, displayed in Figure 1(a), follows a heavy-tailed distribution with exponent 3.6. We note spikes in logins at 20 minute and 15 second intervals, which correspond to an auto-login function of the IM client. As shown in Figure 1(b), many users fill up their contact lists rather quickly. The spike at 600 buddies undoubtedly reflects the maximal allowed length of contact lists.

Figure 2(a) displays the number of users per session. In Messenger, multiple people can participate in conversations. We observe a peak at 20 users, the limit on the number of people who can participate simultaneously in a conversation.
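The fits quoted in the text and in the figure legends have the form count $= a\,x^{-\alpha}$ and are reported with an $R^2$ value. The paper does not state how the fits were computed; the sketch below assumes an ordinary least-squares line in log-log space, which is one common way to obtain such a fit. The function name `fit_power_law` and the synthetic data are illustrative only and do not come from the paper.

```python
import math


def fit_power_law(xs, counts):
    """Least-squares fit of count = a * x**(-alpha) in log-log space.

    Returns (a, alpha, r_squared). Points with a non-positive x or count
    are skipped because their logarithm is undefined.
    """
    pts = [(math.log10(x), math.log10(c))
           for x, c in zip(xs, counts) if x > 0 and c > 0]
    if len(pts) < 2:
        raise ValueError("need at least two positive points")

    n = len(pts)
    mean_x = sum(lx for lx, _ in pts) / n
    mean_y = sum(ly for _, ly in pts) / n
    sxx = sum((lx - mean_x) ** 2 for lx, _ in pts)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in pts)
    syy = sum((ly - mean_y) ** 2 for _, ly in pts)
    if sxx == 0:
        raise ValueError("x values must not all be equal")

    slope = sxy / sxx                      # slope of the log-log line
    intercept = mean_y - slope * mean_x    # log10 of the prefactor a
    r_squared = 1.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return 10 ** intercept, -slope, r_squared


# Synthetic counts lying exactly on count = 9.7e5 * x**-1.77
# (values chosen to mirror the Figure 3(a) legend; not real data).
xs = [1, 2, 5, 10, 20, 50, 100]
counts = [9.7e5 * x ** -1.77 for x in xs]
a, alpha, r2 = fit_power_law(xs, counts)
print(f"a={a:.3g}, alpha={alpha:.2f}, R^2={r2:.2f}")
```

A regression on binned counts is sensitive to the choice of bins and to noise in the tail, so maximum-likelihood estimators are often preferred when the raw samples are available.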
Figure 2(b) shows the distribution over the session durations, which can be modeled by a power-law distribution with exponent 3.6.

Next, we examine the distribution of the durations of periods of time when people are logged on to the system. Let $(ti_j, to_j)$ denote a time-ordered ($ti_j < to_j < ti_{j+1}$) sequence of online and offline times of a user, where $ti_j$ is the time of the $j$th login, and $to_j$ is the corresponding logout time. Figure 3(a) plots the distribution of $to_j - ti_j$ over all $j$ and over all users. Similarly, Figure 3(b) shows the distribution of the periods of time when users are logged off, i.e., $ti_{j+1} - to_j$ over all $j$ and over all users. Fitting the data to power-law distributions reveals exponents of 1.77 and 1.3, respectively. The data show that durations of being online tend to be shorter and decay faster than durations that users are offline. We also notice periodic effects of login durations of 12, 24, and 48 hours, reflecting daily periodicities. We observe similar periodicities for logout durations at multiples of 24 hours.

3.2 Demographic characteristics of the users

We compared the demographic characteristics of the Messenger population with 2005 world census data and found differences between the statistics for age and gender. The visualization of this comparison displayed in Figure 4 shows that users with reported ages in the 15–35 span of years are strongly overrepresented in the active Messenger population. Focusing on the differences by gender, females are overrepresented for the 10–14 age interval. For male users, we see overall matches with the world population for age spans 10–14 and 35–39; for women users, we see a match for ages in the span of 30–34. We note that 6.5% of the population did not submit an age when creating their Messenger accounts.

[Figure 4 panel: age pyramid in five-year bands from 0–4 to 100+, female on the left and male on the right; horizontal axis is the proportion of the population (0 to 0.1 on each side); series are the world population and the MSN population.]

Figure 4: World and Messenger user population age pyramid. Ages 15–30 are overrepresented in the Messenger population.

[Figure 5 panels (log-log axes): (a) count versus conversation duration [min], fit $1.5 \times 10^{11}\, x^{-3.70}$, $R^2 = 0.99$; (b) count versus time between conversations [min], fit $3.9 \times 10^{9}\, x^{-1.53}$, $R^2 = 0.99$, with markers at 1 day, 2 days, and 3 days.]

Figure 5: Temporal characteristics of conversations. (a) Average conversation duration per user; (b) time between conversations of users.

4. COMMUNICATION CHARACTERISTICS

We now focus on characteristics and patterns of communication. We limit the analysis to conversations between two participants, which account for 99% of all conversations.

We first examine the distributions over conversation durations and times between conversations. Let user $u$ have $C$ conversations in the observation period. Then, for every conversation $i$ of user $u$ we create a tuple $(ts_{u,i}, te_{u,i}, m_{u,i})$, where $ts_{u,i}$ denotes the start time of the conversation, $te_{u,i}$ is the end time of the conversation, and $m_{u,i}$ is the number of messages exchanged between the two users. We order the conversations by their start time ($ts_{u,i} < ts_{u,i+1}$). Then, for every user $u$, we calculate the average conversation duration $\bar{d}(u) = \frac{1}{C} \sum_i (te_{u,i} - ts_{u,i})$, where the sum goes over all of $u$'s conversations. Figure 5(a) shows the distribution of $\bar{d}(u)$ over all the users $u$. We find that the conversation length can be described by a heavy-tailed distribution with exponent $-3.7$ and a mode of 4 minutes.

Figure 5(b) shows the intervals between consecutive conversations of a user. We plot the distribution of $ts_{u,i+1} - ts_{u,i}$, where $ts_{u,i+1}$ and $ts_{u,i}$ denote start times of two consecutive conversations of user $u$. The power-law exponent of the distribution over intervals is $-1.5$. This result is similar to the temporal distribution for other kinds of human communication activities, e.g., waiting times of emails and letters before a reply is generated [4].
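The two per-user quantities defined above, the average conversation duration $\bar{d}(u)$ and the gaps $ts_{u,i+1} - ts_{u,i}$ between consecutive conversation starts, can be sketched as follows. This is only an illustration: it assumes that conversation records are available as `(user, start, end, n_messages)` tuples with times in minutes. The record layout, the function name `conversation_stats`, and the sample values are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def conversation_stats(conversations):
    """Compute per-user mean conversation duration and start-to-start gaps.

    conversations: iterable of (user, start, end, n_messages) tuples.
    Returns (mean_duration, gaps), two dicts keyed by user.
    """
    by_user = defaultdict(list)
    for user, start, end, _n_messages in conversations:
        by_user[user].append((start, end))

    mean_duration = {}
    gaps = {}
    for user, convs in by_user.items():
        convs.sort()  # order conversations by start time
        # d(u) = (1/C) * sum_i (te_i - ts_i)
        mean_duration[user] = sum(end - start for start, end in convs) / len(convs)
        # ts_{i+1} - ts_i for consecutive conversations
        gaps[user] = [nxt[0] - cur[0] for cur, nxt in zip(convs, convs[1:])]
    return mean_duration, gaps


# Made-up records for two users, deliberately out of order.
records = [
    ("u", 0, 4, 10),
    ("u", 30, 36, 3),
    ("u", 10, 12, 5),
    ("v", 5, 8, 2),
]
mean_duration, gaps = conversation_stats(records)
print(mean_duration)
print(gaps)
```

Histogramming `mean_duration.values()` and the concatenated gap lists over all users would give the kind of distributions plotted in Figure 5(a) and 5(b).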
The exponent can be explained by a priority-queue model where tasks of different