lieve that these findings demonstrate that users with long conversations and many messages per conversation tend to have smaller degrees, even given the findings displayed in Figure 17, where we saw that removing these users is more effective for breaking the connectivity of the network than random node deletion. Figure 18 also shows that using the average number of messages per conversation as a criterion removes edges in the slowest manner. We believe that this makes sense intuitively: if users invest similar amounts of time in interacting with others, then people with short conversations will tend to converse with more people in a given amount of time than users having long conversations.

8. CONCLUSION

We have reviewed a set of results stemming from the generation and analysis of an anonymized dataset representing the communication patterns of all people using a popular IM system. The methods and findings highlight the value of using a large IM network as a worldwide lens onto aggregate human behavior.

We described the creation of the dataset, capturing high-level communication activities and demographics in June 2006. The core dataset contains more than 30 billion conversations among 240 million people. We discussed the creation and analysis of a communication graph from the data containing 180 million nodes and 1.3 billion edges. The communication network is the largest social network analyzed to date. The planetary-scale network allowed us to explore dependencies among user demographics, communication characteristics, and network structure. Working with such a massive dataset allowed us to test hypotheses such as the average chain of separation among people across the entire world.

We discovered that the graph is well connected, highly transitive, and robust. We reviewed the influence of multiple factors on communication frequency and duration. We found strong influences of homophily in activities, where people with similar characteristics tend to communicate more, with the exception of gender, where we found that cross-gender conversations are both more frequent and of longer duration than conversations with users of the same reported gender. We also examined the path lengths and validated on a planetary scale earlier research that found “6 degrees of separation” among people.

We note that the sheer size of the data limits the kinds of analyses one can perform. In some cases, a smaller random sample may avoid the challenges of working with terabytes of data. However, it is known that sampling can corrupt the structural properties of networks, such as the degree distribution and the diameter of the graphs [15]. Thus, while sampling may be valuable for managing the complexity of analyses, results on network properties obtained from partial data sets may be rendered unreliable. Furthermore, we need to consider the full data set to reliably measure the patterns of age and distance homophily in communications.

In other directions of research with the dataset, we have pursued the use of machine learning and inference to learn predictive models that can forecast such properties as communication frequencies and durations of conversations among people as a function of the structural and demographic attributes of conversants. Our future directions for research include gaining an understanding of the dynamics of the structure of the communication network via a study of the evolution of the network over time.

We hope that our studies with Messenger data serve as an example of directions in social science research, highlighting how communication systems can provide insights about high-level patterns and relationships in human communications without making incursions into the privacy of individuals. We hope that this first effort to understand a social network on a genuinely planetary scale will embolden others to explore human behavior at large scales.

Acknowledgments

We thank Dan Liebling for help with generated world map plots, and Dimitris Achlioptas and Susan Dumais for helpful suggestions.

9. REFERENCES

[1] R. Albert, H. Jeong, and A.-L. Barabasi. Error and attack tolerance of complex networks. Nature, 406:378, 2000.
[2] J. I. Alvarez-Hamelin, L. Dall’Asta, A. Barrat, and A. Vespignani. Analysis and visualization of large scale networks using the k-core decomposition. In ECCS ’05: European Conference on Complex Systems, 2005.
[3] D. Avrahami and S. E. Hudson. Communication characteristics of instant messaging: effects and predictions of interpersonal relationships. In CSCW ’06, pages 505–514, 2006.
[4] A.-L. Barabasi. The origin of bursts and heavy tails in human dynamics. Nature, 435:207, 2005.
[5] V. Batagelj and M. Zaversnik. Generalized cores. ArXiv, (cs.DS/0202039), Feb 2002.
[6] IDC Market Analysis. Worldwide Enterprise Instant Messaging Applications 2005–2009 Forecast and 2004 Vendor Shares: Clearing the Decks for Substantial Growth. 2005.
[7] J. Leskovec and E. Horvitz. Worldwide Buzz: Planetary-Scale Views on an Instant-Messaging Network. Tech. report MSR-TR-2006-186, 2006.
[8] P. V. Marsden. Core discussion networks of Americans. American Sociological Review, 52(1):122–131, 1987.
[9] M. McPherson, L. Smith-Lovin, and J. M. Cook. Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27(1):415–444, 2001.
[10] B. A. Nardi, S. Whittaker, and E. Bradner. Interaction and outeraction: instant messaging in action. In CSCW ’00: Proceedings of the 2000 ACM conference on Computer supported cooperative work, pages 79–88, 2000.
[11] E. Ravasz and A.-L. Barabasi. Hierarchical organization in complex networks. Physical Review E, 67(2):026112, 2003.
[12] E. M. Rogers and D. K. Bhowmik. Homophily-heterophily: Relational concepts for communication research. Public Opinion Quarterly, 34:523–538, 1970.
[13] X. Shi, L. A. Adamic, and M. J. Strauss. Networks of strong ties. Physica A: Statistical Mechanics and its Applications, 378:33–47, May 2007.
[14] P. Singla and M. Richardson. Yes, there is a correlation - from social networks to personal behavior on the web. In WWW ’08, 2008.
[15] M. P. Stumpf, C. Wiuf, and R. M. May. Subnets of scale-free networks are not scale-free: sampling properties of networks. PNAS, 102(12), 2005.
[16] S. L. Tauro, C. Palmer, G. Siganos, and M. Faloutsos. A simple conceptual model for the internet topology. In GLOBECOM ’01, vol. 3, pages 1667–1671, 2001.
[17] J. Travers and S. Milgram. An experimental study of the small world problem. Sociometry, 32(4), 1969.
[18] A. Voida, W. C. Newstetter, and E. D. Mynatt. When conventions collide: the tensions of instant messaging attributed. In CHI ’02, pages 187–194, 2002.
[19] D. J. Watts and S. H. Strogatz. Collective dynamics of ’small-world’ networks. Nature, 393:440–442, 1998.
[20] Z. Xiao, L. Guo, and J. Tracey. Understanding instant messaging traffic characteristics. In ICDCS ’07, 2007.