The Small-World Phenomenon: An Algorithmic Perspective*

Jon Kleinberg†

Abstract

Long a matter of folklore, the "small-world phenomenon" -- the principle that we are all linked by short chains of acquaintances -- was inaugurated as an area of experimental study in the social sciences through the pioneering work of Stanley Milgram in the 1960's. This work was among the first to make the phenomenon quantitative, allowing people to speak of the "six degrees of separation" between any two people in the United States. Since then, a number of network models have been proposed as frameworks in which to study the problem analytically. One of the most refined of these models was formulated in recent work of Watts and Strogatz; their framework provided compelling evidence that the small-world phenomenon is pervasive in a range of networks arising in nature and technology, and a fundamental ingredient in the evolution of the World Wide Web.

But existing models are insufficient to explain the striking algorithmic component of Milgram's original findings: that individuals using local information are collectively very effective at actually constructing short paths between two points in a social network. Although recently proposed network models are rich in short paths, we prove that no decentralized algorithm, operating with local information only, can construct short paths in these networks with non-negligible probability. We then define an infinite family of network models that naturally generalizes the Watts-Strogatz model, and show that for one of these models, there is a decentralized algorithm capable of finding short paths with high probability. More generally, we provide a strong characterization of this family of network models, showing that there is in fact a unique model within the family for which decentralized algorithms are effective.

* A version of this work appears as Cornell Computer Science Technical Report 99-1776 (October 1999).

† Department of Computer Science, Cornell University, Ithaca NY 14853. Email: kleinber@cs.cornell.edu. Supported in part by a David and Lucile Packard Foundation Fellowship, an Alfred P. Sloan Research Fellowship, an ONR Young Investigator Award, and NSF Faculty Early Career Development Award CCR-9701399.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. STOC 2000 Portland Oregon USA. Copyright ACM 2000 1-58113-184-4/00/5...$5.00

1 Introduction

The Small-World Phenomenon. A social network exhibits the small-world phenomenon if, roughly speaking, any two individuals in the network are likely to be connected through a short sequence of intermediate acquaintances. This has long been the subject of anecdotal observation and folklore; often we meet a stranger and discover that we have an acquaintance in common. It has since grown into a significant area of study in the social sciences, in large part through a series of striking experiments conducted by Stanley Milgram and his co-workers in the 1960's [13, 18, 12]. Recent work has suggested that the phenomenon is pervasive in networks arising in nature and technology, and a fundamental ingredient in the structural evolution of the World Wide Web [17, 19, 2].

Milgram's basic small-world experiment remains one of the most compelling ways to think about the problem. The goal of the experiment was to find short chains of acquaintances linking pairs of people in the United States who did not know one another. In a typical instance of the experiment, a source person in Nebraska would be given a letter to deliver to a target person in Massachusetts.
The source would initially be told basic information about the target, including his address and occupation; the source would then be instructed to send the letter to someone she knew on a first-name basis in an effort to transmit the letter to the target as efficaciously as possible. Anyone subsequently receiving the letter would be given the same instructions, and the chain of communication would continue until the target was reached. Over many trials, the average number of intermediate steps in a successful chain was found to lie between five and six, a quantity that has since entered popular culture as the "six degrees of separation" principle [7].

Modeling the Phenomenon. Naturally, the empirical validation of the phenomenon has led to a rash of analytical work aimed at answering the following general question:

(*) Why should there exist short chains of acquaintances linking together arbitrary pairs of strangers?

Most of the early work on this issue, beginning with analysis of Pool and Kochen that pre-dated Milgram's experiments [16], was based on versions of the following explanation: random networks have low diameter. (See for example the book of surveys edited by Kochen [11].) That is, if every individual in the United States were to have a small number of acquaintances selected uniformly at random from the population -- and if acquaintanceship were symmetric -- then two random individuals would be linked by a short chain with high probability. Even this early work recognized the limitations of a uniform random model; if A and B are two individuals with a common friend, it is much more likely that they themselves are friends. But at the same time, a network of acquaintanceships that is too "clustered" will not have the low diameter that Milgram's experiments indicated.

Recently, Watts and Strogatz proposed a model for the small-world phenomenon based on a class of random networks that interpolates between these two extremes, in which the edges of the network are divided into "local" and "long-range" contacts [19]. The paradigmatic example they studied was a "re-wired ring lattice," constructed roughly as follows. One starts with a set V of n points spaced uniformly on a circle, and joins each point by an edge to each of its k nearest neighbors, for a small constant k. These are the "local contacts" in the network. One then introduces a small number of edges in which the endpoints are chosen uniformly at random from V -- the "long-range contacts".
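
The ring construction just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the construction of [19] itself: the function name `ring_with_shortcuts` and its parameters are hypothetical, k is assumed to be even (k/2 neighbors on each side), and the long-range edges are simply added on top of the ring rather than produced by re-wiring existing edges.

```python
import random

def ring_with_shortcuts(n, k, num_shortcuts, seed=None):
    """Return an undirected graph as {node: set(neighbors)}.

    Nodes 0..n-1 sit on a circle. Each node is joined to its k nearest
    neighbors (k/2 on each side; k is assumed even), and then
    num_shortcuts extra edges with uniformly random endpoints are added.
    """
    if k % 2 != 0 or not 0 < k < n:
        raise ValueError("k must be even with 0 < k < n")
    local_edges = n * k // 2
    if num_shortcuts > n * (n - 1) // 2 - local_edges:
        raise ValueError("not enough free node pairs for the shortcuts")
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # Local contacts: the k nearest neighbors on the circle.
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    # Long-range contacts: both endpoints chosen uniformly at random from V.
    added = 0
    while added < num_shortcuts:
        u, w = rng.randrange(n), rng.randrange(n)
        if u != w and w not in adj[u]:
            adj[u].add(w)
            adj[w].add(u)
            added += 1
    return adj
```

For example, `ring_with_shortcuts(20, 4, 5, seed=1)` yields a 20-node ring in which every node keeps its four local contacts and five random shortcuts are superimposed.
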
Watts and Strogatz argued that such a model captures two crucial parameters of social networks: there is a simple underlying structure that explains the presence of most edges, but a few edges are produced by a random process that does not respect this structure. Their networks thus have low diameter (like uniform random networks), but also have the property that many of the neighbors of a node u are themselves neighbors (unlike uniform random networks). They showed that a number of naturally arising networks exhibit this pair of properties (including the connections among neurons in the nematode species C. elegans, and the power grid of the Western U.S.); and their approach has been applied to the analysis of the hyperlink graph of the World Wide Web as well [1].

Networks that are formed from a superposition of a "structured subgraph" and a "random subgraph" have been investigated in the area of probabilistic combinatorics. In a fundamental instance of such an approach, Bollobás and Chung [5] gave bounds on the diameter of the random graph obtained by adding a random matching to the nodes of a cycle. (See also [6].)

The Present Work. Let us return to Milgram's experiment. We claim that it really contains two fundamentally surprising discoveries: first, that such short chains should exist in the network of acquaintanceships; and second, that people should be able to find these chains knowing so little about the target individual. From an analytical point of view, the first of these discoveries is existential in nature, the second algorithmic -- it reveals that individuals who know only the locations of their direct acquaintances can still, collectively, construct a short path between two points in the network. We therefore propose to study the following natural companion to Question (*) above:

(**) Why should arbitrary pairs of strangers be able to find short chains of acquaintances that link them together?

It is important to note that Question (**) raises issues that lie truly beyond the scope of Question (*): one can imagine networks in which short chains exist, but no mechanism based on purely local information is able to find them. The success of Milgram's experiment suggests a source of latent navigational "cues" embedded in the underlying social network, by which a message could implicitly be guided quickly from source to target. It is natural to ask what properties a social network must possess in order for it to exhibit such cues, and enable its members to find short chains through it.

In this work, we study "decentralized" algorithms by which individuals, knowing only the locations of their direct acquaintances, attempt to transmit a message from a source to a target along a short path. Our central findings are the following.

• First, we show that existing models are insufficient to explain the success of such decentralized algorithms in finding short paths through a social network. In a class of networks generated according to the model of Watts and Strogatz, we prove that there is no decentralized algorithm capable of constructing paths of small expected length (relative to the diameter of the underlying network).

• We then define an infinite family of random network models that naturally generalizes the Watts-Strogatz model. We show that for one of these models, there is a decentralized algorithm capable of finding short paths with high probability.

• Finally, we prove the stronger statement that there is in fact a unique model within the family for which decentralized algorithms are effective.

The Model: Networks and Decentralized Algorithms. We now give precise definitions for our network model and our notion of a decentralized algorithm; we then provide formal statements of the main results.

In designing our network model, we seek a simple framework that encapsulates the paradigm of Watts and Strogatz -- rich in local connections, with a few long-range connections. Rather than using a ring as the basic structure, however, we begin from a two-dimensional grid and allow for edges to be directed. Thus, we begin with a set of nodes (representing individuals in the social network) that are identified with the set of lattice points in an n x n square, $\{(i,j) : i \in \{1,2,\ldots,n\},\ j \in \{1,2,\ldots,n\}\}$, and we define the lattice distance between two nodes $(i,j)$ and $(k,\ell)$ to be the number of "lattice steps" separating them: $d((i,j),(k,\ell)) = |k-i| + |\ell-j|$. For a universal constant $p \ge 1$, the node u has a directed edge to every other node within lattice distance p -- these are its local contacts. For universal constants $q \ge 0$ and $r \ge 0$, we also construct directed edges from u to q other nodes (the long-range contacts) using independent random trials; the $i$th directed edge from u has endpoint v with probability proportional to $[d(u,v)]^{-r}$. (To obtain a probability distribution, we divide this quantity by the appropriate normalizing constant $\sum_v [d(u,v)]^{-r}$; we will call this the inverse rth-power distribution.)

This model has a simple "geographic" interpretation: individuals live on a grid and know their neighbors for some number of steps in all directions; they also have some number of acquaintances distributed more broadly across the grid. Viewing p and q as fixed constants, we obtain a one-parameter family of network models by tuning the value of the exponent r.
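
The lattice distance and the inverse rth-power distribution can be made concrete with a short Python sketch. The function names below are hypothetical and chosen only for illustration; the code enumerates all n^2 - 1 candidate endpoints, which is simple but only practical for small grids.

```python
import random

def lattice_distance(u, v):
    """d((i, j), (k, l)) = |k - i| + |l - j|."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])

def long_range_distribution(u, n, r):
    """Return (nodes, probs) with Pr[v] proportional to d(u, v)^(-r), v != u."""
    nodes = [(i, j) for i in range(1, n + 1) for j in range(1, n + 1)
             if (i, j) != u]
    weights = [lattice_distance(u, v) ** (-r) for v in nodes]
    total = sum(weights)  # the normalizing constant: sum over v of d(u, v)^(-r)
    return nodes, [w / total for w in weights]

def sample_long_range_contacts(u, n, r, q, rng=random):
    """Draw the q long-range contacts of u by independent random trials."""
    nodes, probs = long_range_distribution(u, n, r)
    return rng.choices(nodes, weights=probs, k=q)
```

With r = 0 every other node is equally likely; with r = 2 a node at lattice distance 1 is four times as likely to be chosen as a node at lattice distance 2.
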
When r = 0, we have the uniform distribution over long-range contacts, the distribution used in the basic network model of Watts and Strogatz -- one's long-range contacts are chosen independently of their position on the grid. As r increases, the long-range contacts of a node become more and more clustered in its vicinity on the grid. Thus, r serves as a basic structural parameter measuring how widely "networked" the underlying society of nodes is.

The algorithmic component of the model is based on Milgram's experiment. We start with two arbitrary nodes in the network, denoted s and t; the goal is to transmit a message from s to t in as few steps as possible. We study decentralized algorithms, mechanisms whereby the message is passed sequentially from a current message holder to one of its (local or long-range) contacts, using only local information. In particular, the message holder u in a given step has knowledge of

(i) the set of local contacts among all nodes (i.e. the underlying grid structure);

(ii) the location, on the lattice, of the target t; and

(iii) the locations and long-range contacts of all nodes that have come in contact with the message.

Crucially, u does not have knowledge of the long-range contacts of nodes that have not touched the message. Given this, u must choose one of its contacts v, and forward the message to this contact. The expected delivery time of a decentralized algorithm -- a primary figure of merit in our analysis -- is the expected number of steps taken by the algorithm to deliver the message over a network generated according to an inverse rth-power distribution, from a source to a target chosen uniformly at random from the set of nodes. Of course, constraining the algorithm to use only local information is crucial to our model; if one had full global knowledge of the local and long-range contacts of all nodes in the network, the shortest chain between two nodes could be computed simply by breadth-first search.
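
For contrast with the decentralized setting, the following sketch shows the global-knowledge baseline mentioned above: a standard breadth-first search over a directed graph whose full contact lists are known. The adjacency-dictionary representation and the function name `bfs_distance` are assumptions made for this illustration.

```python
from collections import deque

def bfs_distance(adj, s, t):
    """Length of a shortest directed path from s to t, or None if unreachable.

    adj maps each node to an iterable of its contacts (local and long-range),
    i.e. it encodes exactly the global knowledge that a decentralized
    algorithm is denied.
    """
    if s == t:
        return 0
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                if v == t:
                    return dist[v]
                queue.append(v)
    return None
```

A decentralized algorithm cannot run this search, because it would need the long-range contacts of nodes that have never held the message.
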

The reader may worry that assumption (iii) above gives a decentralized algorithm too much power. However, it only strengthens our results: our lower bounds will hold even for algorithms that are given this knowledge, while our upper bounds make use of decentralized algorithms that only require assumptions (i) and (ii).

Statement of Results. Our results explore the way in which the structure of the network affects the ability of a decentralized algorithm to construct a short path. When r = 0 -- the uniform distribution over long-range contacts -- standard results from random graph theory can be used to show that with high probability there exist paths between every pair of nodes whose lengths are bounded by a polynomial in log n, exponentially smaller than the total number of nodes. However, there is no way for a decentralized algorithm to find these chains:

Theorem 1 There is a constant $\alpha_0$, depending on p and q but independent of n, so that when r = 0, the expected delivery time of any decentralized algorithm is at least $\alpha_0 n^{2/3}$. (Hence exponential in the expected minimum path length.)

As the parameter r increases, a decentralized algorithm can take more advantage of the "geographic structure" implicit in the long-range contacts; at the same time, long-range contacts become less useful in moving the message a large distance. There is a value of r where this trade-off can be best exploited algorithmically; this is r = 2, the inverse-square distribution.

Theorem 2 There is a decentralized algorithm A and a constant $\alpha_2$, independent of n, so that when r = 2 and p = q = 1, the expected delivery time of A is at most $\alpha_2 (\log n)^2$.

This pair of theorems reflects a fundamental consequence of our model. When long-range contacts are
Figure 1: (A) A two-dimensional grid network with n = 6, p = 1, and q = 0. (B) The contacts of a node u with p = 1 and q = 2; v and w are the two long-range contacts.

formed independently of the geometry of the grid, short chains will exist but the nodes, operating at a local level, will not be able to find them. When long-range contacts are formed by a process that is related to the geometry of the grid in a specific way, however, then short chains will still form and nodes operating with local knowledge will be able to construct them.

We now comment on the ideas underlying the proofs of these results; the full details are given in the subsequent sections. The decentralized algorithm A that achieves the bound of Theorem 2 is the following simple rule: in each step, the current message holder u chooses a contact that is as close to the target t as possible, in the sense of lattice distance. Note that algorithm A makes use of even less information than is allowed by our general model: the current message holder does not need to know anything about the set of previous message holders. To analyze an execution of algorithm A, we say that it is in phase j if the lattice distance from the current message holder to the target is between $2^j$ and $2^{j+1}$. We show that in phase j, the expected time before the current message holder has a long-range contact within lattice distance $2^j$ of t is bounded proportionally to log n; at this point, phase j will come to an end. As there are at most 1 + log n phases, a bound proportional to $(\log n)^2$ follows. Interestingly, the analysis matches our intuition, and Milgram's description, of how a short chain is found in real life: "The geographic movement of the [message] from Nebraska to Massachusetts is striking. There is a progressive closing in on the target area as each new person is added to the chain" [13].
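The greedy rule of algorithm A can be illustrated with a minimal Python sketch. It assumes p = q = 1 on an n x n grid and draws each node's single long-range contact with probability proportional to $d(u,v)^{-r}$ only when the message first arrives there. The function names (`lattice_distance`, `sample_long_range_contact`, `greedy_route`) and the `seed` parameter are hypothetical, chosen for illustration, and the sampling is a naive O(n^2) scan per step rather than anything tuned for large grids.

```python
import random


def lattice_distance(a, b):
    """Lattice (Manhattan) distance between two grid points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def sample_long_range_contact(u, n, r, rng):
    """Draw one long-range contact v != u with Pr[v] proportional to d(u, v)^(-r)."""
    nodes = [(i, j) for i in range(n) for j in range(n) if (i, j) != u]
    weights = [lattice_distance(u, v) ** (-r) for v in nodes]
    return rng.choices(nodes, weights=weights, k=1)[0]


def greedy_route(s, t, n, r=2.0, seed=0):
    """Forward the message greedily from s to t; return the number of steps.

    Assumes p = q = 1. The long-range contact of a node is generated
    lazily, at the moment the message reaches that node.
    """
    rng = random.Random(seed)
    u, steps = s, 0
    while u != t:
        i, j = u
        # Local contacts: the (up to four) nearest neighbors inside the grid.
        contacts = [
            (i + di, j + dj)
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= i + di < n and 0 <= j + dj < n
        ]
        contacts.append(sample_long_range_contact(u, n, r, rng))
        # Greedy rule: pass the message to the contact closest to the target.
        u = min(contacts, key=lambda v: lattice_distance(v, t))
        steps += 1
    return steps
```

Because some local contact always lies strictly closer to t, the distance to the target decreases in every step, so the loop terminates after at most d(s, t) steps; a call such as `greedy_route((0, 0), (9, 9), n=10)` therefore returns a value between 1 and 18.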
The impossibility result of Theorem 1 is based, fundamentally, on the fact that the uniform distribution prevents a decentralized algorithm from using any "clues" provided by the geometry of the grid. Roughly, we consider the set U of all nodes within lattice distance $n^{2/3}$ of t. With high probability, the source s will lie outside of U, and if the message is never passed from a node to a long-range contact in U, the number of steps needed to reach t will be at least proportional to $n^{2/3}$. But the probability that any message holder has a long-range contact in U is roughly $n^{-2/3}$, so the expected number of steps before a long-range contact in U is found is at least proportional to $n^{2/3}$ as well.

More generally, we can show a strong characterization theorem for this family of models: r = 2 is the only value for which there is a decentralized algorithm capable of producing chains whose length is a polynomial in log n:

Theorem 3 (a) Let $0 \le r < 2$. There is a constant $\alpha_r$, depending on p, q, r, but independent of n, so that the expected delivery time of any decentralized algorithm is at least $\alpha_r n^{(2-r)/3}$.
(b) Let $r > 2$. There is a constant $\alpha_r$, depending on p, q, r, but independent of n, so that the expected delivery time of any decentralized algorithm is at least $\alpha_r n^{(r-2)/(r-1)}$.

The complete proof of this theorem is given in Section 3. The proof of (a) is analogous to that of Theorem 1. The proof of (b), on the other hand, exposes a "dual" obstacle for decentralized algorithms: with a large value of r, it takes a significant amount of time before the message reaches a node with a long-range contact that is far away in lattice distance. This effectively limits the "speed" at which the message can travel from s to t.

Although we have focused on the two-dimensional grid, our analysis can be applied more broadly. We can generalize our results to k-dimensional lattice networks, for constant values of k, as well as to less structured graphs with analogous scaling properties. In the k-dimensional case, a decentralized algorithm can construct paths of length polynomial in log n if and only if r = k.
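The two exponents in Theorem 3 can be tabulated directly. The short Python sketch below evaluates the exponent on n in the lower bound as a function of the clustering exponent r for the two-dimensional grid. The function name `lower_bound_exponent` is hypothetical, and returning 0 at r = 2 is a convention adopted here to reflect that the delivery time in that case is polylogarithmic, so no positive power of n appears.

```python
def lower_bound_exponent(r):
    """Exponent e such that the expected delivery time is at least
    proportional to n**e, for clustering exponent r on the 2-D grid."""
    if r < 0:
        raise ValueError("the clustering exponent r must be non-negative")
    if r < 2:
        return (2 - r) / 3        # Theorem 3(a)
    if r > 2:
        return (r - 2) / (r - 1)  # Theorem 3(b)
    return 0.0                    # r = 2: polylogarithmic delivery time


# Tabulate the exponent at a few values of r.
for r in (0, 1, 2, 3, 4):
    print(r, lower_bound_exponent(r))
```

The exponent falls from 2/3 at r = 0 to 0 at r = 2 and then rises again toward 1 as r grows, which is the sense in which r = 2 is the unique value admitting efficient decentralized routing.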
Figure 2: The lower bound implied by Theorem 3. The x-axis is the value of r (the clustering exponent); the y-axis is the resulting exponent on n (the lower bound T on delivery time, given as $\log_n T$).

The results suggest a fundamental network property, distinct from diameter, that helps to explain the success of small-world experiments. One could think of it as the "transmission rate" of a class of networks: the minimum expected delivery time of any decentralized algorithm operating in a random network drawn from this class. Thus we see that minimizing the transmission rate of a network is not necessarily the same as minimizing its diameter. This may seem counter-intuitive at first, but in fact it formalizes a notion raised initially: in addition to having short paths, a network should contain latent structural cues that can be used to guide a message towards a target. The dependence of long-range connections on the geometry of the lattice is providing precisely such implicit information.

Other Related Work. There has been work aimed at modeling the way in which individuals in Milgram's experiments chose recipients for their letters. Some of this work is related in spirit to what we do here, though using very different perspectives and models. Killworth and Bernard [10], in their "reverse small-world experiments," asked a set of respondents to explain how they chose to send letters in a run of the small-world experiment, and used this information to look for common principles at an empirical level. At an analytical level, White [20] investigated the probability that a chain would "die out" through an individual's failure to participate, and Hunter and Shotland [8] studied the passage of a chain through different social "categories." In the context of a "referral system" for the World Wide Web, Kautz, Selman, and Shah [17] ran simulations of communication in an abstract social network in which each individual was given pre-defined accuracy and responsiveness parameters. The distinction between the mere existence of short paths linking points on the World Wide Web, and the ability of agents to find them, has also been raised recently in work of Albert, Jeong, and Barabasi [2, 9].

2 Upper Bound for the Inverse-Square Distribution

We now present proofs of the theorems discussed in the previous section. When we analyze a decentralized algorithm, we can adopt the following equivalent formulation of the model, which will make the exposition easier. Although our model considers all long-range contacts as being generated initially, at random, we invoke the "Principle of Deferred Decisions" (a common mechanism for analyzing randomized algorithms [14]) and assume that the long-range contacts of a node v are generated only when the message first reaches v. Since a decentralized algorithm does not learn the long-range contacts of v until the message reaches v, this formulation is equivalent for the purposes of analysis.

A comment on the notation: log n denotes the logarithm base 2, while ln n denotes the natural logarithm, base e.

Proof of Theorem 2. Since p = q = 1, we have a network in which each node u is connected to its four nearest neighbors in the lattice (two or three neighbors in the case of nodes on the boundary), and has a single long-range contact v. The probability that u chooses v as its long-range contact is $d(u,v)^{-2} / \sum_{v \ne u} d(u,v)^{-2}$,
and we have

$$\sum_{v \ne u} d(u,v)^{-2} \le \sum_{j=1}^{2n-2} (4j)(j^{-2}) = 4 \sum_{j=1}^{2n-2} j^{-1} \le 4 + 4\ln(2n-2) \le 4\ln(6n).$$

Thus, the probability that v is chosen is at least $[4\ln(6n)\, d(u,v)^2]^{-1}$.

The decentralized algorithm A is defined as follows: in each step, the current message holder u chooses a contact that is as close to the target t as possible, in the sense of lattice distance. For j > 0, we say that the execution of A is in phase j when the lattice distance from the current node to t is greater than $2^j$ and at most $2^{j+1}$. We say A is in phase 0 when the lattice distance to t is at most 2. Thus, the initial value of j is at most log n. Now, because the distance from the message to the target decreases strictly in each step, each node that becomes the message holder has not touched the message before; thus, we may assume that the long-range contact from the message holder is generated at this moment.

Suppose we are in phase j, $\log(\log n) \le j < \log n$, and the current message holder is u. What is the probability that phase j will end in this step? This requires the message to enter the set $B_j$ of nodes within lattice distance $2^j$ of t. There are at least

$$1 + \sum_{i=1}^{2^j} i = \tfrac{1}{2} 2^{2j} + \tfrac{1}{2} 2^{j} + 1 > 2^{2j-1}$$

nodes in $B_j$, each is within lattice distance $2^{j+1} + 2^j < 2^{j+2}$ of u, and hence each has a probability of at least $(4\ln(6n)\, 2^{2j+4})^{-1}$ of being the long-range contact of u. If any of these nodes is the long-range contact of u, it will be u's closest neighbor to t; thus the message enters $B_j$ with probability at least

$$\frac{2^{2j-1}}{4\ln(6n)\, 2^{2j+4}} = \frac{1}{128\ln(6n)}.$$

Let $X_j$ denote the total number of steps spent in phase j, $\log(\log n) \le j < \log n$. We have

$$\mathrm{E}X_j = \sum_{i=1}^{\infty} \Pr[X_j \ge i] \le \sum_{i=1}^{\infty} \left(1 - \frac{1}{128\ln(6n)}\right)^{i-1} = 128\ln(6n).$$

An analogous set of bounds shows that $\mathrm{E}X_j \le 128\ln(6n)$ for j = log n as well. Finally, if $0 \le j \le \log(\log n)$, then $\mathrm{E}X_j \le 128\ln(6n)$ holds for the simple reason that the algorithm can spend at most log n steps in phase j even if all nodes pass the message to a local contact.

Now, if X denotes the total number of steps spent by the algorithm, we have

$$X = \sum_{j=0}^{\log n} X_j,$$

and so by linearity of expectation we have $\mathrm{E}X \le (1 + \log n)(128\ln(6n)) \le \alpha_2 (\log n)^2$ for a suitable choice of $\alpha_2$.

3 Lower Bounds for Other Distributions

We first expand our model of a decentralized algorithm slightly; it will correspondingly strengthen the result to show a lower bound for this new model. An algorithm initially has knowledge of the grid structure, all the local contacts, and the locations of s and t. In step i, some set $S_i$ of nodes has touched the message. At this point, the algorithm has knowledge of all long-range contacts of all nodes in $S_i$. (Following our style of analysis, the long-range contacts of other nodes will be constructed only as the message reaches them.) Based on this information, it chooses any contact v of any node in $S_i$ that has not yet received the message (v need not be a contact of the current message holder) and it sends the message to v. The set $S_{i+1}$ thus contains one element more than $S_i$, and the algorithm iterates. This is the same as our initial model of a decentralized algorithm, except that we do not count steps in which the algorithm "backtracks" by sending the message through a node that has already received it.

For technical reasons, we will add one additional feature to the algorithms we consider. An algorithm will run for an infinite sequence of steps; initially it behaves as above, and once the message reaches t, the message remains at t in all subsequent steps. Thus, when we consider the ith step of a given algorithm, we need not worry that it has already terminated by this step.

We now prove the two parts of Theorem 3; note that part (a) implies Theorem 1 by setting r = 0. As in Section 2, we will invoke the Principle of Deferred Decisions [14] in the analysis.

Proof of Theorem 3a. We consider an arbitrary decentralized algorithm of the type described above, and consider the expected number of steps required for the message to travel from s to t, for nodes s and t generated uniformly at random from the grid.

Note that because we have the freedom to choose the constant $\alpha_r$, we may also assume that n is at least as large as some fixed absolute constant $n_0$. The probability that a node u chooses v as its ith out of q long-range
contacts is $d(u,v)^{-r} / \sum_{v \ne u} d(u,v)^{-r}$, and we have

$$\sum_{v \ne u} d(u,v)^{-r} \ge \sum_{j=1}^{n/2} (j)(j^{-r}) = \sum_{j=1}^{n/2} j^{1-r} \ge \int_{1}^{n/2} x^{1-r}\,dx \ge (2-r)^{-1}\left((n/2)^{2-r} - 1\right) \ge \frac{1}{(2-r)\,2^{3-r}} \cdot n^{2-r},$$

where the last line follows if we assume $n^{2-r} \ge 2^{3-r}$. Let $\delta = (2-r)/3$.

Let U denote the set of nodes within lattice distance $pn^{\delta}$ of t. Note that

$$|U| \le 1 + \sum_{j=1}^{pn^{\delta}} 4j \le 4p^2 n^{2\delta},$$

where we assume n is large enough that $pn^{\delta} \ge 2$. Define $\lambda = ((2-r)\,2^{7-r} q p^2)^{-1}$. Let $\mathcal{E}'$ be the event that within $\lambda n^{\delta}$ steps, the message reaches a node other than t with a long-range contact in U. Let $\mathcal{E}'_i$ be the event that in step i, the message reaches a node other than t with a long-range contact in U; thus $\mathcal{E}' = \bigcup_{i \le \lambda n^{\delta}} \mathcal{E}'_i$. Now, the node reached at step i has q long-range contacts that are generated at random when it is encountered; so we have

$$\Pr[\mathcal{E}'_i] \le \frac{q\,|U|}{\frac{1}{(2-r)\,2^{3-r}}\, n^{2-r}} \le \frac{(2-r)\,2^{5-r} q p^2 n^{2\delta}}{n^{2-r}}.$$

Since the probability of a union of events is bounded by the sum of their probabilities, and $3\delta = 2 - r$, we have

$$\Pr[\mathcal{E}'] \le \sum_{i \le \lambda n^{\delta}} \Pr[\mathcal{E}'_i] \le \frac{(2-r)\,2^{5-r} \lambda q p^2 n^{3\delta}}{n^{2-r}} = (2-r)\,2^{5-r} \lambda q p^2 = \frac{1}{4}.$$

We now define two further events. Let $\mathcal{F}$ denote the event that the chosen source and target are separated by a lattice distance of at least n/4. One can verify that $\Pr[\mathcal{F}] \ge \frac{1}{2}$. Since $\Pr[\overline{\mathcal{F}} \vee \mathcal{E}'] \le \frac{1}{2} + \frac{1}{4}$, we have $\Pr[\mathcal{F} \wedge \overline{\mathcal{E}'}] \ge \frac{1}{4}$.

Finally, let X denote the random variable equal to the number of steps taken for the message to reach t, and let $\mathcal{E}$ denote the event that the message reaches t within $\lambda n^{\delta}$ steps. We claim that if $\mathcal{F}$ occurs and $\mathcal{E}'$ does not occur, then $\mathcal{E}$ cannot occur. For suppose it does. Since $d(s,t) \ge n/4 > p\lambda n^{\delta}$, in any s-t path of at most $\lambda n^{\delta}$ steps, the message must be passed at least once from a node to a long-range contact. Moreover, the final time this happens, the long-range contact must lie in U. This contradicts our assumption that $\mathcal{E}'$ does not occur.

Thus, $\Pr[\mathcal{E} \mid \mathcal{F} \wedge \overline{\mathcal{E}'}] = 0$, hence $\mathrm{E}[X \mid \mathcal{F} \wedge \overline{\mathcal{E}'}] \ge \lambda n^{\delta}$. Since

$$\mathrm{E}X \ge \mathrm{E}[X \mid \mathcal{F} \wedge \overline{\mathcal{E}'}] \cdot \Pr[\mathcal{F} \wedge \overline{\mathcal{E}'}] \ge \tfrac{1}{4} \lambda n^{\delta},$$

part (a) of the theorem follows.

Proof of Theorem 3b. We now turn to part (b) of the theorem, when r > 2. Again we consider an arbitrary decentralized algorithm; and again, as necessary, we may assume that n is larger than some fixed absolute constant $n_0$. We write $\varepsilon = r - 2$. Consider a node u, and let v be a randomly generated long-range contact of u. The normalizing constant for the inverse rth-power distribution is at least 1, and so for any m, we have

$$\Pr[d(u,v) > m] \le \sum_{j=m+1}^{2n-2} (4j)(j^{-r}) = 4 \sum_{j=m+1}^{2n-2} j^{1-r} \le 4 \int_{m}^{\infty} x^{1-r}\,dx = 4\varepsilon^{-1} m^{-\varepsilon}.$$

We set $\beta = \varepsilon/(1+\varepsilon)$ and $\gamma = 1/(1+\varepsilon)$, so that $\beta + \gamma = 1$ and $\beta = \varepsilon\gamma$, and we let $\lambda'$ be a positive constant, depending only on q and $\varepsilon$, with $\lambda' < \frac{1}{4}$ and $4q\lambda'/\varepsilon \le \frac{1}{4}$. We assume n has been chosen large enough that $n^{\gamma} \ge p$. Let $\mathcal{E}'_i$ be the event that in step i, the message reaches a node $u \ne t$ that has a long-range contact v satisfying $d(u,v) > n^{\gamma}$. Let $\mathcal{E}' = \bigcup_{i \le \lambda' n^{\beta}} \mathcal{E}'_i$ be the event that this happens in the first $\lambda' n^{\beta}$ steps. Then

$$\Pr[\mathcal{E}'] \le \sum_{i \le \lambda' n^{\beta}} \Pr[\mathcal{E}'_i] \le \lambda' n^{\beta} \cdot 4q\varepsilon^{-1} n^{-\varepsilon\gamma} = 4q\lambda'\varepsilon^{-1} \le \frac{1}{4}.$$

As in part (a), let $\mathcal{F}$ denote the event that s and t are separated by a lattice distance of at least n/4, so that $\Pr[\mathcal{F} \wedge \overline{\mathcal{E}'}] \ge \frac{1}{4}$. Let X denote the random variable equal to the number of steps taken for the message to reach t, and let $\mathcal{E}$ denote the event that
the message reaches t within $\lambda' n^{\beta}$ steps. We claim that if $\mathcal{F}$ occurs and $\mathcal{E}'$ does not occur, then $\mathcal{E}$ cannot occur. For if $\mathcal{E}'$ does not occur, then the message can move a lattice distance of at most $n^{\gamma}$ in each of its first $\lambda' n^{\beta}$ steps. This is a total lattice distance of at most $\lambda' n^{\beta+\gamma} = \lambda' n$, which is less than n/4 when $\lambda' < \frac{1}{4}$, and so the message will not reach t given that $\mathcal{F}$ occurs. Thus $\mathrm{E}[X \mid \mathcal{F} \wedge \overline{\mathcal{E}'}] \ge \lambda' n^{\beta}$. Since

$$\mathrm{E}X \ge \mathrm{E}[X \mid \mathcal{F} \wedge \overline{\mathcal{E}'}] \cdot \Pr[\mathcal{F} \wedge \overline{\mathcal{E}'}] \ge \tfrac{1}{4} \lambda' n^{\beta},$$

part (b) of the theorem follows.

4 Conclusion

Algorithmic work in different settings has considered the problem of routing with local information; see for example the problem of designing compact routing tables for communication networks [15] and the problem of robot navigation in an unknown environment [3]. Our results are technically quite different from these; but they share the general goal of identifying qualitative properties of networks that make routing with local information tractable, and offering a model for reasoning about effective routing schemes in such networks. While we have deliberately focused on a very clean model, we believe that a more general conclusion can be drawn for small-world networks: that the correlation between local structure and long-range connections provides fundamental cues for finding paths through the network.

When this correlation is near a critical threshold, the structure of the long-range connections forms a type of "gradient" that allows individuals to guide a message efficiently toward a target. As the correlation drops below this critical value and the social network becomes more homogeneous, these cues begin to disappear; in the limit, when long-range connections are generated uniformly at random, our model describes a world in which short chains exist but individuals, faced with a disorienting array of social contacts, are unable to find them.

Acknowledgements. We thank Steve Strogatz for many valuable discussions on this topic.

References

[1] L. Adamic, "The small world Web," Proceedings of the European Conf. on Digital Libraries, 1999.
[2] R. Albert, H. Jeong, A.-L. Barabasi, "The diameter of the World Wide Web," Nature 401, 130 (1999).
[3] P. Berman, "On-line searching and navigation," On-Line Algorithms: The State of the Art, A. Fiat and G. Woeginger, Eds., Springer, 1998.
[4] B. Bollobás, Random Graphs (Academic Press, London, 1985).
[5] B. Bollobás, F.R.K. Chung, "The diameter of a cycle plus a random matching," SIAM J. Discrete Math. 1, 328 (1988).
[6] F.R.K. Chung, M.R. Garey, "Diameter bounds for altered graphs," J. Graph Theory 8, 511 (1984).
[7] J. Guare, Six Degrees of Separation: A Play (Vintage Books, New York, 1990).
[8] J. Hunter and R. Shotland, "Treating data collected by the small world method as a Markov process," Social Forces 52, 321 (1974).
[9] J. Kaiser, Ed., "It's a small Web after all," Science 285, 1815 (1999).
[10] P. Killworth and H. Bernard, "Reverse small world experiment," Social Networks 1, 159 (1978).
[11] M. Kochen, Ed., The Small World (Ablex, Norwood, 1989).
[12] C. Korte and S. Milgram, "Acquaintance networks between racial groups: Application of the small world method," J. Personality and Social Psych. 15, 101 (1978).
[13] S. Milgram, "The small world problem," Psychology Today 1, 61 (1967).
[14] R. Motwani and P. Raghavan, Randomized Algorithms (Cambridge University Press, Cambridge, 1995).
[15] D. Peleg, E. Upfal, "A trade-off between size and efficiency for routing tables," Journal of the ACM 36 (1989).
[16] I. de Sola Pool and M. Kochen, "Contacts and influence," Social Networks 1, 5 (1978).
[17] H. Kautz, B. Selman, M. Shah, "ReferralWeb: Combining Social Networks and Collaborative Filtering," Communications of the ACM, 30, 3 (March 1997).
[18] J. Travers and S. Milgram, "An experimental study of the small world problem," Sociometry 32, 425 (1969).
[19] D. Watts and S. Strogatz, "Collective dynamics of small-world networks," Nature 393, 440 (1998).
[20] H. White, "Search parameters for the small world problem," Social Forces 49, 259 (1970).