Downloaded from rsif.royalsocietypublishing.org on November 19, 2012

Journal of the Royal Society Interface
The complex network of global cargo ship movements
Pablo Kaluza, Andrea Kölzsch, Michael T. Gastner and Bernd Blasius
J. R. Soc. Interface, published online 19 January 2010, doi: 10.1098/rsif.2009.0495

Supplementary data: "Data Supplement", http://rsif.royalsocietypublishing.org/content/suppl/2010/01/19/rsif.2009.0495.DC1.htm
References: This article cites 40 articles, 13 of which can be accessed free, http://rsif.royalsocietypublishing.org/content/early/2010/01/19/rsif.2009.0495.full.html#ref-list-1
Published online 19 January 2010 in advance of the print journal.
The complex network of global cargo ship movements

Pablo Kaluza, Andrea Kölzsch, Michael T. Gastner and Bernd Blasius*

Institute for Chemistry and Biology of the Marine Environment, Carl von Ossietzky Universität, Carl-von-Ossietzky-Straße 9-11, 26111 Oldenburg, Germany

*Author for correspondence (blasius@icbm.de).
Electronic supplementary material is available at http://dx.doi.org/10.1098/rsif.2009.0495 or via http://rsif.royalsocietypublishing.org.
J. R. Soc. Interface, doi:10.1098/rsif.2009.0495 (published online). Received 11 November 2009. Accepted 21 December 2009. This journal is © 2010 The Royal Society.

Transportation networks play a crucial role in human mobility, the exchange of goods and the spread of invasive species. With 90 per cent of world trade carried by sea, the global network of merchant ships provides one of the most important modes of transportation. Here, we use information about the itineraries of 16 363 cargo ships during the year 2007 to construct a network of links between ports. We show that the network has several features that set it apart from other transportation networks. In particular, most ships can be classified into three categories: bulk dry carriers, container ships and oil tankers. These three categories do not only differ in the ships' physical characteristics, but also in their mobility patterns and networks. Container ships follow regularly repeating paths whereas bulk dry carriers and oil tankers move less predictably between ports. The network of all ship movements possesses a heavy-tailed distribution for the connectivity of ports and for the loads transported on the links with systematic differences between ship types. The data analysed in this paper improve current assumptions based on gravity models of ship movements, an important step towards understanding patterns of global trade and bioinvasion.

Keywords: complex network; cargo ships; bioinvasion; transportation

1. INTRODUCTION

The ability to travel, trade commodities and share information around the world with unprecedented efficiency is a defining feature of the modern globalized economy. Among the different means of transport, ocean shipping stands out as the most energy efficient mode of long-distance transport for large quantities of goods (Rodrigue et al. 2006). According to estimates, as much as 90 per cent of world trade is hauled by ships (International Maritime Organization 2006). In 2006, 7.4 billion tons of goods were loaded at the world's ports. The trade volume currently exceeds 30 trillion ton-miles and is growing at a rate faster than the global economy (United Nations Conference on Trade and Development 2007).

The worldwide maritime network also plays a crucial role in today's spread of invasive species. Two major pathways for marine bioinvasion are discharged water from ships' ballast tanks (Ruiz et al. 2000) and hull fouling (Drake & Lodge 2007). Even terrestrial species such as insects are sometimes inadvertently transported in shipping containers (Lounibos 2002). In several parts of the world, invasive species have caused dramatic levels of species extinction and landscape alteration, thus damaging ecosystems and creating hazards for human livelihoods, health and local economies (Mack et al. 2000). The financial loss owing to bioinvasion is estimated to be $120 billion per year in the USA alone (Pimentel et al. 2005).

Despite affecting everybody's daily lives, the shipping industry is far less in the public eye than other sectors of the global transport infrastructure. Accordingly, it has also received little attention in the recent literature on complex networks (Wei et al. 2007; Hu & Zhu 2009). This neglect is surprising considering the current interest in networks (Albert & Barabási 2002; Newman 2003a; Gross & Blasius 2008), especially airport (Barrat et al. 2004; Guimerà & Amaral 2004; Hufnagel et al. 2004; Guimerà et al. 2005), road (Buhl et al. 2006; Barthélemy & Flammini 2008) and train networks (Latora & Marchiori 2002; Sen et al. 2003). In the spirit of current network research, we take here a large-scale perspective on the global cargo ship network (GCSN) as a complex system defined as the network of ports that are connected by links if ship traffic passes between them.

Similar research in the past had to make strong assumptions about flows on hypothetical networks with connections between all pairs of ports in order to approximate ship movements (Drake & Lodge 2004; Tatem et al. 2006). By contrast, our analysis is based on comprehensive data of real ship journeys allowing us to construct the actual network. We show that it has a small-world topology where the combined cargo capacity of ships calling at a given port (measured in
gross tonnage (GT)) follows a heavy-tailed distribution. This capacity scales superlinearly with the number of directly connected ports. We identify the most central ports in the network and find several groups of highly interconnected ports showing the importance of regional geopolitical and trading blocks.

A high-level description of the complete network, however, does not yet fully capture the network's complexity. Unlike previously studied transportation networks, the GCSN has a multi-layered structure. There are, broadly speaking, three classes of cargo ships (container ships, bulk dry carriers and oil tankers) that span distinct subnetworks. Ships in different categories tend to call at different ports and travel in distinct patterns. We analyse the trajectories of individual ships in the GCSN and develop techniques to extract quantitative information about characteristic movement types. With these methods, we can quantify that container ships sail along more predictable, frequently repeating routes than oil tankers or bulk dry carriers. We compare the empirical data with theoretical traffic flows calculated by the gravity model. Simulation results, based on the full GCSN data or the gravity model, differ significantly in a population-dynamic model for the spread of invasive species between the world's ports. Predictions based on the real network are thus more informative for international policy decisions concerning the stability of worldwide trade and for reducing the risks of bioinvasion.

2. DATA

An analysis of global ship movements requires detailed knowledge of ships' arrival and departure times at their ports of call. Such data have become available in recent years. Starting in 2001, ships and ports have begun installing automatic identification system (AIS) equipment. AIS transmitters on board the ships automatically report the arrival and departure times to the port authorities. This technology is primarily used to avoid collisions and increase port security, but arrival and departure records are also made available by Lloyd's Register Fairplay for commercial purposes as part of its Sea-web database (www.sea-web.com). AIS devices have not been installed in all ships and ports yet, and therefore there are some gaps in the data. Still, all major ports and the largest ships are included; thus the database represents the majority of cargo transported on ships.

Our study is based on Sea-web's arrival and departure records in the calendar year 2007 as well as Sea-web's comprehensive data on the ships' physical characteristics. We restrict our study to cargo ships bigger than 10 000 GT, which make up 93 per cent of the world's total capacity for cargo ship transport. From these, we select all 16 363 ships for which AIS data are available, taken as representative of the global traffic and long-distance trade between the 951 ports equipped with AIS receivers (for details see electronic supplementary material). For each ship, we obtain a trajectory from the database, i.e. a list of ports visited by the ship sorted by date. In 2007, there were 490 517 non-stop journeys linking 36 351 distinct pairs of arrival and departure ports. The complete set of trajectories, each path representing the shortest route at sea and coloured by the number of journeys passing through it, is shown in figure 1a.

Each trajectory can be interpreted as a small directed network where the nodes are ports linked together if the ship travelled directly between the ports. Larger networks can be defined by merging trajectories of different ships. In this article, we aggregate trajectories in four different ways: the combined network of all available trajectories and the subnetworks of container ships (3100 ships), bulk dry carriers (5498) and oil tankers (2628). These three subnetworks together cover 74 per cent of the GCSN's total GT.
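As an illustration of how a trajectory translates into non-stop journeys, the following minimal Python sketch splits each port list into consecutive port pairs and counts journeys and distinct directed pairs. The ship identifiers, port sequences and function names are hypothetical toy values chosen for this example; they are not taken from the Sea-web data or from the authors' own code.

```python
from collections import Counter

# Hypothetical toy trajectories: ship id -> ports of call sorted by date.
trajectories = {
    "ship_A": ["Shanghai", "Singapore", "Suez Canal", "Antwerp", "Shanghai"],
    "ship_B": ["Houston", "Panama Canal", "Shanghai"],
    "ship_C": ["Shanghai", "Singapore", "Shanghai", "Singapore"],
}


def journeys(ports):
    """Return the non-stop journeys (consecutive port pairs) of one trajectory."""
    return list(zip(ports, ports[1:]))


def journey_counts(trajectories):
    """Count how often each directed port pair (i, j) is travelled by any ship."""
    counts = Counter()
    for ports in trajectories.values():
        counts.update(journeys(ports))
    return counts


counts = journey_counts(trajectories)
total_journeys = sum(counts.values())   # number of non-stop journeys
distinct_pairs = len(counts)            # number of distinct directed port pairs

print(total_journeys, distinct_pairs)
```

Applied to the full dataset, the same two quantities would correspond to the 490 517 journeys and 36 351 distinct port pairs reported above, assuming that a pair is counted as a directed (departure, arrival) combination.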
Figure 1. Routes, ports and betweenness centralities in the GCSN. (a) The trajectories of all cargo ships bigger than 10 000 GT during 2007. The colour scale indicates the number of journeys along each route. Ships are assumed to travel along the shortest (geodesic) paths on water. (b) A map of the 50 ports of highest betweenness centrality (colour scale: betweenness ×10^4, from 0 to >4) and a ranked list of the 20 most central ports: 1 Panama Canal; 2 Suez Canal; 3 Shanghai; 4 Singapore; 5 Antwerp; 6 Piraeus; 7 Terneuzen; 8 Plaquemines; 9 Houston; 10 Ijmuiden; 11 Santos; 12 Tianjin; 13 New York and New Jersey; 14 Europoort; 15 Hamburg; 16 Le Havre; 17 St Petersburg; 18 Bremerhaven; 19 Las Palmas; 20 Barcelona.

In all four networks, we assign a weight $w_{ij}$ to the link from port $i$ to $j$ equal to the sum of the
available space on all ships that have travelled on the link during 2007 measured in GT. If a ship made a journey from $i$ to $j$ more than once, its capacity contributes multiple times to $w_{ij}$.

3. THE GLOBAL CARGO SHIP NETWORK

The directed network of the entire cargo fleet is noticeably asymmetric, with 59 per cent of all linked pairs of ports being connected only in one direction. Still, the vast majority of ports (935 out of 951) belong to a single strongly connected component, i.e. for any two ports in this component, there are routes in both directions, though possibly visiting different intermediate ports. The routes are intriguingly short: only a few steps in the network are needed to get from one port to another. The shortest path length $l$ between two ports is the minimum number of non-stop connections one must take to travel between origin and destination. In the GCSN, the average over all pairs of ports is extremely small, $\langle l \rangle = 2.5$. Even the maximum shortest path between any two ports (e.g. from Skagway, Alaska, to the small Italian island of Lampedusa) is only of length $l_{\max} = 8$. In fact, the majority of all possible origin-destination pairs (52%) can already be connected by two steps or fewer.

Comparing these findings to those reported for the worldwide airport network (WAN) shows interesting differences and similarities. The high asymmetry of the GCSN has not been found in the WAN, indicating that ship traffic is structurally very different from aviation. Rather than being formed by the accumulation of back-and-forth trips, ship traffic seems to be governed by an optimal arrangement of unidirectional, often circular routes. This optimality also shows in the GCSN's small shortest path lengths. In comparison, in the WAN, the average and maximum shortest path lengths are $\langle l \rangle = 4.4$ and $l_{\max} = 15$, respectively (Guimerà et al. 2005), i.e. about twice as long as in the GCSN.
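The link weights, the share of one-way connections and the shortest path lengths discussed above can be sketched on a toy fleet as follows. This is a minimal illustration rather than the authors' implementation: the fleet data and all function names are hypothetical, path lengths are counted in hops by breadth-first search, and the average is taken over reachable ordered pairs only.

```python
from collections import defaultdict, deque

# Hypothetical toy fleet: (gross tonnage, ports of call sorted by date).
fleet = [
    (50_000, ["A", "B", "C", "A"]),
    (20_000, ["A", "B", "A", "B"]),
    (30_000, ["C", "D", "C"]),
]


def build_weights(fleet):
    """w[i][j] = summed GT over all journeys from port i to port j.

    A ship that repeats a journey contributes its GT once per journey.
    """
    w = defaultdict(dict)
    for gt, ports in fleet:
        for i, j in zip(ports, ports[1:]):
            w[i][j] = w[i].get(j, 0) + gt
    return w


def one_way_fraction(w):
    """Fraction of linked port pairs that are connected in one direction only."""
    pairs = {frozenset((i, j)) for i in w for j in w[i] if i != j}
    one_way = 0
    for pair in pairs:
        i, j = tuple(pair)
        if not (j in w.get(i, {}) and i in w.get(j, {})):
            one_way += 1
    return one_way / len(pairs)


def hop_distances(w, source):
    """Breadth-first search: number of non-stop connections from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in w.get(u, {}):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def path_length_stats(w):
    """Mean and maximum shortest path length over reachable ordered pairs."""
    nodes = set(w) | {j for i in w for j in w[i]}
    lengths = [d for s in nodes
               for t, d in hop_distances(w, s).items() if t != s]
    return sum(lengths) / len(lengths), max(lengths)


w = build_weights(fleet)
asymmetry = one_way_fraction(w)
mean_l, max_l = path_length_stats(w)
print(w["A"]["B"], asymmetry, round(mean_l, 3), max_l)
```

In the toy fleet, the link from A to B carries 90 000 GT because one ship travels it once and another twice, which mirrors the rule that repeated journeys contribute multiple times to the weight.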
Similar to the WAN, the GCSN is highly clustered: if a port X is linked to ports Y and Z, there is a high probability that there is also a connection from Y to Z. We calculated a clustering coefficient $C$ (Watts & Strogatz 1998) for directed networks and found $C = 0.49$, whereas random networks with the same number of nodes and links only yield $C = 0.04$ on average. Degree-dependent clustering coefficients $C_k$ reveal that clustering decreases with node degree (see electronic supplementary material). Therefore, the GCSN, like the WAN, can be regarded as a small-world network possessing short path lengths despite substantial clustering (Watts & Strogatz 1998). However, the average degree of the GCSN, i.e. the average number of links arriving at and departing from a given port (in-degree plus out-degree), $\langle k \rangle = 76.5$, is notably higher than in the WAN, where $\langle k \rangle = 19.4$ (Barrat et al. 2004). In the light of the network size (the WAN consists of 3880 nodes), this difference becomes even more pronounced, indicating that the GCSN is much more densely connected. This redundancy of links gives the network high structural robustness to the loss of routes for keeping up trade.

The degree distribution $P(k)$ shows that most ports have few connections, but there are some ports linked to hundreds of other ports (figure 2a). Similar right-skewed degree distributions have been observed in many real-world networks (Barabási & Albert 1999). While the GCSN's degree distribution is not exactly scale-free, the distribution of link weights, $P(w)$, follows approximately a power law $P(w) \propto w^{-\mu}$ with $\mu = 1.71 \pm 0.14$ (95% CI for linear regression, figure 2b, see also electronic supplementary material). By averaging the sums of the link weights arriving at and departing from port $i$, we obtain the node strength $s_i$ (Barrat et al. 2004).
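The degree and strength definitions just given can be made concrete with a short sketch. It assumes a dictionary of link weights `w[i][j]` in GT with hypothetical toy values; the degree is taken as in-degree plus out-degree and the strength as the average of the summed incoming and summed outgoing weights, following the wording above. The function name is an assumption for illustration.

```python
# Hypothetical toy link weights w[i][j] in GT (not values from the paper).
w = {
    "A": {"B": 90_000},
    "B": {"A": 20_000, "C": 50_000},
    "C": {"A": 50_000, "D": 30_000},
    "D": {"C": 30_000},
}


def degree_and_strength(w):
    """Return (k, s): k[i] = in-degree + out-degree,
    s[i] = average of summed incoming and summed outgoing link weights."""
    nodes = set(w) | {j for out in w.values() for j in out}
    k = dict.fromkeys(nodes, 0)
    s_in = dict.fromkeys(nodes, 0)
    s_out = dict.fromkeys(nodes, 0)
    for i, out in w.items():
        for j, w_ij in out.items():
            k[i] += 1          # link departs from i
            k[j] += 1          # link arrives at j
            s_out[i] += w_ij
            s_in[j] += w_ij
    s = {n: (s_in[n] + s_out[n]) / 2 for n in nodes}
    return k, s


k, s = degree_and_strength(w)
mean_k = sum(k.values()) / len(k)   # average degree <k>
print(k["C"], s["A"], mean_k)
```

Because every directed link is counted once at its origin and once at its destination, the average degree computed this way equals twice the number of links divided by the number of ports.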
The strength distribution can also be approximated by a power law $P(s) \propto s^{-\eta}$ with $\eta = 1.02 \pm 0.17$, meaning that a small number of ports handle huge amounts of cargo (figure 2c). The determination of power law relationships by line fitting has been strongly criticized (e.g. Newman 2005; Clauset et al. 2009); we therefore analysed the distributions with model selection by Akaike weights (Burnham & Anderson 1998). Our results confirm that a power law is a better fit than an exponential or a lognormal distribution for $P(w)$ and $P(s)$, but not $P(k)$ (see electronic supplementary material). These findings agree well with the concept of hubs–spokes networks (Notteboom 2004) that were proposed for cargo traffic, for example, in Asia (Robinson 1998). There are a few large, highly connected ports through which all smaller ports transact their trade. This scale-free property makes the ship trade network prone to the spreading and persistence of bioinvasive organisms (e.g. Pastor-Satorras & Vespignani 2001). The average nearest-neighbour degree, a measure of network assortativity, additionally underlines the hubs–spokes property of cargo ship traffic (see electronic supplementary material).

Strengths and degrees of the ports are related according to the scaling relation $\langle s(k) \rangle \propto k^{1.46 \pm 0.1}$ (95% CI for standardized major axis regression; Warton et al. 2006). Hence, the strength of a port generally grows faster than its degree (figure 2d). In other words, highly connected ports not only have many links, but their links also have a higher than average weight. This observation agrees with the fact that busy ports are better equipped to handle large ships with large amounts of cargo. A similar result, $\langle s(k) \rangle \propto k^{1.5 \pm 0.1}$, was found for airports (Barrat et al. 2004), which may hint at a general pattern in transportation networks.
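The scaling exponent of $\langle s(k) \rangle$ against $k$ can be sketched as follows. This is a minimal illustration, not the authors' procedure: strengths are first averaged over ports of equal degree, and the standardized major axis slope is then taken on log-log axes as the ratio of standard deviations, signed by the covariance. Confidence intervals are not computed, and the function names and synthetic data are hypothetical.

```python
import math
from collections import defaultdict


def mean_strength_by_degree(pairs):
    """Average the strengths of all ports sharing the same degree k."""
    groups = defaultdict(list)
    for k, s in pairs:
        groups[k].append(s)
    return {k: sum(v) / len(v) for k, v in groups.items()}


def sma_slope(xs, ys):
    """Standardized major axis fit of log(y) against log(x): returns (slope, intercept)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    syy = sum((b - my) ** 2 for b in ly)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    # SMA slope: ratio of standard deviations, carrying the sign of the covariance.
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    return slope, my - slope * mx


# Hypothetical synthetic ports: two ports per degree whose mean strength is 3 * k**1.5.
pairs = []
for k in (1, 2, 4, 8, 16):
    pairs += [(k, 0.5 * 3 * k ** 1.5), (k, 1.5 * 3 * k ** 1.5)]
averaged = mean_strength_by_degree(pairs)
ks = sorted(averaged)
slope, intercept = sma_slope(ks, [averaged[k] for k in ks])
print(slope, math.exp(intercept))
```

On noise-free synthetic data the SMA slope coincides with the ordinary least-squares slope. The two differ only when the points scatter around the line.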
In the light of bioinvasion, these results underline empirical findings that big ports are more heavily invaded because of increased propagule pressure by ballast water of more and larger ships (Williamson 1996; Mack et al. 2000; see Cohen & Carlton 1998).

A further indication of the importance of a node is its betweenness centrality (Freeman 1979; Newman 2004). The betweenness of a port is the number of topologically shortest directed paths in the network that pass through this port. In figure 1b, we plot and list the most central ports. Generally speaking, centrality and degree are strongly correlated (Pearson’s correlation coefficient: 0.81), but in individual cases other factors can also play a role. The Panama and Suez canals, for instance, are shortcuts to avoid long passages around
South America and Africa. Other ports have a high centrality because they are visited by a large number of ships (e.g. Shanghai), whereas others gain their status primarily by being connected to many different ports (e.g. Antwerp).

4. THE NETWORK LAYERS OF DIFFERENT SHIP TYPES

To compare the movements of cargo ships of different types, separate networks were generated for each of the three main ship types: container ships, bulk dry carriers and oil tankers. Applying the network parameters introduced in the previous section to these three subnetworks reveals some broad-scale differences (table 1). The network of container ships is densely clustered, $C = 0.52$, has a rather low mean degree, $\langle k \rangle = 32.44$, and a large mean number of journeys (i.e. number of times any ship passes) per link, $\langle J \rangle = 24.26$. The bulk dry carrier network, on the other hand, is less clustered, has a higher mean degree and fewer journeys per link ($C = 0.43$, $\langle k \rangle = 44.61$, $\langle J \rangle = 4.65$). For the oil tankers, we find intermediate values ($C = 0.44$, $\langle k \rangle = 33.32$, $\langle J \rangle = 5.07$). Note that the mean degrees $\langle k \rangle$ of the subnetworks are substantially smaller than that of the full GCSN, indicating that different ship types use essentially the same ports but different connections.

A similar tendency appears in the scaling of the link weight distributions (figure 2b). $P(w)$ can be approximated as power laws for each network, but with different exponents $\mu$. The container ships have the smallest exponent ($\mu = 1.42$) and bulk dry carriers the largest ($\mu = 1.93$) with oil tankers in between ($\mu = 1.73$). In contrast, the exponents for the distribution of node strength $P(s)$ are nearly identical in all three subnetworks, $\eta = 1.05$, $\eta = 1.13$ and $\eta = 1.01$, respectively.

These numbers give a first indication that different ship types move in distinctive patterns. Container ships typically follow set schedules visiting several ports in a fixed sequence along their way, thus providing regular services.
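How link weights and the mean number of journeys per link $\langle J \rangle$ arise from individual itineraries can be sketched as follows. The sketch assumes each ship is given as a capacity in GT together with its ordered list of port calls. This input layout, the function names and the two toy ships are hypothetical and do not reflect the format of the underlying database.

```python
from collections import Counter


def build_links(itineraries):
    """Accumulate journeys and capacity-weighted link weights over all ships.

    itineraries: iterable of (capacity_gt, [port, port, ...]) tuples (assumed layout).
    """
    journeys = Counter()  # number of times any ship passes each directed link
    weights = Counter()   # summed capacity; a ship contributes once per journey
    for capacity_gt, ports in itineraries:
        for origin, destination in zip(ports, ports[1:]):
            journeys[(origin, destination)] += 1
            weights[(origin, destination)] += capacity_gt
    return journeys, weights


def mean_journeys_per_link(journeys):
    """<J>: total number of journeys divided by the number of distinct directed links."""
    return sum(journeys.values()) / len(journeys)


# Hypothetical toy fleet: a ship on a repeating loop and a ship making a single trip.
fleet = [
    (50000, ["A", "B", "C", "A", "B"]),
    (20000, ["B", "C"]),
]
journeys, weights = build_links(fleet)
print(mean_journeys_per_link(journeys), dict(weights))
```

A fleet that repeats the same loop, as container ships tend to do, concentrates many journeys on few links and therefore yields a large $\langle J \rangle$.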
Figure 2. Degrees and weights in the GCSN. (a) The degree distributions $P(k)$ are right-skewed, but not power laws, neither for the GCSN nor its subnetworks. The degree $k$ is defined here as the sum of in- and out-degree, thus $k = 1$ is rather rare (asterisk, all ship types; square, container ships; circle, bulk dry carriers; triangle, oil tankers). (b) The link weight distributions $P(w)$ reveal clear power law relationships for the GCSN and the three subnetworks, with exponents $\mu$ characteristic for the movement patterns of the different ship types (asterisk, $\mu = 1.71 \pm 0.14$; square, $\mu = 1.42 \pm 0.15$; circle, $\mu = 1.93 \pm 0.11$; triangle, $\mu = 1.73 \pm 0.25$). (c) The node strength distributions $P(s)$ are also heavy tailed, showing power law relationships. The stated exponents are calculated by linear regression with 95% confidence intervals (similar results are obtained with maximum likelihood estimates, see electronic supplementary material) (asterisk, $\eta = 1.02 \pm 0.17$; square, $\eta = 1.05 \pm 0.13$; circle, $\eta = 1.13 \pm 0.21$; triangle, $\eta = 1.01 \pm 0.16$). (d) The average strength of a node $\langle s(k) \rangle$ scales superlinearly with its degree, $\langle s(k) \rangle \propto k^{1.46 \pm 0.1}$, indicating that highly connected ports have, on average, links of higher weight.

Bulk dry carriers, by contrast,
appear less predictable as they frequently change their routes on short notice depending on the current supply and demand of the goods they carry. The larger variety of origins and destinations in the bulk dry carrier network ($n = 616$ ports, compared with $n = 378$ for container ships) explains the higher average degree and the smaller number of journeys for a given link. Oil tankers also follow short-term market trends, but, because they can only load oil and oil products, the number of possible destinations ($n = 505$) is more limited than for bulk dry carriers.

These differences are also underlined by the betweenness centralities of the three network layers (see electronic supplementary material). While some ports rank highly in all categories (e.g. Suez Canal, Shanghai), others specialize in certain ship types. For example, the German port of Wilhelmshaven ranks tenth in terms of its worldwide betweenness for oil tankers, but is only 241st for bulk dry carriers and 324th for container ships.

We can gain further insight into the roles of the ports by examining their community structure. Communities are groups of ports with many links within the groups but few links between different groups. We calculated these communities for the three subnetworks with a modularity optimization method for directed networks (Leicht & Newman 2008) and found that they differ significantly from modularities of corresponding Erdős–Rényi graphs (figure 3; Guimerà et al. 2004). The network of container trade shows 12 communities (figure 3a). The largest ones are located (i) on the Arabian, Asian and South African coasts, (ii) on the North American east coast and in the Caribbean, (iii) in the Mediterranean, the Black Sea and on the European west coast, (iv) in Northern Europe and (v) in the Far East and on the American west coast. The transport of bulk dry goods reveals seven groups (figure 3b). Some can be interpreted as geographical entities (e.g.
North American east coast, trans-Pacific trade), while others are dispersed on multiple continents. Especially interesting is the community structure of the oil transportation network that shows six groups (figure 3c): (i) the European, north and west African market, (ii) a large community comprising Asia, South Africa and Australia, (iii) three groups for the Atlantic market with trade between Venezuela, the Gulf of Mexico, the American east coast and Northern Europe and (iv) the American Pacific coast. It should be noted that the network includes the transport of crude oil as well as commerce with already refined oil products so that oil-producing regions do not appear as separate communities. This may be due to the limit in the detectability of smaller communities by modularity optimization (Fortunato & Barthelemy 2007), but does not affect the relevance of the revealed ship traffic communities. Because transport intensity is, by definition, higher within communities, bioinvasive spread is expected to be heavier between ports of the same community. However, figure 3 makes clear that there are no strict geographical barriers between communities. Thus, spread between communities is very likely to occur even on small spatial scales by shipping or ocean currents between close-by ports that belong to different communities.

Despite the differences between the three main cargo fleets, there is one unifying feature: their motif distribution (Milo et al. 2002). Like most previous studies, we focus here on the occurrence of three-node motifs and present their normalized Z score, a measure for their abundance in a network (figure 4). Strikingly, the three fleets have practically the same motif distribution. In fact, the Z scores closely resemble those found in the World Wide Web and different social networks that were conjectured to form a superfamily of networks (Milo et al. 2004). This superfamily displays many transitive triplet interactions (i.e. if X → Y and Y → Z, then X → Z); for example, the overrepresented motif 13 in figure 4 has six such interactions. Intransitive motifs, like motif 6, are comparably infrequent. The abundance of transitive interactions in the ship networks indicates that cargo can be transported both directly between ports as well as via several intermediate ports. Thus, the high clustering and redundancy of links (robustness to link failures) appear not only in the GCSN but also in the three subnetworks. The similarity of the motif distributions to other humanly optimized networks underlines that cargo trade, like social networks and the World Wide Web, depends crucially on human interactions and information exchange. While advantageous for the robustness of trade, the clustering of links as triplets also has an unwanted side effect: in general, the more clustered a network, the more vulnerable it becomes to the global spread of alien species, even for low invasion probabilities (Newman 2003b).

Table 1. Characterization of different subnetworks. Number of ships, total GT ($10^6$ GT) and number of ports $n$ in each subnetwork; together with network characteristics: mean degree $\langle k \rangle$, clustering coefficient $C$, mean shortest path length $\langle l \rangle$, mean journeys per link $\langle J \rangle$, power law exponents $\mu$ and $\eta$; and trajectory properties: average number of distinct ports $\langle N \rangle$, links $\langle L \rangle$, port calls $\langle S \rangle$ per ship and regularity index $\langle p \rangle$. Some notable values are highlighted in bold.

| ship class | ships | MGT | $n$ | $\langle k \rangle$ | $C$ | $\langle l \rangle$ | $\langle J \rangle$ | $\mu$ | $\eta$ | $\langle N \rangle$ | $\langle L \rangle$ | $\langle S \rangle$ | $\langle p \rangle$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| whole fleet | 16 363 | 664.7 | 951 | 76.4 | 0.49 | 2.5 | 13.57 | 1.71 | 1.02 | 10.4 | 15.6 | 31.8 | 0.63 |
| container ships | 3100 | 116.8 | 378 | 32.4 | 0.52 | 2.76 | 24.25 | 1.42 | 1.05 | 11.2 | 21.2 | 48.9 | 1.84 |
| bulk dry carriers | 5498 | 196.8 | 616 | 44.6 | 0.43 | 2.57 | 4.65 | 1.93 | 1.13 | 8.9 | 10.4 | 12.2 | 0.03 |
| oil tankers | 2628 | 178.4 | 505 | 33.3 | 0.44 | 2.74 | 5.07 | 1.73 | 1.01 | 9.2 | 12.9 | 17.7 | 0.19 |

5. NETWORK TRAJECTORIES

Going beyond the network perspective, the database also provides information about the movement
characteristics per individual ship (table 1). The average number of distinct ports per ship $\langle N \rangle$ does not differ much between different ship classes, but container ships call much more frequently at ports than bulk dry carriers and oil tankers. This difference is explained by the characteristics and operational mode of these ships. Normally, container ships are fast (between 20 and 25 knots) and spend less time (1.9 days on average in our data) in the ports for cargo operations. By contrast, bulk dry carriers and oil tankers move more slowly (between 13 and 17 knots) and stay longer in the ports (on average 5.6 days for bulk dry carriers and 4.6 days for oil tankers).

The speed at sea and of cargo handling, however, is not the only operational difference.

Figure 3. Communities of ports in three cargo ship subnetworks. The communities are groups of ports that maximize the number of links within the groups, as opposed to between the groups, in terms of the modularity $Q$ (Leicht & Newman 2008). In each map, the colours represent the $c$ distinct trading communities for the goods transported by (a) container ships ($c = 12$, $Q = 0.605$), (b) bulk dry carriers ($c = 7$, $Q = 0.592$) and (c) oil tankers ($c = 6$, $Q = 0.716$). All modularities $Q$ of the examined networks differ significantly from modularities in Erdős–Rényi graphs of the same size and number of links (Guimerà et al. 2004). For the networks corresponding to (a), (b) and (c), values are $Q_{\mathrm{ER}} = 0.219$, $Q_{\mathrm{ER}} = 0.182$ and $Q_{\mathrm{ER}} = 0.220$, respectively.

The topology of
the trajectories also differs substantially. Characteristic sample trajectories for each ship type are presented in figure 5a–c. The container ship (figure 5a) travels on some of the links several times during the study period, whereas the bulk dry carrier (figure 5b) passes almost every link exactly once. The oil tanker (figure 5c) commutes a few times between some ports, but by and large also serves most links only once.

We can express these trends in terms of a ‘regularity index’ $p$ that quantifies how much the frequency with which each link is used deviates from a random network. Consider the trajectory of a ship calling $S$ times at $N$ distinct ports and travelling on $L$ distinct links. We compare the mean number of journeys per link $f_{\mathrm{real}} = S/L$ to the average link usage $f_{\mathrm{ran}}$ in an ensemble of randomized trajectories with the same number of nodes $N$ and port calls $S$. To quantify the difference between real and random trajectories, we calculate the Z score $p = (f_{\mathrm{real}} - f_{\mathrm{ran}})/\sigma$ (where $\sigma$ is the standard deviation of $f$ in the random ensemble). If $p = 0$, the real trajectory is indistinguishable from a random walk, whereas larger values of $p$ indicate that the movement is more regular. Figure 5d–f presents the distributions of the regularity index $p$ for the different fleets. For container ships, $p$ is distributed broadly around $p \approx 2$, thus supporting our earlier observation that most container ships provide regular services between ports along their way. Trajectories of bulk dry carriers and oil tankers, on the other hand, appear essentially random with the vast majority of ships near $p = 0$.

6. APPROXIMATING TRAFFIC FLOWS USING THE GRAVITY MODEL

In this article, we view global ship movements as a network based on detailed arrival and departure records. Until recently, surveys of seaborne trade had to rely on far fewer data: only the total number of arrivals at some major ports was publicly accessible, but not the ships’ actual paths (Zachcial & Heideloff 2001).
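The regularity index $p$ defined in section 5 can be illustrated with a minimal sketch. The randomized ensemble is an assumption here: each random trajectory is a walk of $S$ port calls over $N$ ports in which the next port is drawn uniformly from the other $N - 1$ ports, and it is not forced to visit all $N$ ports. The function names, ensemble size and toy trajectory are hypothetical.

```python
import random
from statistics import mean, pstdev


def link_usage(trajectory):
    """Mean journeys per link, f = S / L, for a sequence of S port calls."""
    links = set(zip(trajectory, trajectory[1:]))  # distinct directed links
    return len(trajectory) / len(links)


def random_trajectory(n_ports, n_calls, rng):
    """Assumed null model: a walk that never stays put and picks the next port uniformly."""
    current = rng.randrange(n_ports)
    walk = [current]
    while len(walk) < n_calls:
        nxt = rng.randrange(n_ports - 1)
        if nxt >= current:  # skip over the current port
            nxt += 1
        walk.append(nxt)
        current = nxt
    return walk


def regularity_index(trajectory, n_samples=1000, seed=0):
    """Z score p = (f_real - mean(f_ran)) / sigma against the random ensemble."""
    rng = random.Random(seed)
    n_ports, n_calls = len(set(trajectory)), len(trajectory)
    f_ran = [link_usage(random_trajectory(n_ports, n_calls, rng)) for _ in range(n_samples)]
    sigma = pstdev(f_ran)
    if sigma == 0:
        raise ValueError("random ensemble has no spread; p is undefined")
    return (link_usage(trajectory) - mean(f_ran)) / sigma


# Hypothetical liner service: a ship repeating the same five-port loop eight times.
loop = [0, 1, 2, 3, 4] * 8
p = regularity_index(loop)
print(link_usage(loop), p)
```

A strictly repeating loop reuses the same five links throughout, so its link usage lies far above the random ensemble and $p$ is strongly positive. A ship that serves almost every link only once would score near zero.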
Missing information about the frequency of journeys, thus, had to be replaced by plausible assumptions, the gravity model being the most popular choice. It posits that trips are, in general, more likely between nearby ports than between ports far apart. If $d_{ij}$ is the distance between ports i and j, the decline in mutual interaction is expressed in terms of a distance deterrence function $f(d_{ij})$. The number of journeys from i to j then takes the form $F_{ij} = a_i b_j O_i I_j f(d_{ij})$, where $O_i$ is the total number of departures from port i and $I_j$ the number of arrivals at j (Haynes & Fotheringham 1984). The coefficients $a_i$ and $b_j$ are needed to ensure $\sum_j F_{ij} = O_i$ and $\sum_i F_{ij} = I_j$.

How well can the gravity model approximate real ship traffic? We choose a truncated power law for the deterrence function, $f(d_{ij}) = d_{ij}^{-\beta} \exp(-d_{ij}/\kappa)$. The strongest correlation between model and data is obtained for $\beta = 0.59$ and $\kappa = 4900$ km (see electronic supplementary material). At first sight, the agreement between data and model appears indeed impressive. The predicted distribution of travelled distances (figure 6a) fits the data far better than a simpler non-spatial model that preserves the total number of journeys, but assumes completely random origins and destinations.

A closer look at the gravity model, however, reveals its limitations. In figure 6b, we count how often links with an observed number of journeys $N_{ij}$ are predicted to be passed $F_{ij}$ times. Ideally, all data points would align along the diagonal $F_{ij} = N_{ij}$, but we find that the data are substantially scattered.

Figure 4. Motif distributions of the three main cargo fleets (normalized Z score plotted against subgraphs 1 to 13). A positive (negative) normalized Z score indicates that a motif is more (less) frequent in the real network than in random networks with the same degree sequence. For comparison, we overlay the Z scores of the World Wide Web and social networks. The agreement suggests that the ship networks fall in the same superfamily of networks (Milo et al. 2004). The motif distributions of the fleets are maintained even when 25, 50 and 75% of the weakest connections are removed. Line with open circles, www-1; line with open squares, www-2; line with open diamonds, www-3; line with triangles, social-1; line with crosses, social-2; line with asterisks, social-3; line with red circles, bulk dry carriers; line with blue squares, container ships; line with black diamonds, oil tankers.

Although the parameters $\beta$ and $\kappa$ were chosen to minimize the scatter,
the correlation between data and model is only moderate (Kendall’s $\tau = 0.433$). In some cases, the prediction is off by several thousand journeys per year.

Recent studies have used the gravity model to pinpoint the ports and routes central to the spread of invasive species (Drake & Lodge 2004; Tatem et al. 2006). The model’s shortcomings pose the question as to how reliable such predictions are. For this purpose, we investigated a dynamic model of ship-mediated bioinvasion where the weights of the links are either the observed traffic flows or the flows of the gravity model.

We follow previous epidemiological studies (Rvachev & Longini 1985; Flahault et al. 1988; Hufnagel et al. 2004; Colizza et al. 2006) in viewing the spread on the network as a metapopulation process where the population dynamics on the nodes are coupled by transport on the links. In our model, ships can transport a surviving population of an invasive species with only a small probability $p_{\mathrm{trans}} = 1\%$ on each journey between two successively visited ports. The transported population is only a tiny fraction $s$ of the population at the port of origin. Immediately after arriving at a new port, the species experiences strong demographic fluctuations that lead, in most cases, to the death of the imported population.
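Since the gravity-model flows $F_{ij}$ serve as one of the two sets of link weights in this model, a minimal sketch of how such flows could be computed may be helpful at this point. It uses the form $F_{ij} = a_i b_j O_i I_j f(d_{ij})$ and the truncated power law from section 6 with $\beta = 0.59$ and $\kappa = 4900$ km. The balancing coefficients $a_i$ and $b_j$ are obtained by alternately enforcing the row and column constraints (iterative proportional fitting), which is a standard approach but is not stated in the main text to be the authors' method. The function names, the fixed iteration count, the exclusion of journeys from a port to itself, and the assumption that every port has positive departures and arrivals are choices made for this sketch.

```python
import math


def deterrence(d_km, beta=0.59, kappa_km=4900.0):
    """Truncated power law f(d) = d**(-beta) * exp(-d / kappa)."""
    return d_km ** (-beta) * math.exp(-d_km / kappa_km)


def gravity_flows(dist, out_totals, in_totals, n_iter=1000):
    """Doubly constrained gravity model F_ij = a_i b_j O_i I_j f(d_ij).

    dist[i][j] is the distance in km between ports i and j, out_totals[i]
    the departures O_i and in_totals[j] the arrivals I_j (all positive,
    with equal sums). Self-journeys are excluded (assumption of this sketch).
    """
    n = len(dist)
    f = [
        [deterrence(dist[i][j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    a = [1.0] * n
    b = [1.0] * n
    for _ in range(n_iter):
        # Choose a_i so that sum_j F_ij = O_i for the current b.
        a = [
            1.0 / sum(b[j] * in_totals[j] * f[i][j] for j in range(n))
            for i in range(n)
        ]
        # Choose b_j so that sum_i F_ij = I_j for the updated a.
        b = [
            1.0 / sum(a[i] * out_totals[i] * f[i][j] for i in range(n))
            for j in range(n)
        ]
    return [
        [a[i] * b[j] * out_totals[i] * in_totals[j] * f[i][j] for j in range(n)]
        for i in range(n)
    ]
```

Given a distance matrix and the departure and arrival totals of each port, `gravity_flows(dist, O, I)` returns a matrix whose row sums reproduce $O_i$ and whose column sums reproduce $I_j$; all information about which specific port pairs are actually connected is absent from the input.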
[Figure 5: panels (a) to (c) show sample trajectories as port-to-port diagrams with the number of journeys marked on each link; panels (d) to (f) show the distribution $D(p)$ against $p$ (0 to 8) for container ships, bulk dry carriers and oil tankers. Port labels in the diagrams: Mina Al Ahmadi, Terneuzen, Buenos Aires, Montevideo, Jebel Ali, Jubail, Port Rashid, Santos, Kalamaki, Yuzhny, Piraeus, Algiers, Rouen, Ghazaouet, Port Jerome, El Dekheila, Ras Laffan, Antwerp, Hamburg, Rotterdam, Botlek, Vlaardingen, Cuxhaven, Emden, Jebel Dhanna/Ruwais, Singapore.]

Figure 5. Sample trajectories of (a) a container ship with a regularity index $p = 2.09$, (b) a bulk dry carrier, $p = 0.098$ and (c) an oil tanker, $p = 1.027$. In the three trajectories, the numbers and the line thickness indicate the frequency of journeys on each link. (d–f) Distribution of $p$ for the three main fleets.

If, however, the new immigrants beat the odds of this ‘ecological roulette’ (Carlton & Geller 1993) and establish, the population $P$ grows rapidly following the stochastic logistic equation $\mathrm{d}P/\mathrm{d}t = rP(1 - P) + \sqrt{P}\,\xi(t)$ with growth rate $r = 1$
per year and Gaussian white noise $\xi$. For details of the model, see the electronic supplementary material.

Starting from a single port at carrying capacity $P = 1$, we model contacts between ports as Poisson processes with rates $N_{ij}$ (empirical data) or $F_{ij}$ (gravity model). As shown in figure 7a, the gravity model systematically overestimates the spreading rate, and the difference can become particularly pronounced for ports that are well connected, but not among the central hubs in the network (figure 7b). Comparing typical sequences of invaded ports, we find that the invasions driven by the real traffic flows tend to be initially confined to smaller regional ports, whereas in the gravity model the invasions quickly reach the hubs. The total out- and in-flows at the ship journeys’ origin and destination ports, respectively, are indeed more strongly positively correlated in reality than in the model ($\tau = 0.157$ versus 0.047). The gravity model thus erases too many details of a hierarchical structure present in the real network. That the gravity model eliminates most correlations is also plausible from simple analytic arguments (see electronic supplementary material for details). The absence of strong correlations makes the gravity model a suitable null hypothesis if the correlations in the real network are unknown, but several recent studies have shown that correlations play an important role in spreading processes on networks (e.g. Boguñá & Pastor-Satorras 2002; Newman 2002). Hence, if the correlations are known, they should not be ignored.

While we observed that the spreading rates for the AIS data were consistently slower than for the gravity model even when different parameters or population models were considered, the time scale of the invasion is much less predictable. The assumption that only a small fraction of invaders succeed outside their native habitat appears realistic (Mack et al. 2000).
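The ingredients of the invasion model described in this section (Poisson contacts between ports, a small transport probability $p_{\mathrm{trans}}$, a transported fraction $s$, and stochastic logistic growth) can be combined into a rough simulation sketch. The details of the actual model are given only in the electronic supplementary material, so several choices here are assumptions: the Euler-Maruyama discretization with time step `dt`, the noise intensity `noise_strength` (the main text does not state the strength of $\xi$), counting at most one successful transport per link and time step, not depleting the origin population, and all function names.

```python
import math
import random


def logistic_step(p, dt, rng, r=1.0, noise_strength=1e-3):
    """One Euler-Maruyama step of dP/dt = r P (1 - P) + sqrt(P) xi(t).

    noise_strength is the assumed variance of xi per unit time; it is a
    hypothetical value, not taken from the article.
    """
    if p <= 0.0:
        return 0.0  # extinction is absorbing
    drift = r * p * (1.0 - p) * dt
    noise = math.sqrt(noise_strength * p * dt) * rng.gauss(0.0, 1.0)
    return max(p + drift + noise, 0.0)


def simulate_invasion(rates, start, years, dt=0.01, p_trans=0.01, s=4e-5, seed=0):
    """Return port populations (in units of carrying capacity) after `years`.

    rates[i][j] is the number of journeys per year from port i to port j,
    e.g. observed flows N_ij or gravity-model flows F_ij.
    """
    rng = random.Random(seed)
    n = len(rates)
    pop = [0.0] * n
    pop[start] = 1.0  # the initial port starts at carrying capacity
    for _ in range(round(years / dt)):
        imported = [0.0] * n
        for i in range(n):
            if pop[i] <= 0.0:
                continue
            for j in range(n):
                if i == j or rates[i][j] <= 0.0:
                    continue
                # Successful transports form a thinned Poisson process with
                # rate rates[i][j] * p_trans; at most one is counted per step.
                if rng.random() < 1.0 - math.exp(-rates[i][j] * p_trans * dt):
                    imported[j] += s * pop[i]
        pop = [logistic_step(p + a, dt, rng) for p, a in zip(pop, imported)]
    return pop


def count_invaded(pop, threshold=0.5):
    """Number of ports whose population exceeds half the carrying capacity."""
    return sum(1 for p in pop if p > threshold)
```

Running `simulate_invasion` once with the observed rates and once with the gravity-model rates, and averaging `count_invaded` over many seeds, would give curves of the kind shown in figure 7, although the absolute time scale depends strongly on the assumed noise intensity.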
[Figure 6: (a) number of journeys (logarithmic axis) against distance (km), 0 to 20 000 km; (b) predicted journeys $F_{ij}$ against observed journeys $N_{ij}$, with a scale for the number of links.]

Figure 6. (a) Histogram of port-to-port distances travelled in the GCSN (navigable distances around continents as indicated in figure 1). We overlay the predictions of two different models. The gravity model (red), based on information about distances between ports and total port calls, gives a much better fit than a simpler model (blue) which only fixes the total number of journeys (black line, observed; red line, gravity model; blue line, random traffic). (b) Count of port pairs with $N_{ij}$ observed and $F_{ij}$ predicted journeys. The flows $F_{ij}$ were calculated with the gravity model (rounded to the nearest integer). Some of the worst outliers are highlighted in blue. Circle, Antwerp to Calais ($N_{ij} = 0$ versus $F_{ij} = 200$); triangle, Hook of Holland to Europoort (16 versus 1895); diamond, Calais to Dover (4392 versus 443); square, Harwich to Hook of Holland (644 versus 0).

[Figure 7: mean number of invaded ports against time (years), 0 to 50 years, panels (a) and (b).]

Figure 7. Results from a stochastic population model for the spread of an invasive species between ports. (a) The invasion starts from one single, randomly chosen port. (b) The initial port is fixed as Bergen (Norway), an example of a well-connected port (degree $k = 49$) which is not among the central hubs. The rate of journeys from port $i$ to $j$ per year is assumed to be $N_{ij}$ (real flows from the GCSN) or $F_{ij}$ (gravity model). Each journey has a small probability of transporting a tiny fraction of the population from origin to destination. Parameters were adjusted ($r = 1$ per year, $p_{\mathrm{trans}} = 0.01$ and $s = 4 \times 10^{-5}$) to yield a per-ship-call probability of initiating invasion of approximately $4.4 \times 10^{-4}$ (Drake & Lodge 2004; see electronic supplementary material for details). Plotted are the cumulative numbers of invaded ports (population number larger than half the carrying capacity) averaged over (a) 14 000 and (b) 1000 simulation runs (standard error equal to line thickness). Black line, real traffic flows; grey line, gravity model.

Furthermore, the parameters in our model were adjusted so that the per-ship-call probability of initiating invasion is approximately $4.4 \times 10^{-4}$, a rule-of-thumb value stated by Drake & Lodge (2004). Still, too little is empirically known to pin down individual parameters with sufficient accuracy to give more than a qualitative impression. It is especially difficult to predict how a