Competition between global and local online social networks

The overwhelming success of online social networks, the key actors in the Web 2.0 cosmos, has reshaped human interactions globally. To help understand the fundamental mechanisms which determine the fate of online social networks at the system level, we describe the digital world as a complex ecosystem of interacting networks. In this paper, we study the impact of heterogeneity in network fitnesses on the competition between an international network, such as Facebook, and local services. The higher fitness of international networks is induced by their ability to attract users from all over the world, who can then establish social interactions without the limitations of local networks. In other words, inter-country social ties lead to increased fitness of the international network. To study the competition between an international network and local ones, we construct a 1:1000 scale model of the digital world, consisting of the 80 countries with the most Internet users. Under certain conditions, this competition leads to the extinction of local networks; whereas under different conditions, local networks can persist and even dominate completely. In particular, our model suggests that, with the parameters that best reproduce the empirical overtake of Facebook, there was a significant probability that this overtake would not have taken place.

Empirical observations have shown that Facebook expanded massively in the middle of the first decade of this century, starting in the US, when local networks were the most popular services in most countries. Only a few years later, Facebook had become the most popular network in most countries. So, is the fate of the digital world to become dominated by a single "big brother" as it takes over all our digital interactions? Alternatively, is digital diversity possible from a system-level perspective? In this paper, we show that due to the nonlinear character of our model, the answer to both questions can be positive or negative depending on a range of parameters and, quite surprisingly, depending on chance. As we will show, our model, despite its simplicity and the limited number of parameters, is able to describe surprisingly well the complex behavior of globally interacting online social networks.

Results
Complex organization of the digital world. The digital world consists of highly connected and strongly coupled interacting subsystems. These basic building blocks are single networks, each of which obeys specific dynamics in the absence of coupling to the whole system. So the complexity of the digital world is a consequence of both the dynamics of networks in isolated environments and the interactions between many such networks. Finally, not all of these building blocks are identical. Instead, different networks address different peer groups or have different functionalities. Hence, to reveal the fundamental mechanisms that determine the fate of the digital world, it is necessary to understand the interaction of heterogeneous networks, each driven by intrinsic dynamics.

Isolated dynamics of online social networks.
The key actors in the digital world are online social networks (OSNs), loosely defined as web-based platforms that enable digital social interactions over the Internet. However, societies were organized as networks long before OSNs were even thought of. From this point of view, the growth of OSNs can be described through the dynamical processes by which people in the traditional offline social structure come to engage in OSNs. The topology of the OSN is then the digital counterpart of the underlying offline social network 9,10.
In isolation, this process of formation can be described by a set of simple dynamical mechanisms 9. The system is initially given by an empty OSN and the underlying social structure. Individuals can be in three different states: active, passive or susceptible. While active and passive nodes exist in the online as well as the offline networks, susceptible nodes are only present in the latter. A susceptible node can join the OSN via two different mechanisms: a viral activation effect, in which a susceptible node becomes active due to the influence of an active neighbor in the offline network, and a mass media effect, which represents the spontaneous activation of a susceptible node. In addition, an active node can become passive spontaneously (deactivation) and a passive node can become active again due to the influence of an active neighbor (viral reactivation). See Fig. 1d for a visualization of these mechanisms. Notice that, in the long time limit, when the number of susceptible individuals is basically zero, these dynamics are equivalent to the susceptible-infected-susceptible model widely used in epidemiology 11. As found in 9,12, this implies that online social networks can either exhibit sustained activity (similar to the endemic phase), or they can become entirely passive (similar to the healthy phase).
The evolution of OSNs rarely takes place in isolation. Nevertheless, we found a perfect case study in the Slovakian social network "Pokec", which, due to the particularities of the country, has been growing in quasi-isolation for more than ten years. By analyzing the evolution of the topology of the social contact graph of "Pokec", we were able to rigorously validate our model. Quite remarkably, with only two parameters, the model reproduced the entire topological evolution with astonishing precision.
Viral activation and viral reactivation occur at the same rate, λ, and the ratio between this rate and the rate of mass media influence, μ, governs the topological evolution. In particular, we observed that the real system underwent a dynamical percolation transition; that is, a phase transition between a disconnected phase and a phase in which a macroscopic fraction of the system is connected. The position of this transition is controlled by λ/μ, due to the complementary roles that the viral activation and mass media effects play in the topological evolution of the network (the former tends to connect components; whereas the latter tends to create new components). Finally, without loss of generality, we set the deactivation rate δ to 1, which is equivalent to fixing the timescale of the model.
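The isolated dynamics described above can be sketched in a homogeneous mean-field form. The following Python snippet is a minimal illustration under stated assumptions, not the network-based model that was validated on "Pokec": it assumes a fully mixed population, writes the viral rate times the mean degree as `lam_k`, and uses a simple Euler scheme. The function name `simulate_isolated_osn` is hypothetical.

```python
def simulate_isolated_osn(lam_k, mu, t_max=50.0, dt=0.001):
    """Euler-integrate a mean-field sketch of the isolated OSN dynamics.

    lam_k : viral rate times mean degree (lambda * <k>); assumed notation
    mu    : mass media rate; the deactivation rate delta is fixed to 1
    Returns the (active, susceptible) fractions at time t_max.
    """
    active, susceptible = 0.0, 1.0  # empty OSN: everyone is still susceptible
    for _ in range(int(t_max / dt)):
        # Viral activation (susceptible nodes) and viral reactivation (passive
        # nodes) share the rate lambda, so together they act on all
        # non-active nodes, whose fraction is 1 - active.
        viral_gain = lam_k * active * (1.0 - active)
        media_gain = mu * susceptible           # spontaneous activation
        d_active = viral_gain + media_gain - active  # "- active": deactivation
        d_susceptible = -susceptible * (mu + lam_k * active)
        active += dt * d_active
        susceptible += dt * d_susceptible
    return active, susceptible


if __name__ == "__main__":
    print(simulate_isolated_osn(lam_k=2.0, mu=0.1))
```

In this simplified picture, once the susceptible pool is exhausted the activity is sustained only if `lam_k` exceeds 1, mirroring the endemic and healthy phases of the susceptible-infected-susceptible model.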
To sum up, in our previous work 9 , we were able to rigorously validate the dynamics ruling OSNs in isolation; the fundamental building blocks of the digital world. These findings constitute the foundation for the development of a more comprehensive theory of interacting heterogeneous networks.
Competitive interaction between multiple networks. The simultaneous existence of multiple digital services in competition for the attention of users suggests an ecological perspective from which to explain the prevalence of a given network or the coexistence of multiple networks. In ecology theory, the principle of competitive exclusion 4 states that multiple species in competition for the same resource cannot coexist, as even the slightest advantage of one species over the other is successively amplified; a mechanism referred to as "rich get richer" or preferential attachment [13][14][15][16][17][18][19][20][21] . This eventually leads to the extinction of the inferior species.
The key principle that drives the competition between OSNs is the fact that, due to the physical and cognitive limitations of users, the time they devote to online activities is limited. As a consequence, the viral parameter, λ, constitutes a conserved quantity that is nevertheless distributed between the competing networks as $\lambda_i = \omega_i(\rho^a)\,\lambda$, where $\omega_i(\rho^a)$ represents a normalized set of weights, that is $\sum_i \omega_i(\rho^a) = 1$, and $\rho^a = (\rho^a_1, \rho^a_2, \ldots)$ is a vector denoting the fraction of active nodes (activities) in the different networks. In general, users are more likely to subscribe to and engage in networks that are more active. Therefore, the viral activity of each network must be a function of the activity of the network itself. In particular, we model this by assuming that the weighting $\omega_i(\rho^a)$ is a function that increases with the activity $\rho^a_i$ of network i. In 3 we proposed the particular form:
$$\omega_i(\rho^a) = \frac{[\rho^a_i]^{\sigma}}{\sum_{j=1}^{n_l} [\rho^a_j]^{\sigma}}, \qquad (1)$$
where $n_l$ denotes the number of networks. This choice allows us to interpolate between a set of independent networks (σ ≪ 1) and highly coupled ones (σ ≫ 1). The activity affinity parameter, σ, then quantifies the tendency of users to subscribe to or engage in more active networks. Interestingly, in contrast to the principle of competitive exclusion, multiple networks can coexist because the "rich get richer" mechanism is damped by the diminishing returns of the dynamics of network evolution. For details, we refer the reader to ref. 3. In the following, we take into account the heterogeneity of networks induced by different groups of individuals that can subscribe to the different networks. These aspects are not discussed in ref. 3 and have important implications and applications, as we will show.
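To illustrate how the activity affinity σ interpolates between independent and highly coupled networks, the following sketch computes normalized weights from a list of activities. It assumes a power-law form in which each weight is proportional to the activity raised to the power σ; the function name `activity_weights` and the handling of the all-zero case are illustrative choices.

```python
def activity_weights(activities, sigma):
    """Share of the total virality assigned to each competing network.

    Assumed form: weight_i is proportional to activity_i ** sigma, and the
    weights are normalized so that they sum to 1.
    """
    powered = [rho ** sigma for rho in activities]
    total = sum(powered)
    if total == 0.0:
        # No network is active: split the virality evenly (assumption).
        return [1.0 / len(activities)] * len(activities)
    return [p / total for p in powered]


if __name__ == "__main__":
    for sigma in (0.0, 1.0, 50.0):
        print(sigma, activity_weights([0.2, 0.4], sigma))
```

For σ close to zero both networks receive the same share regardless of their activities, whereas for large σ the more active network captures almost all of the virality.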
Network heterogeneity leads to effective activity. As mentioned above, since its official launch in 2004, Facebook has become the most popular OSN in most countries; even in countries where there was already a popular OSN before Facebook was launched. To mimic the real evolution of the digital ecosystem at the worldwide scale, we assume that one local network exists in each country in addition to a globally operating, international network (see Fig. 1a). In the US, both networks are launched at the same time; whereas the international network is launched with a delay Δ t in the remaining countries, to take into account the initial prevalence of local networks. Once launched, the international network provides the user with the possibility to connect to individuals in different countries, in contrast to local networks, making it more attractive to users. For a given country, the advantage of the international network is directly related to the abundance of social ties between that country and the rest of the world. We use passenger air travel data as a proxy for the abundance of such ties. This choice is justified by the strong correlation between air travel flows and further measures of inter-country exchange, for instance email communication 22 or Twitter activity 23 .
Users in country i experience the greater attractiveness of the international network as they perceive its activity with respect to the population of their own country and also with respect to their contacts in other countries. To account for this on a coarse-grained level, in Eq. (1), we replace the activity of the international network by an effective activity as follows:
$$\rho^{a,\mathrm{eff}}_{i,\mathrm{int}} = \rho^{a}_{i,\mathrm{int}} + \alpha \sum_{j \neq i} \Omega_{ij}\, \rho^{a}_{j,\mathrm{int}}, \qquad (2)$$
where $\Omega_{ij} \propto W_{ij}/N_i$ (Eq. (3), up to a normalization constant) denotes the ratio of the number of air travel passengers between countries i and j, $W_{ij}$, to the population of country i, $N_i$. Notice that, in an ecological context, this corresponds to increased fitness of the international network. In Eq. (2), we have implicitly assumed proportionality between the number of passengers and the number of contacts in the respective countries, namely $N_{ij} \propto W_{ij}$. Finally, note that the arbitrary normalization in Eq. (3) serves the sole purpose of ensuring that reasonable values for the parameter α are of the order of unity.
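The idea of an effective activity can be sketched as follows. This is an illustration under explicit assumptions: the effective activity in country i is taken to be its own international activity plus α times a coupling-weighted sum over the other countries, the coupling is the passenger-to-population ratio, and the normalization (division by the mean row sum) is a hypothetical choice made only so that α of order one is a natural scale. The function name `effective_activity` is likewise hypothetical.

```python
def effective_activity(int_activity, passengers, population, alpha):
    """Effective activity of the international network in each country.

    int_activity : activity of the international network per country
    passengers   : matrix W[i][j] of air travel passengers between i and j
    population   : population N[i] of each country
    alpha        : global coupling parameter
    """
    n = len(int_activity)
    # Passenger-to-population ratio; no self-coupling.
    ratio = [[passengers[i][j] / population[i] if i != j else 0.0
              for j in range(n)] for i in range(n)]
    # Hypothetical normalization: divide by the mean row sum.
    mean_row_sum = sum(sum(row) for row in ratio) / n
    scale = mean_row_sum if mean_row_sum > 0 else 1.0
    omega = [[value / scale for value in row] for row in ratio]
    return [int_activity[i]
            + alpha * sum(omega[i][j] * int_activity[j] for j in range(n))
            for i in range(n)]


if __name__ == "__main__":
    # Two toy countries; the international network is active only in the first.
    print(effective_activity([0.5, 0.0], [[0, 10], [10, 0]], [100, 100], 1.0))
```

In the toy example, users in the second country perceive a nonzero effective activity purely through their ties to the first country, which is the mechanism that gives the international network its advantage.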
Hereafter, we decompose the international network into a set of disjoint coupled subnetworks operating in each country and in competition with the respective local network (see Fig. 1b). These subnetworks are nevertheless not independent, as they are globally coupled via the effective activity defined in Eq. (2) and ultimately by the network representing the inter-country social ties. Hence, our model forms a network of networks [5][6][7][8] , where each node in Fig. 1c represents a three-layer multiplex network 24,25 in which the bottom layer corresponds to the underlying social structure and the two upper layers denote the local and international networks operating in the respective country (see Fig. 1d).
Double meanfield approximation reveals complex role of the activity affinity. To understand the qualitative behavior of the system, in this section we present a double meanfield approximation. This reduces the system given by a network of networks to a set of evolution equations for the average activity in the international network and in local networks. As we show in the following section, the results of the full model with heterogeneous topologies exhibit behavior similar to that predicted by the double meanfield approximation.
The first meanfield approximation consists of assuming a fully mixed homogeneous population in each country. Let $\rho^{a}_{i,l}$ denote the fraction of active users in network $l \in \{\mathrm{loc}, \mathrm{int}\}$ in country i and $\rho^{s}_{i,l}$ the fraction of nodes susceptible to joining this network. Then, the fraction of passive users is given by $1 - \rho^{a}_{i,l} - \rho^{s}_{i,l}$. As explained above, in each country the virality is distributed between the local and international network via the weight function $\omega_l(\rho^{a}_{i,\mathrm{loc}}, \rho^{a,\mathrm{eff}}_{i,\mathrm{int}})$, where $\rho^{a,\mathrm{eff}}_{i,\mathrm{int}}$ denotes the effective activity of the international network as defined in Eq. (2). The evolution equations of the resulting system represent a generalization of the evolution equations for identical networks which we derived in 3, where one replaces the activity of the international network with the effective activity from Eq. (2). As in 3,9, we further assume the same linear relationship between virality and media influence in each country, with a common proportionality constant ν. As shown in 3, the value of ν does not affect the stability of the system. In what follows, we perform the stability analysis in the limit ν → ∞. This decouples the evolution of $\rho^{a}_{i,l}$ from $\rho^{s}_{i,l}$, so that we only have to consider $\rho^{a}_{i,l}$. Plugging in the weight function defined in Eq. (1) and the effective activity from Eq. (2) yields the evolution equations for the activities of the local and international networks in country i. The second meanfield approximation consists of applying the hypothesis of a fully mixed homogeneous network for the inter-country social ties. We use $\Omega = \alpha\langle\Omega_{ij}\rangle$ and define the mean activity of the local networks as $x \equiv \langle\rho^{a}_{i,\mathrm{loc}}\rangle$ and the mean activity of the international network as $y \equiv \langle\rho^{a}_{i,\mathrm{int}}\rangle$. Finally, our double meanfield approximation leads to a system of two coupled differential equations for x and y, Eq. (7), which has three relevant parameters: λ〈k〉, σ, and Ω.
Scientific Reports | 6:25116 | DOI: 10.1038/srep25116
Note that by setting Ω to zero, we recover the equations for identical networks presented in 3 .
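To make the qualitative behavior tangible, the sketch below integrates one plausible form of the double meanfield system. It is an assumption, not a quotation of Eq. (7): each mean activity follows SIS-like dynamics with deactivation rate 1, the virality λ〈k〉 is shared through power-law weights with exponent σ, and the international activity y enters the weights as (1 + Ω) y, so that Ω = 0 gives two identical networks. The function name `double_meanfield` and the Euler scheme are illustrative choices.

```python
def double_meanfield(x0, y0, lam_k, sigma, omega, t_max=200.0, dt=0.01):
    """Integrate an assumed double mean-field system with the Euler method.

    x : mean activity of the local networks
    y : mean activity of the international network
    Returns the pair (x, y) at time t_max.
    """
    x, y = x0, y0
    for _ in range(int(t_max / dt)):
        weight_x = x ** sigma
        weight_y = ((1.0 + omega) * y) ** sigma  # effective activity (assumed)
        total = weight_x + weight_y
        if total == 0.0:
            break  # both networks are completely inactive
        dx = x * (lam_k * (weight_x / total) * (1.0 - x) - 1.0)
        dy = y * (lam_k * (weight_y / total) * (1.0 - y) - 1.0)
        x = max(x + dt * dx, 0.0)
        y = max(y + dt * dy, 0.0)
    return x, y


if __name__ == "__main__":
    # Identical networks, low activity affinity: coexistence.
    print(double_meanfield(0.4, 0.6, lam_k=4.0, sigma=0.5, omega=0.0))
    # Coupled world, higher activity affinity: one network takes over.
    print(double_meanfield(0.3, 0.3, lam_k=4.0, sigma=2.0, omega=0.5))
```

Under these assumptions, the first call relaxes to a symmetric coexistence state, while in the second call the advantage conferred by Ω > 0 is amplified until the local networks die out.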
In what follows, we discuss the dynamical properties of the system given by Eq. (7). For constant σ, the system exhibits a saddle-node bifurcation at a critical value of the global connectivity, $\Omega_c(\sigma)$ (see Fig. 2). Above this point, coexistence is not possible and the only stable solutions correspond to the domination of either local networks or the international one. Both above and below the critical value $\Omega_c(\sigma)$, the basin of attraction of the solution corresponding to the domination of local networks decreases with Ω, whereas that of the international network increases (see the rows of Fig. 2). Furthermore, at the critical point, the basin of attraction of the international network is amplified discontinuously as the region of coexistence in the subcritical regime is now merged with the basin of attraction of the domination of the international network.
For constant Ω > 0, the system also exhibits a saddle-node bifurcation at a critical value of the activity affinity, $\sigma_c(\Omega)$. In 3 we showed that the system undergoes a subcritical pitchfork bifurcation with respect to the control parameter σ, above which no stable coexistence is possible. Ω > 0 breaks the symmetry of the pitchfork bifurcation, and in this case the system undergoes a saddle-node bifurcation with respect to σ instead (see bottom of Fig. 2). This is particularly interesting as it implies that an intermediate value of the activity affinity just slightly above the critical point, $\sigma \gtrsim \sigma_c(\Omega)$, represents the worst scenario for the survival of local networks, since at this point the size of the basin of attraction of the domination of the international network is maximal.
In Fig. 3, the blue line indicates the critical line $\sigma_c(\Omega)$ in the σ-Ω plane, which separates a phase in the parameter space where coexistence is possible (white region) and one in which only domination can occur (blue region). However, above the critical point, $\sigma > \sigma_c(\Omega)$, the basin of attraction of the domination of local networks grows with σ, which can dramatically alter the fate of the system for a given set of initial conditions. Assume, for instance, that the international network dominates in the US and starts with a significant delay in every other country, which causes the local networks to dominate in those countries. At the time when the international network is launched globally, the state of the system can be approximated by the initial conditions given in Eq. (8), which reflect the fact that local networks dominate in the fraction (1 − β) of the system and the international one dominates in the remainder. The evolution of the basins of attraction makes the system approach different stationary solutions from these initial conditions for different parameters. Below the red line in Fig. 3, the system approaches the domination of local networks starting from the initial conditions given in Eq. (8). Above this line, the system either approaches coexistence (white area; crossing dashed red line) or domination of the international network (blue area; crossing solid red line). This means that in the red region, when the international network is launched globally, it is not able to overcome the initial advantage that the local networks have due to their earlier launch.
To conclude, the double meanfield approximation predicts that intermediate values of the activity affinity most favor the international network; whereas the local networks can dominate for a high activity affinity and low global connectivity. We confirm these findings by numerical simulations in the following section.
Numerical simulations and synthetic networks. In this section, we go beyond the meanfield approximation and study, by means of numerical simulations, the effects of the real topology of inter-country social ties and of underlying social structures. To this end, we use the air travel network (see Fig. 1c and Materials and Methods) as a proxy for inter-country social ties and construct 1:1000 scaled synthetic networks to model the structure of the 80 countries with the most Internet users (see Table 1). To generate these networks, we make use of a model introduced in 26-28, which produces realistic topologies of the traditional offline social networks, including heterogeneous node degrees and a high level of clustering (see Materials and Methods). Figure 4 shows results from our model for the set of parameters that best matches empirical observations, as explained in the following section. The international network starts with a delay in all countries except the US, so that initially in these countries the respective local network dominates. After some time, the international network obtains a significant advantage and quickly takes over in most countries.
To further study the properties of the model presented here, we define the relative prevalence of the international network compared to local networks, Φ, in terms of $\rho^{a}_{i,\mathrm{int}}$ and $\rho^{a}_{i,\mathrm{loc}}$, the activities of the international and local networks in country i in the stationary state. With this definition, a value of Φ ≈ 0 implies that local networks dominate in most countries, whereas Φ ≈ 1 corresponds to the domination of the international network. The relative prevalence of the international network averaged over many realizations is shown in Fig. 5 for different values of α as a function of the activity affinity, σ, and the launch time delay, Δt. For small values of Δt, we observe that when σ is small, the international and local networks coexist and we observe values around Φ ≈ 0.5 for the relative prevalence; then, increasing σ favors the international network, which dominates for values of σ ≳ 0.5 (see Fig. 6a,b). For larger values of Δt, this behavior smoothly translates into a more complex case, which we discuss below.
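As an illustration of how such a prevalence measure behaves, the sketch below uses one plausible definition, assumed here for concreteness: the share of the international network in the total activity of each country, averaged over countries. The function name `relative_prevalence` and the treatment of countries without any activity are hypothetical.

```python
def relative_prevalence(int_activities, loc_activities):
    """Average share of the international network across countries.

    Assumed definition: in each country the share is
    rho_int / (rho_int + rho_loc); countries with no activity at all
    are skipped. Returns a value between 0 and 1.
    """
    shares = []
    for rho_int, rho_loc in zip(int_activities, loc_activities):
        total = rho_int + rho_loc
        if total > 0.0:
            shares.append(rho_int / total)
    if not shares:
        raise ValueError("no country has any active network")
    return sum(shares) / len(shares)


if __name__ == "__main__":
    print(relative_prevalence([0.7, 0.7], [0.0, 0.0]))  # international dominates
    print(relative_prevalence([0.0, 0.0], [0.7, 0.7]))  # local networks dominate
    print(relative_prevalence([0.4, 0.4], [0.4, 0.4]))  # coexistence
```

With this definition, a value near 0.5 can arise either from coexistence within every country or from a mixture of outcomes, so the average alone does not distinguish the two situations.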
We observe in Fig. 5 that for launch time delays Δt ≥ 2, the actual length of the delay becomes irrelevant. This behavior corresponds to the limit of saturation of the evolution of local networks before the international OSN is launched, as discussed in the previous section. We consider this limit by averaging over regions with Δt ≥ 2 in Fig. 6c, which yields a two-dimensional parameter space σ-α. Indeed, numerical simulations of the full model confirm the results from the meanfield analysis; in particular the complex role of the activity affinity σ. For small α and σ, local networks and the international OSN can coexist. Increasing σ or α favors the domination of the international network, which gives rise to the blue "V"-shaped region around σ = 0.5. This corroborates the saddle-node bifurcation predicted by the double meanfield approximation. See supplementary video for an explicit realization. For high values of σ and small values of α (red region in the bottom right-hand corner of Fig. 6), local networks dominate. Note that partial states are also possible, in which the international network dominates in some countries and local networks dominate in the remaining countries. See supplementary video for an explicit realization of this case.
Between the regions of domination of the international network and of local networks, there is a region in which the final fate of the system varies significantly between different realizations of the model ("coinflip region"). In this region, if the international network wins initially in the US, it will become dominant globally; otherwise, local networks maintain their initial prevalence. Although in this region the prevalence of the international network averaged over many realizations is about 0.5, as in the coexistence region in the bottom left-hand corner of Fig. 6, the behavior of the system differs dramatically from one region to the other. In the coexistence region, each realization of the model leads to the same final state: coexistence of local networks and the international OSN. In contrast, in the coinflip region, coexistence is not possible, as this region of the parameter space corresponds to the supercritical regime (the blue area in Fig. 3). In the coinflip region, about 50% of the realizations end up with domination of the international network, whereas the remaining 50% lead to the domination of local OSNs. As a consequence, even if we know the exact parameters, it is impossible to predict the fate of the system beforehand.
Figure caption (partial): The relative prevalence of the international network, computed from the activities $\rho^{a}_{\mathrm{int}}$ and $\rho^{a}_{\mathrm{loc}}$, is color coded. We consider the international network to be banned in China and Iran. To model this, we set the values of $\Omega_{ij} = 0$ for each entry which involves one of these countries. This is equivalent to assuming that in these countries two local networks compete without any coupling to the rest of the world. Maps created with Mathematica, Version 9, https://www.wolfram.com/mathematica/.
We can summarize these findings as follows. A higher value of α, which is a measure of the global connectivity of society, favors the prevalence of the international network and hinders the survival of the local ones. The role of the tendency of individuals to participate in more active networks (activity affinity), σ, is particularly interesting. Low values allow the networks to coexist, whereas intermediate values always lead to the prevalence of the international network and the extinction of local OSNs. A high activity affinity, however, enables the prevalence of local networks and thus can even lead to the extinction of the international network.
Comparison with empirical data. In this section, we compare the results of our model with empirical data on the recent expansion of Facebook at the cost of many local networks. In particular, we consider the evolution of the number of countries in which local networks (i.e. networks that are not Facebook) are the most popular ones, as measured in 29 using Alexa traffic data (see Fig. 7b). We observe a significant decline of this number, which rules out the possibility that the empirical case corresponds to the domination of local networks. Because the past can be considered a single realization of a stochastic process 30, the empirical case can still be within the coinflip region of our model where, by chance, the international network was more successful. Hence, we will perform the following comparison only for realizations of our model in which local networks do not dominate.
The intrinsic timescale of the model is arbitrary and hence has to be mapped to real time. The optimal mapping is the one that produces the best agreement with the empirical data. We quantify the agreement between model results and empirical data using the sum of the squared distances between the data points and model results. In particular, we use a $\chi^2$ statistic that depends on two mapping parameters: $a_1$, the starting year, and $a_2$, the time stretch, that is, how many years of real time correspond to one model time step. For a given set of parameters, σ, α, and Δt, the optimal values for $a_1$ and $a_2$ are those that minimize $\chi^2$, as shown in Fig. 7a.
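The fitting of the time mapping can be sketched as a simple grid search. The snippet below is an illustration under stated assumptions: model time step t is mapped linearly to the calendar year a1 + a2 * t, the model curve is linearly interpolated between steps, the statistic is an unnormalized sum of squared distances, and mappings that push a data point outside the simulated window are rejected. The function names and the numbers in the example are synthetic and hypothetical; they are not the Alexa data.

```python
def chi_squared(years, observed, model_counts, a1, a2):
    """Sum of squared distances between data and the time-mapped model curve."""
    total = 0.0
    last = len(model_counts) - 1
    for year, obs in zip(years, observed):
        step = (year - a1) / a2  # model time corresponding to this year
        if step < 0 or step > last:
            return float("inf")  # data point outside the simulated window
        low = int(step)
        high = min(low + 1, last)
        frac = step - low
        # Linear interpolation between neighboring model time steps.
        predicted = (1.0 - frac) * model_counts[low] + frac * model_counts[high]
        total += (obs - predicted) ** 2
    return total


def best_time_mapping(years, observed, model_counts, a1_grid, a2_grid):
    """Return (chi2, a1, a2) for the best mapping on the given grids."""
    candidates = ((chi_squared(years, observed, model_counts, a1, a2), a1, a2)
                  for a1 in a1_grid for a2 in a2_grid)
    return min(candidates)


if __name__ == "__main__":
    # Synthetic curve: number of countries where local networks lead.
    model_counts = [80, 72, 60, 45, 30, 18, 10, 6]
    years = [2006 + 0.5 * t for t in range(len(model_counts))]
    observed = list(model_counts)  # generated with a1 = 2006, a2 = 0.5
    print(best_time_mapping(years, observed, model_counts,
                            [2005, 2006, 2007], [0.25, 0.5, 1.0]))
```

Because the synthetic observations were generated with a known mapping, the grid search recovers it, which is a convenient sanity check before applying the same procedure to real data.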
We can also use the $\chi^2$ statistic to estimate the parameters α, σ, and Δt which best reproduce the empirical observations. In Fig. 7c, we plot the values of $\chi^2$ as a function of α, σ, and Δt, where, at each point, we applied the respective best time mapping, as described above. These results are averaged over several realizations of the model; however, in the coinflip region we exclude realizations where the local networks dominate, to mimic the empirical case. Interestingly, the overall best fit is achieved for α = 2 at σ = 1.25 and Δt = 3, which lies in the coinflip region (the optimal value of $\chi^2$ is statistically consistent with the model, given the number of degrees of freedom in the data) with a probability for domination of the international network of 70%. This scenario corresponds to the time mapping $a_1$ = 2006 and $a_2$ = 0.6, meaning the system started at the beginning of 2006, while the launch time delay of Δt = 3 in the model translates to 1.8 years in real time. In Fig. 7b, we show the evolution of the number of countries where local networks are more popular for the optimal fit from the model.
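The time mapping quoted above can be written out explicitly, assuming a linear relation between model time and calendar time (the symbols $t_{\mathrm{real}}$ and $t_{\mathrm{model}}$ are introduced here only for illustration):

```latex
t_{\mathrm{real}} = a_1 + a_2\, t_{\mathrm{model}}, \qquad
a_2\,\Delta t = 0.6 \times 3 = 1.8\ \text{years}, \qquad
a_1 + a_2\,\Delta t = 2006 + 1.8 = 2007.8 .
```

Under this assumption, the global launch of the international network in the best-fit scenario falls in late 2007.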

Discussion
Understanding the complex dynamics of the digital world constitutes an important challenge for interdisciplinary science. To meet this challenge, here we describe the worldwide web as a complex, digital ecosystem in which interacting networks play the role of species in competition for survival. In particular, we study the competition between local networks operating in single countries and an international network that operates in all countries. Therefore, a proper description of this system must necessarily involve the network of worldwide social interactions between different countries.
We show that the effect of inter-country social ties can be mapped to the increased fitness of the international network by means of an effective activity. Interestingly, there is a critical global coupling strength below which networks can coexist. However, above that threshold, only domination is possible: in general, local networks become extinct with a high probability. Yet, we find that if local networks are launched earlier they can persist and dominate the international network, which happens only if local networks have accumulated a sufficiently large active userbase when the global launch of the international network takes place. The accumulation of a sufficient base depends on the parameters and, for certain parameters, on chance. For these parameters the final state of the system, whether local networks dominate or become extinct, can be completely unpredictable, as it varies randomly between different realizations of the model.
Quite remarkably, a thorough comparison of our model with empirical data from the recent takeover of Facebook indicates that the most probable launch date of Facebook was at the beginning of 2006 and that its global launch was in late 2007. Facebook was in fact started in 2004, but opened to the public in 2006, in good agreement with the estimate from our model. Moreover, according to Google Trends data (see Supplementary Materials), 2007 was the year when the global search volume for Facebook started to increase rapidly. Last but not least, our best estimate of the model parameters corresponds to the "coinflip" region, which means that the observed takeover of Facebook only had a probability of around 70%. With a 30% probability, we would be living in a world where each country had its own successful local network and a network like Facebook would not exist 30.
Our findings suggest interesting future lines of research. On the one hand, even without adjusting the parameters on a country-by-country level, our model reproduces the main features empirically observed in the takeover of Facebook and the extinction of local networks in most countries for a certain parameter region. It remains an interesting task for future research to further increase the precision of the model. This could be done by improving the proxy for the similarity between countries or by adjusting parameters on a country-by-country basis. On the other hand, the model could be extended to account for several international networks and to study their global competition. For a second international network to overcome the first, a certain minimal difference of fitness is needed; which could be the result of different properties of the networks, such as features or functionalities. Finally, random fluctuations of fitness could be incorporated to describe Darwinian selection in the digital ecosystem.