Voter model on networks partitioned into two cliques of arbitrary sizes

The voter model is an archetypal stochastic process that represents opinion dynamics. In each update, one agent is chosen uniformly at random. The selected agent then copies the current opinion of a randomly selected neighbour. We investigate the voter model for a network with an exogenous community structure: two cliques (i.e. complete subgraphs) randomly linked by $X$ interclique edges. We show that, counterintuitively, the mean consensus time is typically not a monotonically decreasing function of $X$. Cliques of fixed proportions with opposite initial opinions reach a consensus, on average, most quickly if $X$ scales as $N^{3/2}$, where $N$ is the number of agents in the network. Hence, to accelerate a consensus between cliques, agents should connect to more members in the other clique as $N$ increases but not to the extent that cliques lose their identity as distinct communities. We support our numerical results with an equation-based analysis. By interpolating between two asymptotic heterogeneous mean-field approximations, we obtain an equation for the mean consensus time that is in excellent agreement with simulations for all values of $X$.


Introduction
Opinion formation in social networks has become an active field of research in statistical physics (for reviews, see [1][2][3]). In particular, the voter model [4,5] has become a paradigmatic model for opinion dynamics. Its rules are simple but sufficiently powerful to reproduce the summary statistics of real elections [6]. In the basic version of the voter model, the vertices on a network represent agents that can hold exactly one of two possible opinions: 'red' or 'blue'. Repeatedly, one agent is selected independently and uniformly at random. This agent then adopts the opinion of a randomly chosen adjacent agent (figure 1). As long as the network is connected and finite, this update rule guarantees that agents must eventually reach a consensus [7], defined as a state in which all agents have identical opinions.
The mean time until consensus depends on the initial distribution of opinions and the network structure. While early studies of the voter model focused on complete graphs [8] or regular lattices [9], interest has recently shifted towards networks with more complex topologies, e.g. small-world networks [10,11], graphs with right-skewed degree distributions [12,13], multiplex networks [14], or networks with a community structure [15][16][17].
A subgraph of a network is called a community if there are significantly more edges within the subgraph and fewer links to the rest of the network than those predicted by a null model that has no planted community structure (e.g. an Erdős-Rényi graph with the same total number of edges as the network under investigation) [18]. The detection of communities from network data has become a major line of research with a plethora of different algorithmic approaches [19]. Various techniques have confirmed that essentially all networks of practical relevance contain more than a single community [20][21][22][23][24].
This finding has motivated us to analyse the voter model for one of the simplest possible types of multi-community networks, namely networks with exactly two communities. Situations where agents divide into two communities are plentiful in the real world. For example, a split between communities may arise because of a language barrier (e.g. between Dutch and French speakers in Belgium [25]) or differences in race, ethnicity, age, religion, education, occupation, or gender [26]. Conversely, agents with similar attributes tend to form close-knit communities because of status homophily [27], a social phenomenon that causes the proverbial birds of a feather to flock together. Within communities, cohesion often reaches such an extent that 'we can observe in many groups a social unity within which people feel at one though their opinions still differ' (p. 229 in [28]).
Previously, it has been claimed that the voter model is insensitive to changes in the community structure [29]. This conclusion has mainly rested on results for two equally large cliques (i.e. complete subgraphs), where the mean consensus time is proportional to the total number of vertices in the network, $N$, unless the connections between cliques are extremely sparse [16,30]. Here, we argue that an investigation of only the special case of equally large cliques does not do justice to the actual complexity of the problem.

Figure 1. Illustrative example of a small two-clique network. In this example, clique 1 is a complete subgraph with seven vertices (circles), whereas clique 2 has only five vertices (squares). Each vertex represents an agent that has exactly one of two possible opinions: 'red' or 'blue'. We apply the update rules of the voter model. That is, we first choose a random focal vertex, e.g. A, in the depicted network. Then, we choose a random neighbour of the focal vertex and copy the neighbour's opinion. In our example, if the chosen neighbour is B, A changes its opinion to blue. If the chosen neighbour is C, A keeps its current (i.e. red) opinion. We distinguish between intraclique edges (thin lines) and interclique edges (thick lines). In our analysis, we vary the relative sizes of the two cliques and the number of interclique edges.
In this article, we revisit the two-clique voter model but allow unequal clique sizes. The dynamics exhibit an intriguing feature: the mean consensus time is minimal at an intermediate interclique connectivity. We investigate in detail the case where cliques with given relative sizes start from opposite opinions, representing a completely polarised society. To minimise the mean consensus time, we find that the optimal number of interclique edges, $X_{\min}$, should scale in proportion to $N^{3/2}$. This scaling law puts $X_{\min}$ between the case of a constant number of interclique links per agent ($X \propto N$) and a complete graph ($X \propto N^2$).
After specifying the details of our model in section 2, we present the results from numerical simulations in section 3. In section 4, we derive an analytical expression for the mean consensus time as a function of X for arbitrary clique sizes. Our derivation demonstrates how we can go beyond previous approximations [16,31] to obtain not only the asymptotic behaviour for either extremely sparsely or densely connected cliques, but also reliable predictions for the intermediate interclique connectivity. We conclude with a discussion of our results in section 5.

Model
We consider a simple undirected graph with $N$ vertices that can be partitioned into two cliques, as shown in figure 1. We denote the fraction of vertices in the first clique by $\alpha \in (0, 1)$. The two cliques are connected by $X$ edges randomly selected from all $\alpha(1 - \alpha)N^2$ possible pairs that can be formed by one vertex in clique 1 and another vertex in clique 2.
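Each vertex is either red or blue depending on the current opinion of the corresponding agent. The time intervals between consecutive opinion updates are independent, identically distributed exponential random numbers so that the dynamics are a continuous-time Markov chain. We choose the time unit such that every individual agent is active with a rate equal to 1.

Table 1. Transitions from the state $(\rho_1, \rho_2)$ and their rates.

New state $(y, z)$ | How is the new state reached? | Transition rate matrix element $Q[(\rho_1, \rho_2), (y, z)]$
$\left(\rho_1 + \frac{1}{\alpha N}, \rho_2\right)$ | A blue agent in clique 1 adopts the opinion of a red agent. | $\alpha N (1 - \rho_1)\, \dfrac{\alpha N \rho_1 + \frac{X}{\alpha N} \rho_2}{\alpha N - 1 + \frac{X}{\alpha N}}$
$\left(\rho_1, \rho_2 + \frac{1}{(1-\alpha) N}\right)$ | A blue agent in clique 2 adopts the opinion of a red agent. | $(1-\alpha) N (1 - \rho_2)\, \dfrac{(1-\alpha) N \rho_2 + \frac{X}{(1-\alpha) N} \rho_1}{(1-\alpha) N - 1 + \frac{X}{(1-\alpha) N}}$
$\left(\rho_1 - \frac{1}{\alpha N}, \rho_2\right)$ | A red agent in clique 1 adopts the opinion of a blue agent. | $\alpha N \rho_1\, \dfrac{\alpha N (1 - \rho_1) + \frac{X}{\alpha N} (1 - \rho_2)}{\alpha N - 1 + \frac{X}{\alpha N}}$
$\left(\rho_1, \rho_2 - \frac{1}{(1-\alpha) N}\right)$ | A red agent in clique 2 adopts the opinion of a blue agent. | $(1-\alpha) N \rho_2\, \dfrac{(1-\alpha) N (1 - \rho_2) + \frac{X}{(1-\alpha) N} (1 - \rho_1)}{(1-\alpha) N - 1 + \frac{X}{(1-\alpha) N}}$
$(\rho_1, \rho_2)$ | No change of state. | Negative sum of all rates above.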
If we wish to keep track of all individual opinions, the cardinality of the model's state space is $2^N$. The Monte Carlo algorithm behind all numerical data presented in this paper is in fact based on this exact agent-based paradigm. However, summarising and modelling the results at such a fine level of resolution is neither insightful nor practical given that the number of configurations grows exponentially with $N$. Instead, we combine all configurations whose fraction of red agents is $\rho_1$ in clique 1 and $\rho_2$ in clique 2 into the macrostate $(\rho_1, \rho_2)$ to simplify the data analysis, visualisation, and mathematical modelling. Strictly speaking, the Markov chain is not lumpable at this macroscopic level [32] because we neglect the fact that different vertices in a clique can be adjacent to a different number of vertices in the other clique. Let us denote the number of interclique edges incident on a vertex $v$ by $k_{\mathrm{inter},v}$. The probability distribution of $k_{\mathrm{inter},v}$ is approximately binomial; thus, it is so concentrated near its peak that we withhold little information if we replace the exact value $k_{\mathrm{inter},v}$ with its mean: $k_{\mathrm{inter},v} \approx X/(\alpha N)$ for all vertices $v$ in clique 1 and $k_{\mathrm{inter},v} \approx X/[(1 - \alpha)N]$ for every $v$ in clique 2. In the parlance of statistical physics, we apply a heterogeneous mean-field approximation [7,33]: we correctly account for the difference in the clique sizes and replace the exact microscopic interactions with an average over the cliques. More elaborate approximations are conceivable [16], but the mean-field approximation is remarkably accurate, as we will see shortly. On balance, we do not find the notational burden of a more detailed approximation to be worth the effort.
Applying these simplifications, we can derive the transition rate matrix $Q$. For example, if a blue agent in clique 1 becomes red, the state changes from $(\rho_1, \rho_2)$ to $\left(\rho_1 + \frac{1}{\alpha N}, \rho_2\right)$. This transition occurs with a rate that is the product of the following two factors. The first factor is the number of blue agents in clique 1, which is equal to $\alpha N (1 - \rho_1)$. The second factor is the fraction of adjacent agents whose opinion is red. Because an agent in clique 1 is connected to $\alpha N - 1$ agents in clique 1 and, on average, to $X/(\alpha N)$ agents in clique 2, we obtain the transition rate

$$Q\left[(\rho_1, \rho_2), \left(\rho_1 + \tfrac{1}{\alpha N}, \rho_2\right)\right] = \alpha N (1 - \rho_1)\, \frac{\alpha N \rho_1 + \frac{X}{\alpha N} \rho_2}{\alpha N - 1 + \frac{X}{\alpha N}}.$$

With similar arguments, we can also deduce the remaining elements of $Q$. In table 1, we list all nonzero transition rates. As is convention, we set the diagonal terms of $Q$ equal to the negative sum of all other terms in that row [34]:

$$Q[(\rho_1, \rho_2), (\rho_1, \rho_2)] = -\sum_{(y, z) \neq (\rho_1, \rho_2)} Q[(\rho_1, \rho_2), (y, z)].$$

For our simulations, we apply the exact agent-based update rules of the voter model and take the exact network topology into account, where the degrees are not the same for all vertices in a clique. For the analytical solution in section 4, however, we resort to the approximations that are implicit in $Q$.
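The mean-field rates described above can be written down compactly in code. The following is a minimal sketch rather than the authors' implementation: the function name `transition_rates` and the dictionary keys are hypothetical, and the expressions assume that a blue agent in a clique sees every red agent of its own clique plus the mean number of interclique neighbours.

```python
def transition_rates(rho1, rho2, N, alpha, X):
    """Mean-field rates out of the macrostate (rho1, rho2).

    Hypothetical helper: keys name the agent that changes its opinion.
    """
    n1 = alpha * N                 # size of clique 1
    n2 = (1 - alpha) * N           # size of clique 2
    k1 = X / n1                    # mean interclique degree in clique 1
    k2 = X / n2                    # mean interclique degree in clique 2
    deg1 = n1 - 1 + k1             # mean total degree in clique 1
    deg2 = n2 - 1 + k2             # mean total degree in clique 2
    return {
        # (number of agents that can flip) * (fraction of opposite neighbours)
        "clique1_blue_to_red": n1 * (1 - rho1) * (n1 * rho1 + k1 * rho2) / deg1,
        "clique2_blue_to_red": n2 * (1 - rho2) * (n2 * rho2 + k2 * rho1) / deg2,
        "clique1_red_to_blue": n1 * rho1 * (n1 * (1 - rho1) + k1 * (1 - rho2)) / deg1,
        "clique2_red_to_blue": n2 * rho2 * (n2 * (1 - rho2) + k2 * (1 - rho1)) / deg2,
    }


if __name__ == "__main__":
    # Polarised state of two cliques with five agents each and five interclique edges.
    print(transition_rates(1.0, 0.0, N=10, alpha=0.5, X=5))
```

All four rates vanish at the consensus states $(0, 0)$ and $(1, 1)$, which is what makes these states absorbing.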

Simulation results
To build intuition about the model, we show how the dynamics unfold during several sample runs with $N = 500$ and $\alpha = 0.8$ in figure 2. We start the cliques in a state of complete polarisation: within each clique, opinions are initially unanimous, but there is disagreement between cliques so that either $(\rho_1, \rho_2) = (1, 0)$ or $(\rho_1, \rho_2) = (0, 1)$.
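A single run of this kind can be sketched with a few lines of agent-based code. The block below is an illustration under stated assumptions, not the code used for the figures: the function names `build_two_clique_network` and `simulate_polarised` are hypothetical, clique 1 is taken to contain `round(alpha * N)` vertices, and $X \geq 1$ is assumed so that the network is connected.

```python
import random


def build_two_clique_network(N, alpha, X, rng):
    """Return (n1, adjacency lists) for two cliques joined by X random edges."""
    n1 = round(alpha * N)
    n2 = N - n1
    adj = [set() for _ in range(N)]
    for clique in (range(n1), range(n1, N)):
        for u in clique:
            adj[u].update(v for v in clique if v != u)   # intraclique edges
    # Draw X distinct interclique pairs out of the n1 * n2 possible ones.
    for p in rng.sample(range(n1 * n2), X):
        u, v = p // n2, n1 + p % n2
        adj[u].add(v)
        adj[v].add(u)
    return n1, [sorted(neighbours) for neighbours in adj]


def simulate_polarised(N, alpha, X, seed=0):
    """Run the voter model from (rho1, rho2) = (1, 0) until consensus.

    Returns (consensus time, True if the red opinion wins).
    """
    rng = random.Random(seed)
    n1, adj = build_two_clique_network(N, alpha, X, rng)
    red = [v < n1 for v in range(N)]     # clique 1 red, clique 2 blue
    n_red, t = n1, 0.0
    while 0 < n_red < N:
        t += rng.expovariate(N)          # N agents, each active at rate 1
        focal = rng.randrange(N)
        neighbour = rng.choice(adj[focal])
        if red[neighbour] != red[focal]:
            n_red += 1 if red[neighbour] else -1
            red[focal] = red[neighbour]  # focal agent copies the neighbour
    return t, n_red == N


if __name__ == "__main__":
    print(simulate_polarised(N=50, alpha=0.8, X=10, seed=1))
```

Averaging the returned times over many seeds and network realisations gives a Monte Carlo estimate of the mean consensus time for the chosen $N$, $\alpha$, and $X$.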
In figure 2(a), there are only $X = 10$ interclique edges; thus, it is difficult for an opinion to invade the clique that started from the opposite opinion. Fluctuations occur in only one clique at a time. Meanwhile, the other clique remains almost unwavering in its support of its starting opinion. As a consequence, the trajectory shown in figure 2(a) mostly remains around the edges of the two-dimensional state space $(\rho_1, \rho_2) \in [0, 1]^2$. After a protracted tug of war, confidence in the starting opinion ultimately vanishes in one of the cliques (usually the smaller one, with a probability that we will quantify in equation (7) below) so that the system reaches one of the two absorbing states $(0, 0)$ or $(1, 1)$.
By contrast, if $X = 10\,000$, the proportions of red agents $\rho_1$ and $\rho_2$ rapidly approach equality, as shown in figure 2(b). Afterwards, the dynamics in one clique almost instantaneously follow the trends in the other clique so that the trajectory is confined to the vicinity of the diagonal line $\rho_1 = \rho_2$. In this case, the cliques behave as one integrated entity despite being only loosely connected by the network topology.
The distinct behaviours of the model for small and large $X$ lead to substantially different consensus times, which are evident when comparing the limits of the colour bar legends in figures 2(a) and 2(b). For $X = O(1)$, it can take an extremely long time to reach a consensus because the cliques hardly exchange any opinions. If $X \gg 1$, the cliques communicate more frequently with each other and therefore typically agree on a final opinion sooner. However, the mean consensus time does not necessarily monotonically decrease with $X$, as we will now see.

We denote the consensus time from the completely polarised initial state $(\rho_1, \rho_2) = (1, 0)$ by $T_{\mathrm{pol}}$ and its mean by $\langle T_{\mathrm{pol}} \rangle$, which is an average over different realisations of the stochastic voter-model dynamics and over different randomly sampled networks with $X$ interclique edges. In figure 3(a), we show simulation results for $\langle T_{\mathrm{pol}} \rangle$ as a function of $X$ for $N = 1000$ and different values of $\alpha$. Because the dynamics for $\alpha$ and $1 - \alpha$ are identical if we exchange the labels of the cliques and opinions, we only plot results for $\alpha \geq \frac{1}{2}$. For all values of $\alpha$, $\langle T_{\mathrm{pol}} \rangle$ attains its maximum at $X = 1$ and initially decreases as we insert more interclique edges, consistent with the intuition that more connections lead to a faster consensus. Surprisingly, however, if $\alpha \neq \frac{1}{2}$, the trend reverses as we keep increasing $X$: $\langle T_{\mathrm{pol}} \rangle$ passes through a minimum and then increases again as the network becomes a complete graph, where $X = \alpha(1 - \alpha)N^2$. This increase is more pronounced when the difference in clique sizes is larger. For $\alpha = 0.9$, we can reduce the mean consensus time by $\approx 76\%$ if we cut $\approx 98\%$ of the interclique ties in the complete graph.
Figure 3. (c) Rescaled axes: $\xi = X/N$ along the horizontal axis and the rescaled mean consensus time $\tau \propto \langle T_{\mathrm{pol}} \rangle / N$ along the vertical axis. The dashed line represents the reciprocal relationship $\tau = 1/\xi$. (d) The number of edges $X_{\min}$ that minimises $\langle T_{\mathrm{pol}} \rangle$ follows the power law $X_{\min} \propto N^{3/2}$ predicted by (14).

In figure 3(b), we fix $\alpha = 0.9$ and vary $N$. In general, an increase in $N$ shifts the curves towards larger values of $\langle T_{\mathrm{pol}} \rangle$ and $X$. However, the curves' overall shapes remain similar. The common pattern behind the data plotted in figures 3(a) and 3(b) becomes clearer in figure 3(c), where we plot the rescaled variable $\tau \propto \langle T_{\mathrm{pol}} \rangle / N$ versus $\xi = X/N$. For $\xi \ll 1$, the rescaled functions collapse onto the same function $\tau = 1/\xi$. The scaling relation $\tau \propto 1/\xi$ was pointed out for the special case $\alpha = \frac{1}{2}$ in [16]. Our simulations and the equation-based analysis in section 4 show that $\tau$ and $1/\xi$ are not merely proportional but equal in the limit $\xi \to 0$. This result is valid for all clique sizes assuming a completely polarised initial state. For other initial conditions, we also find that $\tau \propto 1/\xi$ but with different proportionality factors, which can be calculated with the method presented in section 4.

Figure 3(d) reveals another emergent scaling relation. In this scatterplot, the abscissa is the network size $N$. The ordinate is the number of interclique edges $X_{\min}$ that minimises $\langle T_{\mathrm{pol}} \rangle$. For each combination of $N$ and $\alpha$ in figure 3(d), we perform 2000 Monte Carlo simulations. We then estimate $X_{\min}$ from the locally estimated scatterplot smoothing (LOESS) regression curves and establish error bars with bootstrapping. For a fixed value of $\alpha$, the data follow the power law $X_{\min} \propto N^{3/2}$. Thus, to minimise the mean consensus time, the agents must strike a balance between a sparse and a dense interclique connectivity. On one hand, the optimal number of interclique links per agent grows as $k_{\mathrm{inter},v} \propto \sqrt{N}$. On the other hand, the optimal number of interclique edges $X_{\min}$ is only a vanishing fraction of the number $\alpha(1 - \alpha)N^2$ of all possible interclique edges in the limit $N \to \infty$.
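The inverse dependence of the mean consensus time on $X$ for sparsely connected cliques can be motivated by a back-of-the-envelope estimate. The argument below is a heuristic sketch, not the derivation of section 4. It assumes that, for $X \ll N$, each clique is nearly unanimous most of the time, that an agent copies an opinion across an interclique edge at the mean-field rate, and that a single imported opinion takes over a clique of $n$ agents with probability $1/n$ (the fixation probability of one dissenter in the voter model on a complete graph).

```latex
\begin{align*}
r_1 &\approx \underbrace{\alpha N}_{\text{agents in clique 1}}
      \cdot \underbrace{\frac{X/(\alpha N)}{\alpha N - 1 + X/(\alpha N)}}_{\text{copy across an interclique edge}}
      \cdot \underbrace{\frac{1}{\alpha N}}_{\text{takeover probability}}
      \approx \frac{X}{\alpha^2 N^2}, \\
r_2 &\approx \frac{X}{(1-\alpha)^2 N^2}, \\
\langle T_{\mathrm{pol}} \rangle &\approx \frac{1}{r_1 + r_2}
      = \frac{\alpha^2 (1-\alpha)^2}{\alpha^2 + (1-\alpha)^2} \, \frac{N^2}{X}.
\end{align*}
```

Under these assumptions, $\langle T_{\mathrm{pol}} \rangle / N$ is inversely proportional to $\xi = X/N$ with a prefactor that depends only on $\alpha$, which is in line with the collapse of the rescaled curves for $\xi \ll 1$.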
In summary, the Monte Carlo simulations reveal three main features of the two-clique voter dynamics starting from cliques with opposite opinions. First, $\langle T_{\mathrm{pol}} \rangle$ is a U-shaped function of $X$ as long as $\alpha \neq \frac{1}{2}$. Notably, the global minimum does not coincide with a complete graph. Second, the mean consensus time obeys the identity $\tau = 1/\xi$, expressed in terms of $\langle T_{\mathrm{pol}} \rangle$ in equation (1), as long as $X \ll N$. Third, the number of interclique edges that minimises $\langle T_{\mathrm{pol}} \rangle$ satisfies the scaling relation $X_{\min} \propto N^{3/2}$. We now demonstrate how these results can be derived from the transition rates in table 1.

Starting from these transition rates, we take the continuum limit. The result is a partial differential equation for the mean consensus time $T(\rho_1, \rho_2)$, equation (2). Finding an exact solution to equation (2) would be a formidable task, but the result would not be directly useful. Instead, we aim for an approximate solution. First, we find a solution that is valid if $X = O(N)$. Afterwards, we derive an approximation for the case where $X \gg N$. Finally, we interpolate between these two approximations to arrive at a solution that fits the data remarkably well over the entire range from $X = 1$ to the complete graph with $X = \alpha(1 - \alpha)N^2$. The solid curves in figure 3 are based on this interpolation.
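For small networks, the mean consensus time implied by the mean-field rates can also be computed numerically without any continuum limit. The sketch below is an illustration, not the method of this paper: it assumes the standard first-passage relation $\sum_{(y,z)} Q[(\rho_1, \rho_2), (y, z)]\, T(y, z) = -1$ with $T = 0$ at the two consensus states, re-implements the rates in terms of red counts, and solves the linear system by Gaussian elimination. The function name `mean_time_from_polarised` is hypothetical.

```python
def mean_time_from_polarised(n1, n2, X):
    """Mean consensus time from (all red, all blue) in the mean-field chain.

    States are red counts (r1, r2); the two consensus states are excluded
    from the unknowns because their mean consensus time is zero.
    """
    k1, k2 = X / n1, X / n2                      # mean interclique degrees
    deg1, deg2 = n1 - 1 + k1, n2 - 1 + k2        # mean total degrees
    states = [(r1, r2) for r1 in range(n1 + 1) for r2 in range(n2 + 1)
              if (r1, r2) not in ((0, 0), (n1, n2))]
    index = {s: i for i, s in enumerate(states)}
    size = len(states)
    # Augmented matrix of the linear system  sum_y Q[x, y] T(y) = -1.
    A = [[0.0] * (size + 1) for _ in range(size)]
    for (r1, r2), row in index.items():
        rho1, rho2 = r1 / n1, r2 / n2
        moves = {
            (r1 + 1, r2): (n1 - r1) * (r1 + k1 * rho2) / deg1,
            (r1 - 1, r2): r1 * (n1 - r1 + k1 * (1 - rho2)) / deg1,
            (r1, r2 + 1): (n2 - r2) * (r2 + k2 * rho1) / deg2,
            (r1, r2 - 1): r2 * (n2 - r2 + k2 * (1 - rho1)) / deg2,
        }
        for target, rate in moves.items():
            A[row][row] -= rate                  # diagonal: negative row sum
            if rate > 0 and target in index:     # consensus targets have T = 0
                A[row][index[target]] += rate
        A[row][size] = -1.0
    # Gaussian elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, size):
            factor = A[r][col] / A[col][col]
            if factor:
                for c in range(col, size + 1):
                    A[r][c] -= factor * A[col][c]
    # Back substitution.
    T = [0.0] * size
    for r in reversed(range(size)):
        known = sum(A[r][c] * T[c] for c in range(r + 1, size))
        T[r] = (A[r][size] - known) / A[r][r]
    return T[index[(n1, 0)]]


if __name__ == "__main__":
    for X in (1, 10, 100):
        print(X, mean_time_from_polarised(10, 10, X))
```

Because the number of macrostates grows as the product of the clique sizes, this direct approach is only practical for small $N$, which is one reason why an approximate closed-form solution is useful.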

Approximate solution if $X = O(N)$
For a sparse interclique connectivity, we keep the leading terms of equation (2) up to and including $O(N^{-1})$, which gives equation (3). We are not aware of an exact solution to equation (3), but we assume that it can be expressed as a power series. The main features already become apparent when only expanding up to the quadratic terms. We denote this approximation by $t_{\mathrm{sparse}}$ to indicate that this expression is valid if we only have a sparse connectivity between cliques:

$$t_{\mathrm{sparse}}(\rho_1, \rho_2) = \sum_{i=0}^{2} \sum_{j=0}^{2} c_{i,j} \left(\rho_1 - \tfrac{1}{2}\right)^i \left(\rho_2 - \tfrac{1}{2}\right)^j. \qquad (4)$$

Because $T$ is symmetric with respect to $\left(\frac{1}{2}, \frac{1}{2}\right)$, we must have $c_{i,j} = 0$ if either $i$ is odd and $j$ is even or vice versa. Only five coefficients remain that can possibly be nonzero: $c_{0,0}$, $c_{0,2}$, $c_{1,1}$, $c_{2,0}$, and $c_{2,2}$. We can determine these coefficients from equation (3) and the boundary conditions. We skip the details here and instead refer to the appendix, where we derive the resulting expression for $t_{\mathrm{sparse}}$, equation (5), which has the auxiliary function of equation (6) in its denominator.

Figure 4. The approximation $t_{\mathrm{sparse}}$, given by equation (5) and shown as a dotted line, fits the data well if the cliques are sparsely connected (i.e. $X < N$) but loses accuracy if we insert more edges between cliques. Equation (9) presents the alternative approximation $t_{\mathrm{dense}}$ (dashed line), which is a much better estimate than $t_{\mathrm{sparse}}$ if $X > N$ but is worse for a sparse interclique connectivity. In (10), we define an interpolation $t_{\mathrm{interp}}$ that asymptotically behaves like $t_{\mathrm{sparse}}$ for small $X$ and $t_{\mathrm{dense}}$ for large $X$. Even for intermediate $X$, $t_{\mathrm{interp}}$ closely approximates the simulation data (solid line).
In figure 4, we compare the numerical data for $\alpha = 0.9$ and $N = 1000$ with the approximation in equation (5), shown as a dotted line. In the limit $X/N \to 0$, $t_{\mathrm{sparse}}$ is an excellent fit because the asymptotic behaviour of equation (5) is consistent with equation (1). For large $X$, however, equation (5) predicts a consensus time that is too short. To resolve this problem, we now derive an approximation that is more suitable if $X$ is large.

Approximate solution if $X \gg N$
The probability of reaching a red consensus from the initial condition $(\rho_1, \rho_2)$ is, in general, given by the martingale $m(\rho_1, \rho_2)$ that satisfies $m(0, 0) = 0$, $m(1, 1) = 1$ and [34]

$$\sum_{(y, z)} Q[(\rho_1, \rho_2), (y, z)]\, m(y, z) = 0$$

for all $(\rho_1, \rho_2) \notin \{(0, 0), (1, 1)\}$. By inserting the formulae for the elements of $Q$ from table 1, we can verify that the solution is

$$m(\rho_1, \rho_2) = \frac{[\alpha N (\alpha N - 1) + X]\, \rho_1 + [(1 - \alpha) N ((1 - \alpha) N - 1) + X]\, \rho_2}{\alpha N (\alpha N - 1) + (1 - \alpha) N ((1 - \alpha) N - 1) + 2X}. \qquad (7)$$

This result is valid regardless of whether $X$ is small or large. If the cliques are densely connected, we have seen in figure 2(b) that we can assume that $\rho_1 = \rho_2$ after a short transient. Similar adiabatic approximations have been applied, e.g. in [12,30,31,36,37]. By inserting $\rho_1 = \rho_2$ into equation (7), it follows that the fraction of red agents in each clique is equal to $m$. Thus, we can substitute $m$ for $\rho_1$ and $\rho_2$ in equation (3). Rewriting the derivatives with respect to $\rho_i$, $i \in \{1, 2\}$, in terms of $m$ and keeping only the leading-order terms, we obtain a second-order ordinary differential equation, equation (8). We call the solution to equation (8) $t_{\mathrm{dense}}$, where the subscript 'dense' expresses that the equation is derived under the assumption that $X \gg N$. The absorbing boundary condition $t_{\mathrm{dense}}(m = 0) = t_{\mathrm{dense}}(m = 1) = 0$ uniquely determines the solution, equation (9).

Figure 4 confirms that $t_{\mathrm{dense}}$ fits the data from the Monte Carlo simulations in the range $X \gg N$. In particular, $t_{\mathrm{dense}}$ correctly predicts an increasing mean consensus time for large $X$. A closer look at equation (9) reveals that $t_{\mathrm{dense}}$ increases because the minority opinion gains a slightly higher probability of winning. For a polarised initial condition for a network with $N = 1000$ and $\alpha = 0.9$, we find that $m(1, 0) \approx 0.98$ if $X = 1$, but $m(1, 0) = 0.9$ if the graph is complete (i.e. $X = 90\,000$). Hence, the blue minority increases its probability of winning from 2% to 10%. At first glance, the difference in $m$ may seem to be small, but its effect is amplified by the nearby singularity of the function $\ln(1 - m)$, which appears on the right-hand side of equation (9). As a consequence, $t_{\mathrm{dense}}$ increases by a factor of approximately 5.4 as $X$ increases from 1 to $90\,000$. We conclude that networks designed for a fast consensus must strike a compromise between two opposing trends. On one hand, frequent opinion exchanges between the cliques are necessary to quickly agree on the same opinion. On the other hand, additional interclique edges give the minority clique greater influence, causing more self-doubt within the majority clique and consequently slower convergence towards a shared opinion.
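The numbers quoted for $N = 1000$ and $\alpha = 0.9$ can be explored with a short numerical sketch. Two assumptions are made here and should be read as such. First, the red-consensus probability is taken to be the degree-weighted average of $\rho_1$ and $\rho_2$ that follows from the mean-field rates. Second, `t_dense_sketch` is a leading-order diffusion estimate reconstructed for illustration (a martingale $m$ with variance growth proportional to $m(1 - m)$), not a transcription of equation (9). Both function names are hypothetical.

```python
import math


def red_consensus_probability(rho1, rho2, N, alpha, X):
    """Degree-weighted average of the red fractions (assumed form of m)."""
    n1, n2 = alpha * N, (1 - alpha) * N
    a = n1 * (n1 - 1) + X      # total degree of clique 1 (intra + inter)
    b = n2 * (n2 - 1) + X      # total degree of clique 2 (intra + inter)
    return (a * rho1 + b * rho2) / (a + b)


def t_dense_sketch(N, alpha, X):
    """Diffusion estimate of the mean consensus time from (1, 0).

    Assumes rho1 = rho2 = m after a short transient and keeps only
    leading-order terms; the prefactor is a reconstruction.
    """
    n1, n2 = alpha * N, (1 - alpha) * N
    a = n1 * (n1 - 1) + X
    b = n2 * (n2 - 1) + X
    m = a / (a + b)            # m(1, 0)
    prefactor = (a + b) ** 2 / (a * a / n1 + b * b / n2)
    return -prefactor * (m * math.log(m) + (1 - m) * math.log(1 - m))


if __name__ == "__main__":
    N, alpha = 1000, 0.9
    for X in (1, 90_000):
        print(X, red_consensus_probability(1, 0, N, alpha, X),
              t_dense_sketch(N, alpha, X))
    print(t_dense_sketch(N, alpha, 90_000) / t_dense_sketch(N, alpha, 1))
```

The term $(1 - m) \ln(1 - m)$ dominates the change between the two values of $X$, which illustrates how a modest shift in $m$ is amplified near the singularity of $\ln(1 - m)$.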

Derivation of $X_{\min} \propto N^{3/2}$
While $t_{\mathrm{dense}}$ is an excellent approximation of the simulated data if $X \gg N$, it unfortunately underestimates the true value of $T$ in the range $X < N$. In this sense, $t_{\mathrm{dense}}$ is the opposite of $t_{\mathrm{sparse}}$ from section 4.2: we found that $t_{\mathrm{sparse}}$ approximates $T$ well for small $X$ but substantially deviates for large $X$ (figure 4). To obtain the benefits of both $t_{\mathrm{dense}}$ and $t_{\mathrm{sparse}}$ but none of their disadvantages, we construct an interpolation $t_{\mathrm{interp}}$ as follows. We first add $t_{\mathrm{dense}}$ and $t_{\mathrm{sparse}}$ and then subtract the asymptotic value of $t_{\mathrm{sparse}}$ in the limit of a dense interclique connectivity:

$$t_{\mathrm{interp}}(\rho_1, \rho_2, X) = t_{\mathrm{sparse}}(\rho_1, \rho_2, X) + t_{\mathrm{dense}}(\rho_1, \rho_2, X) - \lim_{X' \to \infty} t_{\mathrm{sparse}}(\rho_1, \rho_2, X'), \qquad (10)$$

where we have explicitly included $X$ among the independent variables. This interpolation approximates the true value of $T$ in the limit of a minimal or maximal interclique connectivity and is also an excellent approximation for all intermediate values of $X$. The solid curves in figures 3(a), 3(b), and 4 confirm that $t_{\mathrm{interp}}$ fits well for all $X$.
Equipped with an approximation of $T$, we can now determine how many edges must be inserted between polarised cliques to minimise the mean consensus time. From equation (10) and the condition $\partial t_{\mathrm{interp}} / \partial X = 0$ for the minimum, it follows that we are looking for the solution $X_{\min}$ of the equation

$$\frac{\partial t_{\mathrm{sparse}}}{\partial X} + \frac{\partial t_{\mathrm{dense}}}{\partial X} = 0. \qquad (11)$$

To simplify the calculation, let us assume that $X_{\min}$ increases between linearly and quadratically in $N$. Expressed in formal notation, we assume that $N = o(X_{\min})$ and $X_{\min} = o(N^2)$. In this case, we can expand $t_{\mathrm{sparse}}/N$ and $t_{\mathrm{dense}}/N$ as Taylor series in terms of $N/X$ and $X/N^2$, respectively. Rearranging equations (5) and (9) yields these expansions, equations (12) and (13). We now combine equations (11)-(13) and drop the higher-order terms. The result is equation (14), the power law $X_{\min} \propto N^{3/2}$ with a prefactor that depends on $\alpha$.
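The origin of the exponent $3/2$ can be seen from a schematic two-term balance. In the sketch below, $A$ and $B$ are placeholder symbols for the positive, $\alpha$-dependent leading coefficients of the two Taylor expansions, and $C$ collects the terms that do not depend on $X$. They are assumptions made for illustration and are not the coefficients of equations (12) and (13).

```latex
\begin{align*}
\frac{t_{\mathrm{interp}}}{N} &\approx C + A\,\frac{N}{X} + B\,\frac{X}{N^2}
  && \text{(sparse term decreases, dense term increases with } X\text{)} \\
0 &= \frac{\partial}{\partial X}\,\frac{t_{\mathrm{interp}}}{N}
   = -A\,\frac{N}{X^2} + \frac{B}{N^2}
  && \text{(condition for the minimum)} \\
X_{\min} &= \sqrt{\frac{A}{B}}\; N^{3/2}
  && \text{(consistent with } N = o(X_{\min}),\ X_{\min} = o(N^2)\text{)}
\end{align*}
```

Any pair of leading terms with these two dependencies on $X$ and $N$ gives the same exponent, so the value $3/2$ does not hinge on the precise coefficients.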

Discussion
In this article, we have studied the voter model for one of the simplest types of community structure: two cliques connected by a fixed number of edges. Previously, equations were only available for the special case of two equally large cliques. Even for this special case, only the asymptotic behaviour for either an extremely sparse or extremely dense interclique connectivity was known [16]. Here, we have introduced a heterogeneous mean-field approximation and an interpolation technique that allow us to treat cliques of unequal sizes. Furthermore, equation (10) makes a prediction for the mean consensus time that goes beyond a mere scaling law with an unknown proportionality constant. Instead, we can calculate concrete numbers that are in excellent agreement with Monte Carlo simulations for any number of interclique edges, $X$, including cases where the adiabatic approximation at the heart of [12,30,31,36,37] fails. In particular, equation (14) predicts the number of interclique edges, $X_{\min}$, necessary to minimise the mean consensus time. Our derivation of equation (14) reveals that, at the optimum, the smaller clique must be exposed to the majority opinion, but we must not allow the smaller clique to influence the larger clique too strongly. The result $X_{\min} \propto N^{3/2}$ exemplifies how our methodology is able to answer a sociological question with a specific and surprising quantitative prediction.
We have considered the scenario where the cliques have different sizes, consistent with empirical observations that community size distributions tend to be highly heterogeneous [38,39]. Still, real community structures are considerably more complex than our model. For example, communities in real networks are typically much sparser than cliques [38]. Moreover, real communities are not necessarily as clearly separated from each other as in our model. Instead, the boundaries between communities are often fuzzy so that vertices can often not be uniquely attributed to a single community [39]. Even if communities do not overlap, it is highly restrictive to assume that their number is exactly equal to two.
Besides assuming a stylised network topology, we have also applied a particularly simple update rule. In our model, agents can choose between only two different opinions, which must be truthfully signalled to all neighbours. A more sophisticated model may distinguish between private and publicly displayed opinions [37], thereby giving agents the opportunity to be hypocrites (i.e. they may represent an opinion in public that is contrary to their inner belief) [40]. If there are more than two possible opinions, yet more complex update rules are conceivable [41]. Further potential model variants include zealots who never change their opinions [17,42,43] or agents who query more than a single neighbour before switching opinions [44,45]. Updates may happen simultaneously instead of asynchronously [46]. The distribution of waiting times between updates may be more right-skewed than an exponential distribution [47,48]. There may even be different waiting time distributions for different agents [49]. These and many more modifications of the basic voter model have been previously studied [5]. It would be interesting to investigate how the two-clique topology influences the dynamics in these cases.
The voter model is not only relevant in the context of opinions in social networks. It can also be interpreted as a model for language evolution [50,51], where the state of an agent is a linguistic token instead of an opinion. In this context, a two-clique topology may represent a society that is split into two groups because of geography (e.g. a language island separated from the mainland). While a quick consensus may be preferable in the context of opinion formation, the extinction of language variants is a cultural loss that should be avoided or at least delayed. Because the deliberate removal of interclique edges can hardly be socially desirable, our model suggests that the best way to extend the lifetime of a language variant is to increase the size of the minority clique.
Even before the voter model appeared in the sociological and physics literature, it had been introduced in biology, albeit under different names. For example, the Moran process represents the spread of alleles in a population with a model that is, at least for the panmictic population considered in Moran's 1958 paper [8], equivalent to the voter model. Other biologists have interpreted the two-dimensional voter model as a competition for territory between species [9,52]. From a biological perspective, the voter model on a network with two communities may, at first glance, be viewed as a direct implementation of Wright's island model, where 'the total population is assumed to be divided into subgroups, each breeding at random within itself, except for a certain proportion of migrants' [53]. Still, there is a subtle but important difference between the voter model and the Moran process (also known as the invasion process [30]). When interpreted in the context of opinion formation, it makes sense to assume that the focal agent adopts the opinion of a random neighbour. In biology, by contrast, the interaction between the focal vertex and its neighbour is usually in the opposite direction: the offspring of the focal vertex spreads the parent's state to a neighbouring site. On a degree-regular network (e.g. a complete graph, as in Moran's paper [8]), both update rules lead to the same stochastic process. For heavy-tailed degree distributions, however, the two update rules are known to result in substantially different dynamics [12,30,54]. The degree distribution of a two-clique network is not heavy-tailed but bimodal with peaks at $\alpha N + X/(\alpha N)$ and $(1 - \alpha)N + X/[(1 - \alpha)N]$. Whether this topology causes a difference between the voter model and the Moran process is a question for future research. The methodology we have presented in this article opens the door to such studies of voter-like models for networks with a community structure.
and comparing the constant terms on the left- and right-hand sides of the equation, we find that

$$(1 - \alpha)\, c_{2,0} + \alpha\, c_{0,2} = -2\alpha(1 - \alpha)N. \qquad (\mathrm{A.2})$$

The blue consensus $(\rho_1, \rho_2) = (0, 0)$ is an absorbing state; therefore, we demand that $t_{\mathrm{sparse}}(0, 0) = 0$ or, equivalently,

$$16 c_{0,0} + 4 (c_{2,0} + c_{1,1} + c_{0,2}) + c_{2,2} = 0.$$
In the special case of a polarised initial condition, equation