Temporal prediction of epidemic patterns in community networks

Most previous studies of epidemic dynamics on complex networks suppose that the disease will eventually stabilize at either a disease-free state or an endemic one. In reality, however, some epidemics exhibit sporadic and recurrent behaviour in one region because of the invasion from an endemic population elsewhere. In this paper we address this issue and study a susceptible-infected-susceptible epidemiological model on a network consisting of two communities, where the disease is endemic in one community but alternates between outbreaks and extinctions in the other. We provide a detailed characterization of the temporal dynamics of epidemic patterns in the latter community. In particular, we investigate the time durations of both outbreak and extinction, and the time interval between two consecutive inter-community infections, as well as their frequency distributions. Based on the mean-field theory, we theoretically analyze these three timescales and their dependence on the average node degree of each community, the transmission parameters, and the number of inter-community links. The predictions are in good agreement with simulations, except when the probability of overlaps between successive outbreaks is too large. These findings help us better understand the bursty nature of disease spreading in a local community, and thereby suggest effective time-dependent control strategies.


Introduction
Many social, communication, and biological systems of current interest to the scientific community take the form of networks, i.e., sets of nodes (or vertices) joined together in pairs by links (or edges) [1,2], with wide practical applications ranging from searching on the Internet [3] to epidemic modelling [4]. One of the most interesting features of a network is the presence of community structure (or modularity), i.e., the division of network nodes into groups such that there is a higher density of links within groups than between them [5,6,7,8,9,10]. For instance, communities in a social network could be real social groupings, by interest or background; communities on the web could be pages on related topics.
Recent studies on network-based epidemic modelling have taken into account the effect of the community structure [11,12,13,14]. It has been reported that the community structure has a great impact on the magnitude of an outbreak peak and the final prevalence [15]. Notably, some epidemic models concerning communities have been extended to metapopulation [16,17] and interconnected (alternatively described as coupled or layered) networks [18,19], that is, networks consisting of interconnected and interdependent sub-networks or communities. These studies have revealed that in the case of two weakly coupled networks, a new stable state may emerge, in which the disease is endemic in one network but neither becomes endemic nor dies out in the other [20,21]. This observation contrasts with most previous studies, which share an implicit assumption that the disease will finally enter a steady state, either endemic or disease-free [22].
Moreover, it has been frequently observed that some real diseases, particularly the common zoonoses (diseases capable of cross-species transmission), often trigger sporadic and recurrent human infections. Zoonotic pathogens, in particular, are the major source of emerging and re-emerging infections in humans [23]. For instance, the highly pathogenic H5N1 influenza A virus remains a zoonotic infection and is endemic in avian populations, while it rarely infects humans and is currently unable to sustain human-to-human transmission [24]. Therefore, the continuing reservoir of circulating influenza among the bird population and the direct contacts between birds and humans (e.g., workers in poultry farms) promote potential re-emergence of human infection, posing the threat of an influenza pandemic [25]. In such cases, the disease neither persists nor becomes extinct forever in the human population; rather, within one region (or community) it experiences sporadic and repeated cycles of outbreak and extinction. The infection breaks out as the result of intermittent transmission of the pathogen from outside via the inter-community links and then dies out because the infection rate is lower than the epidemic threshold in the local community.
Although such a phenomenon has been observed in [20,21], those references primarily investigate the conditions necessary for the existence of such solutions. Neither the associated timescales nor the frequencies of outbreak and extinction have been satisfactorily investigated. Nevertheless, a detailed knowledge of the temporal patterns of an epidemic in a local community contributes to better preparedness in the face of potentially pandemic disease [25]. It is therefore of great importance to properly investigate how frequently such outbreaks and extinctions will happen, how long a single outbreak and extinction will last, and how long it takes for the necessary inter-community infection to occur. In this paper we address these problems by studying a susceptible-infected-susceptible (SIS) model [26] on a random network with community structure. In particular, we consider two interconnected communities with different average node degrees, each of which is an Erdős-Rényi (ER) random graph [27]. We propose analytical predictions based on the mean-field (MF) approach for these timescales and discuss their dependence on the average node degree of each community, the epidemiological parameters, and the number of inter-community links. Furthermore, we test our analytical results against extensive computational simulations and find good agreement, particularly when the probability of overlapping outbreaks is small. These findings shed new light on the bursty nature of disease spreading within a local community, which may help devise more efficient time-dependent containment policies.
The rest of this paper is organized as follows. In section 2 we describe the algorithm for generating a random network with community structure and introduce the epidemic model. Analytical predictions regarding the probability and the expected value of time spans over the system model are given in section 3, while the numerical results are illustrated in section 4. We conclude this work in section 5.

Network generation
In the present work we consider the simple case of a random network with two interconnected communities of different numbers of nodes and different densities of links. The construction can, however, be easily extended to the general case of any number of communities of any size.
The ER random graph [27] is regularly used in the study of complex networks, since networks with a complex topology and unknown organizing principles often appear random [1]. In this paper, we generate a random network consisting of two interconnected ER communities A and B of sizes $N_A$ and $N_B$, respectively. The following is the generation process we adopt:
(i) Assign each node to a single community, according to the communities' sizes.
(ii) Generate ER community A (B) with each pair of nodes in community A (B) being connected with probability $p_A$ ($p_B$), following the standard construction procedures for ER random graphs [27].
(iii) Join together by an inter-community link a randomly chosen node i with intra-community degree larger than 1, $k_i^A > 1$, in community A and a randomly selected node j with intra-community degree larger than 1, $k_j^B > 1$, in community B. Here we specify the selections of nodes with respect to degree to ensure the definition of community in the strong sense [6], that is, that each node of a community has more connections within the community than with the rest of the network. If a chosen node for inter-community connection in each community has already been connected by an inter-community link, do nothing and again randomly choose another node in the community.
(iv) Repeat step (iii) until there are L inter-community links between communities A and B.
The above process generates a random network with community structure, where each community has a Poisson distribution with regard to the intra-community node degree. The average intra-community node degrees of communities A and B are then $\langle k_A \rangle = p_A (N_A - 1)$ and $\langle k_B \rangle = p_B (N_B - 1)$, respectively. For analysis, we specifically construct a relatively dense community B and a sparse community A such that $\langle k_B \rangle$ is much larger than $\langle k_A \rangle$. According to step (iii), both of the endpoints of each inter-community link have an inter-community degree of 1, $k_{AB} = 1$. A sample random network with community structure is illustrated in figure 1, where the green community (B, triangular nodes) possesses a higher density of connections than the red one (A, circular nodes).
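Steps (i)-(iv) can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the simulations: the function name, the adjacency-dictionary representation, and the small demo sizes are assumptions made here, and sampling endpoints without replacement stands in for the "choose again" rule of step (iii).

```python
import random

def build_two_community_network(n_a, n_b, p_a, p_b, num_links, seed=None):
    """Two ER communities A and B joined by `num_links` inter-community links."""
    rng = random.Random(seed)
    nodes_a = list(range(n_a))               # step (i): nodes of community A
    nodes_b = list(range(n_a, n_a + n_b))    # step (i): nodes of community B
    adj = {v: set() for v in nodes_a + nodes_b}  # intra-community links only

    def add_er_edges(nodes, p):
        # step (ii): connect each pair inside a community with probability p
        for idx, u in enumerate(nodes):
            for v in nodes[idx + 1:]:
                if rng.random() < p:
                    adj[u].add(v)
                    adj[v].add(u)

    add_er_edges(nodes_a, p_a)
    add_er_edges(nodes_b, p_b)

    # steps (iii)-(iv): endpoints need intra-community degree > 1, and each
    # node carries at most one inter-community link (sampling w/o replacement)
    cand_a = [v for v in nodes_a if len(adj[v]) > 1]
    cand_b = [v for v in nodes_b if len(adj[v]) > 1]
    if min(len(cand_a), len(cand_b)) < num_links:
        raise ValueError("not enough eligible nodes for the requested links")
    inter_links = list(zip(rng.sample(cand_a, num_links),
                           rng.sample(cand_b, num_links)))
    return adj, nodes_a, nodes_b, inter_links

# Small hypothetical example: sparse A (mean degree about 6), dense B (about 20).
adj, nodes_a, nodes_b, inter_links = build_two_community_network(
    n_a=100, n_b=200, p_a=0.06, p_b=0.1, num_links=4, seed=1)
print(inter_links)
```

Keeping the inter-community links in a separate list makes the intra-community degree of each node directly readable from `adj`, which is what the degree condition in step (iii) refers to.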

Epidemic model
We consider the SIS epidemiological model taking place on the random network generated above, where nodes mimic individuals or hosts and links are the potentially infectious contacts among them. In this model, each node is either susceptible (S) or infected (I). Initially, there is a small fraction ε of infected nodes only in the dense community B while all the remaining nodes in the entire network are susceptible. Then at every time step, each susceptible node is infected at a transmission rate λ upon a contact with an infected node. Meanwhile, each infected node recovers and becomes susceptible again at a recovery rate γ. For simplicity, we do not differentiate the transmission parameters λ and γ for different communities. That is, we assume that for each inter- or intra-community link connecting an infected node with a susceptible node in the entire network, the transmission rate λ is identical, and for each infected node in the entire network the recovery rate γ is the same. However, this can be easily extended to the general case that assumes different transmission rates for different communities, as in [18,21].
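One time step of this model can be sketched as follows. The sketch assumes a synchronous update in which λ and γ are treated as per-step probabilities and nodes infected during a step cannot transmit until the next step; these update details, the function name `sis_step`, and the toy path graph are assumptions made for illustration.

```python
import random

def sis_step(adj, infected, lam, gamma, rng):
    """One synchronous SIS update; returns the new set of infected nodes."""
    new_infected = set()
    for node in infected:
        # an infected node stays infected unless it recovers (probability gamma)
        if rng.random() >= gamma:
            new_infected.add(node)
        # each contact with a currently susceptible neighbour transmits
        # independently with probability lam
        for nb in adj[node]:
            if nb not in infected and rng.random() < lam:
                new_infected.add(nb)
    return new_infected

# Toy example on a path 0 - 1 - 2 (hypothetical graph, not the paper's network).
adj = {0: {1}, 1: {0, 2}, 2: {1}}
rng = random.Random(0)
state = {0}
for _ in range(5):
    state = sis_step(adj, state, lam=0.3, gamma=0.2, rng=rng)
print(sorted(state))
```

Here `adj` is meant to contain both intra- and inter-community links, so that the same λ applies to every infected-susceptible contact, as assumed in the text.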
Of great importance in epidemic modelling is the basic reproductive number $R_0$, which denotes the average number of infections introduced by a single infected individual in a completely susceptible population [22]. This number characterizes the threshold behaviour of a disease in the sense that it spreads across a nonzero fraction of the population for $R_0 > 1$ while it dies out for $R_0 < 1$. Note that it is easier for the disease to spread through the denser community B than the sparser community A, since B has a larger basic reproductive number than A, i.e., $R_0^B = \frac{\lambda}{\gamma} \langle k_B \rangle > R_0^A = \frac{\lambda}{\gamma} \langle k_A \rangle$ [22]. The arguments for this expression of $R_0$ to hold in ER random networks are twofold. First, the ER random graph is a typical example of homogeneous networks characterized by a Poisson distribution $P(k) = e^{-\langle k \rangle} \langle k \rangle^k / k!$, where most node degrees are close to the average degree, $k \simeq \langle k \rangle$. Therefore each node can be assumed to have an identical number $\langle k \rangle$ of neighbours. In accordance with the homogeneous assumption [4], one gets $R_0 = \frac{\lambda}{\gamma} \langle k \rangle$. Second, in a completely susceptible pool, the expected number of susceptible neighbours that a newly infected node has is $\langle k^2 \rangle / \langle k \rangle - 1 = \langle k \rangle$ for a Poisson degree distribution [20]; thus, according to the definition, $R_0 = \frac{\lambda}{\gamma} \langle k \rangle$. Here, the basic reproductive number of each community is calculated based on the respective average intra-community node degree, regardless of the inter-community links, since this is the critical value for the disease to spread within the single community. In addition, if $R_0^B > 1$, $R_0^A < 1$, and $R_0^{Net} > 1$, the disease persists in community B while community A experiences alternating outbreaks and annihilations. As shown in figure 2, the number of infected nodes in the sparse community fluctuates between zero and positive values with time. This is because the disease is able to transmit from community B (where the epidemic persists) to community A through the inter-community links, which play a pivotal role in introducing new infections to the otherwise disease-free community.
For analysis in the rest of this paper we choose parameter values such that $R_0^B > 1$, $R_0^A < 1$, and $R_0^{Net} > 1$. Notably, an analogous phenomenon has also been observed recently in the study of epidemic spreading in interconnected or coupled networks, with the emergence of a new stable state in which the disease is endemic in one network but neither persists nor dies out in the others [20,21]. Rather than focusing on the conditions that permit this mixed phase [20,21], in the present work we probe further the timescales (temporal patterns) of epidemic dynamics in the sparse community. In particular, we examine the time durations of outbreaks and extinctions, and the time interval between two successive inter-community infections, as well as their frequency (probability) distributions. This issue has seldom (if ever) been explored in the literature. A direct answer, however, can help us better understand how often infections and extinctions will happen and how long a single outbreak and a single extinction will last in a local community, and thus suggest more effective time-dependent preventive measures.
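As a quick numerical illustration of the community-level thresholds, the sketch below evaluates $R_0 = \lambda \langle k \rangle / \gamma$ for each community and checks the standard Poisson identity that the mean excess degree $\langle k^2 \rangle / \langle k \rangle - 1$ equals $\langle k \rangle$. All parameter values are hypothetical, chosen only so that B lies above and A below the threshold; only the two community-level numbers are computed.

```python
import math

def basic_reproductive_number(lam, gamma, mean_degree):
    """R0 = (lam / gamma) * <k> under the homogeneous assumption."""
    return lam * mean_degree / gamma

def poisson_excess_degree(mean_degree, k_max=200):
    """<k^2>/<k> - 1 for a Poisson degree distribution, truncated at k_max."""
    pmf = [math.exp(-mean_degree)]
    for k in range(1, k_max + 1):
        pmf.append(pmf[-1] * mean_degree / k)   # P(k) = P(k-1) * <k> / k
    m1 = sum(k * p for k, p in enumerate(pmf))
    m2 = sum(k * k * p for k, p in enumerate(pmf))
    return m2 / m1 - 1.0

# Hypothetical values for illustration only.
lam, gamma = 0.005, 0.05
k_a, k_b = 6.0, 18.0
r0_a = basic_reproductive_number(lam, gamma, k_a)
r0_b = basic_reproductive_number(lam, gamma, k_b)
print(f"R0_A = {r0_a:.2f}, R0_B = {r0_b:.2f}")
print(f"excess degree for <k> = {k_b}: {poisson_excess_degree(k_b):.6f}")
```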

Mean-field Analysis
Denoting by $x_B(t)$ and $y_B(t)$ the fractions of susceptible and infected nodes in the dense community B at time t, respectively, one has the normalization condition $x_B(t) + y_B(t) = 1$. Since the interconnections between the weakly coupled networks only affect disease spreading in the sparse network [20], by disregarding the infections along the inter-community links we can get the time evolution of the fraction of infected nodes following the MF approach [26]:
$$\frac{d y_B(t)}{dt} = \lambda \langle k_B \rangle x_B(t) y_B(t) - \gamma y_B(t), \tag{1}$$
where the first term considers the infection of susceptible nodes due to intra-community links, which is proportional to the transmission rate λ, the average number $\langle k_B \rangle$ of intra-community neighbours per node in community B, the density $x_B(t)$ of susceptible nodes, and the probability $y_B(t)$ that a randomly chosen intra-community neighbour is infected, while the second term describes the recovery process of infected nodes, which is proportional to the recovery rate γ and the average density $y_B(t)$ of infected nodes. With the assumption $R_0^B > 1$ and the stationary condition $\frac{d}{dt} y_B(t) = 0$, it is straightforward to get the nonzero solution
$$y_B = 1 - \frac{\gamma}{\lambda \langle k_B \rangle} \tag{2}$$
in the steady state, which is the epidemic prevalence in community B. In fact, the nontrivial analytical solution to (1) can be written as
$$y_B(t) = \frac{\lambda \langle k_B \rangle - \gamma}{\lambda \langle k_B \rangle + C_1 e^{-(\lambda \langle k_B \rangle - \gamma) t}}, \tag{3}$$
where $C_1$ is a constant depending on the initial condition $y_B(0) = \varepsilon$. As $t \to \infty$, (3) again gives rise to the stable fixed point as in (2). The epidemic prevalence $y_B$ can also be translated into the probability that a randomly selected node in community B becomes infected. Let q be the probability of disease spreading through the inter-community links from community B into community A in one time step. Consider that there are m infected nodes among the L inter-community nodes in community B at a given time, with a probability $\binom{L}{m} y_B^m (1 - y_B)^{L-m}$, and at least one inter-community node in community A gets infected from the m inter-community links with a probability $[1 - (1 - \lambda)^m]$.
Therefore, taking into account all possible numbers (m = 1, 2, . . . , L) yields
$$q = \sum_{m=1}^{L} \binom{L}{m} y_B^m (1 - y_B)^{L-m} \left[ 1 - (1 - \lambda)^m \right]. \tag{4}$$
If one defines T as the time interval between two consecutive inter-community infections, then there is no inter-community infection for a duration of (T − 1) time steps until the T-th time step. As a result, the probability of the time span between two successive inter-community infections being T time steps is
$$P(T) = (1 - q)^{T-1} q, \tag{5}$$
and the average time interval between two consecutive inter-community infections is
$$\langle T \rangle = \sum_{T=1}^{\infty} T P(T). \tag{6}$$
In fact, (4) approximates to
$$q \approx \lambda L y_B \tag{7}$$
if one neglects all higher order terms with regard to λ in the expression $[1 - (1 - \lambda)^m]$. Furthermore, taking T as continuous in (6) yields
$$\langle T \rangle = \int_1^{\infty} T q (1 - q)^{T-1} \, dT = \frac{q \left[ 1 - \ln(1 - q) \right]}{\left[ \ln(1 - q) \right]^2}, \tag{8}$$
which can be simplified to $\langle T \rangle \approx 1/q$ for $q \ll 1$. This approximation is feasible since the value of λ is set to be less than 0.01 and the value of λL is set to be up to 0.04 for all simulations in this context. Therefore the average time interval between two successive outbreaks approximately scales as
$$\langle T \rangle \sim \frac{1}{\lambda L y_B} = \frac{\langle k_B \rangle}{L (\lambda \langle k_B \rangle - \gamma)}. \tag{9}$$
As shown in figure 2, community A is free of disease from the beginning until the epidemic from community B transmits into this sparse group via the inter-community links, causing an epidemic outbreak. However, the infection can only last for a period of time, $T_i$, which is defined as the time duration (or span) of the outbreak, since the basic reproductive number is too low ($R_0^A < 1$) for the disease to persist in community A. This means that for a period of $T_i$ time steps, the disease exists in community A. After that, community A is disease-free for a period of time, $T_s$, which is defined as the time duration (or span) of the extinction (or health), with all the nodes in the community being susceptible. This disease-free time ends as the next inter-community infection succeeds. In detail, let us assume that the disease initially comes into community A after $T^{(0)}$ time steps, and after that denote the series of time intervals between consecutive inter-community infections by $T^{(1)}, T^{(2)}, \ldots$, in sequence.
Accordingly, we denote the series of time durations of successive outbreaks by $T_i^{(1)}, T_i^{(2)}, \ldots$, and of successive extinctions by $T_s^{(1)}, T_s^{(2)}, \ldots$; see figure 3 for an illustration. It is complicated to derive theoretically the time spans $T_i$ and $T_s$ due to the potential effects of inter-community infections and the overlaps between successive epidemic outbreaks, which occur when a new inter-community infection emerges before the old disease dies out. However, there is a definite relationship between the values of these three time spans. As seen in figure 3, if two successive outbreaks do not overlap, then the time interval between these two infections is equal to the sum of the time duration of the former outbreak and the disease-free period before the latter outbreak. Mathematically, that is
$$T^{(j)} = T_i^{(j)} + T_s^{(j)}. \tag{10}$$
For example, from figure 3 one can obtain $T^{(1)} = T_i^{(1)} + T_s^{(1)}$, but cannot get a time duration of health for an interval in which successive outbreaks overlap, i.e., when $T^{(j)} < T_i^{(j)}$. For analysis in theory, we specifically define a virtual (negative) time span of health as $T_s^{(j)} = T^{(j)} - T_i^{(j)} < 0$ associated with the corresponding overlap. Such a treatment allows us to use the relationship given in (10) in the presence of overlaps between consecutive outbreaks.
[Figure 3: schematic of the number of infecteds versus time t, showing the time spans $T$, $T_i$, and $T_s$, including the case of overlaps between two successive epidemic outbreaks. Note that the curves on top of the coloured areas do not represent the real number of infected individuals; they are just for illustration.]
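The per-step inter-community infection probability q and the resulting mean interval can be evaluated numerically. The sketch below sums the binomial expression (4) directly, compares it with the closed form $1 - (1 - \lambda y_B)^L$ that follows from the binomial theorem, and with the first-order approximation $\lambda L y_B$ of (7). The steady-state prevalence is taken as the nonzero mean-field fixed point $y_B = 1 - \gamma / (\lambda \langle k_B \rangle)$, and all parameter values are hypothetical.

```python
import math

def steady_state_prevalence(lam, gamma, k_b):
    """Nonzero fixed point of the mean-field SIS equation in community B."""
    return 1.0 - gamma / (lam * k_b)

def q_exact(lam, y_b, num_links):
    """Sum over m = 1..L infected link endpoints in B, as in (4)."""
    total = 0.0
    for m in range(1, num_links + 1):
        p_m = math.comb(num_links, m) * y_b**m * (1.0 - y_b)**(num_links - m)
        total += p_m * (1.0 - (1.0 - lam)**m)
    return total

# Hypothetical values for illustration only.
lam, gamma, k_b, num_links = 0.005, 0.05, 18.0, 4
y_b = steady_state_prevalence(lam, gamma, k_b)
q = q_exact(lam, y_b, num_links)
q_closed = 1.0 - (1.0 - lam * y_b)**num_links   # binomial theorem applied to (4)
q_linear = lam * num_links * y_b                # first order in lam, as in (7)
mean_interval = 1.0 / q                         # mean of the geometric law (5)

print(f"y_B = {y_b:.4f}")
print(f"q exact = {q:.6f}, closed form = {q_closed:.6f}, linear = {q_linear:.6f}")
print(f"<T> = {mean_interval:.1f} time steps")
```

The closed form makes the dependence on L explicit and shows why the linear approximation works well whenever $\lambda L y_B$ is small.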
Once a node in community A gets infected by an inter-community infection, it will trigger a temporal dynamical process in its local community before a subsequent inter-community infection occurs. Therefore, by omitting the effects of inter-community transmissions, one can write the evolution equation for $y_A(t)$, the density of infected nodes in community A, similar to (1), as
$$\frac{d y_A(t)}{dt} = -(\gamma - \lambda \langle k_A \rangle) y_A(t) - \lambda \langle k_A \rangle y_A^2(t). \tag{11}$$
Here, the second-order term arises from the normalization condition, where $x_A(t) = 1 - y_A(t)$ has been eliminated from the equation. One can also work out the nontrivial solution to (11), which resembles (3). Nonetheless, this cannot be used to evaluate the time duration of an outbreak because the solution contains a constant closely related to the initial fraction of infected individuals $y_A(0)$, which is stochastic. Since $\frac{d}{dt} y_A(t) < 0$ for $R_0^A < 1$, the density of infected individuals decreases with time and can hardly reach a relatively high level. Therefore, we neglect the second-order term with regard to $y_A(t)$ in (11) and obtain an approximate solution
$$y_A(t) \approx y_A(0) e^{-(\gamma - \lambda \langle k_A \rangle) t}. \tag{12}$$
We estimate the time period of the outbreak by applying the concept of half-life ($t_{1/2}$) [28], generally defined as the time required for a quantity to diminish to half its value as measured at the beginning of the time period, which is typically used to describe a quantity that follows an exponential decay. Setting $y_A(t)/y_A(0) = 1/2$, one gets the half-life
$$t_{1/2} = \frac{\ln 2}{\gamma - \lambda \langle k_A \rangle} \tag{13}$$
for the density of infected individuals. After n half-lives, the density drops from $y_A(0)$ to $y_A(0)/2^n$.
In the present work we use the time required for the number of infected nodes $I_t^A = y_A(t) N_A$ to drop from the initial value $I_0^A = y_A(0) N_A$ to $\frac{1}{2}$ (where the disease is close to extinction) to estimate the time duration $T_i$ of the outbreak. [Strictly, this accumulation of half-lives is not exactly the time span $T_i$ for the entire outbreak; this approximation, however, matches the qualitative behaviour of $T_i$ and can provide a similar scaling shape (if not the size), as shown in section 4.] Thus, letting $I_0^A / 2^n = \frac{1}{2}$ gives rise to $n = 1 + \log_2 I_0^A$. Consequently, the time duration $T_i$ is dependent on the initial number of infected nodes in community A in such a way that $T_i(I_0^A) = n t_{1/2} = (1 + \log_2 I_0^A) t_{1/2}$. Since there are a number L of inter-community links between the two communities, all possible numbers of infected nodes at the beginning of the outbreak period are $I_0^A = 1, 2, \ldots, L$, as these $I_0^A$ inter-community links pass the disease simultaneously with the probability
$$P(I_0^A) = \binom{L}{I_0^A} (\lambda y_B)^{I_0^A} (1 - \lambda y_B)^{L - I_0^A}. \tag{14}$$
By definition it is clear that $\sum_{I_0^A = 1}^{L} P(I_0^A) = q$, whereby we normalize $P(I_0^A)$ to obtain the average time duration of an epidemic outbreak as follows:
$$\langle T_i \rangle = \frac{t_{1/2}}{q} \sum_{I_0^A = 1}^{L} (1 + \log_2 I_0^A) P(I_0^A). \tag{15}$$
For simplicity in theory, we estimate the average timescale of a single disease-free phase as
$$\langle T_s \rangle = \langle T \rangle - \langle T_i \rangle \tag{16}$$
using the expression in (10), regardless of whether there are overlaps between successive outbreaks. Now, we analyze the frequency distributions $P(T)$ and $P(T_i)$ of the timescales $T$ and $T_i$, respectively. In accordance with (5), one has
$$P(T) \approx q e^{-qT} \tag{17}$$
for a negligibly small q, since $\ln(1 - q) \approx -q$. The approximate solution (12) implies that an epidemic outbreak's probability of still existing at a time t after its first appearance is $\exp[-(\gamma - \lambda \langle k_A \rangle) t]$. Considering the normalization condition
$$\int_0^{\infty} P(T_i) \, dT_i = 1, \tag{18}$$
we arrive at an exponential probability distribution
$$P(T_i) = (\gamma - \lambda \langle k_A \rangle) e^{-(\gamma - \lambda \langle k_A \rangle) T_i}. \tag{19}$$
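The half-life estimate of the outbreak duration and the resulting disease-free time can be put together in a short numerical sketch. It assumes that each of the L links transmits independently with probability $\lambda y_B$ per step, so that the initial number of infected nodes is binomial; this form is an assumption of the sketch, chosen because its probabilities for $I_0^A \geq 1$ sum to q as the text requires. Parameter values are hypothetical.

```python
import math

def half_life(lam, gamma, k_a):
    """Half-life of the exponential decay with rate (gamma - lam * <k_A>)."""
    return math.log(2.0) / (gamma - lam * k_a)

def mean_outbreak_duration(lam, gamma, k_a, y_b, num_links):
    """Average of (1 + log2 I0) half-lives over the initial count I0 = 1..L."""
    p = lam * y_b                                # per-link transmission probability
    q = 1.0 - (1.0 - p)**num_links               # probability that I0 >= 1
    weighted = 0.0
    for i0 in range(1, num_links + 1):
        prob = math.comb(num_links, i0) * p**i0 * (1.0 - p)**(num_links - i0)
        weighted += (1.0 + math.log2(i0)) * prob
    return half_life(lam, gamma, k_a) * weighted / q, q

# Hypothetical values for illustration only (R0_A < 1 < R0_B).
lam, gamma, k_a, k_b, num_links = 0.005, 0.05, 6.0, 18.0, 4
y_b = 1.0 - gamma / (lam * k_b)
t_half = half_life(lam, gamma, k_a)
mean_ti, q = mean_outbreak_duration(lam, gamma, k_a, y_b, num_links)
mean_t = 1.0 / q
mean_ts = mean_t - mean_ti      # disease-free time from <T_s> = <T> - <T_i>

print(f"t_1/2 = {t_half:.2f}, <T_i> = {mean_ti:.2f}")
print(f"<T> = {mean_t:.2f}, <T_s> = {mean_ts:.2f}")
```

With few links and a small λ, almost every outbreak starts from a single infected node, so the estimate of the mean outbreak duration stays close to one half-life.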

Average timescales
We test the theoretical predictions of the previous section with computational simulations of the epidemiological model over a random network with community structure. We consider two ER communities A and B with $N_A = 10^3$ and $N_B = 2 \times 10^3$ nodes, respectively. We start the simulation with no infection in community A and a fraction ε = 0.5% of infected nodes in community B, and then let the epidemic spreading go through $10^5$ time steps for each realization. By recording the starting time of each epidemic outbreak in community A arising from the inter-community infections, we measure the time interval T between two consecutive outbreaks. We compute the time period during which at least one infected node exists in community A as the time duration $T_i$ of an epidemic outbreak, and take the time period during which there is no disease in this community as the disease-free time duration $T_s$ of an extinction of disease. Note that the time periods $T_i$ and $T_s$ calculated in simulations are different from their theoretical estimates whenever there are overlaps between epidemic outbreaks. See figure 3 for an example: in the case of an overlap between the third and the fourth outbreaks, we numerically record one single epidemic outbreak with a time period $[T^{(3)} + T_i^{(4)}]$ rather than two individual outbreaks with time periods $T_i^{(3)}$ and $T_i^{(4)}$. To study the dependence of the average timescales on the network structure and the transmission parameters, we calculate $\langle T \rangle$, $\langle T_i \rangle$, and $\langle T_s \rangle$ by varying each of the parameters L, $\langle k_B \rangle$, $\langle k_A \rangle$, λ, and γ [...] for the disease to spread from community B into community A, which suggests a shorter time period between two successive epidemic outbreaks. On the contrary, the property $\langle k_A \rangle$ inherent in the sparse community A makes no contribution to the inter-community infections, which originate from the dense community B. Such behaviour of $\langle T \rangle$ with respect to each of these parameters, including the linearity shown in the log-log plot in the inset of figure 4(a), confirms the MF analysis (9) in section 3.
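The measurement of outbreak and disease-free periods described above amounts to splitting the time series of infected counts in community A into maximal runs of positive and zero values. A minimal sketch, assuming the series is available as a list of integers (the function name and the toy series are hypothetical):

```python
from itertools import groupby

def outbreak_and_extinction_spans(infected_counts):
    """Lengths of maximal runs with count > 0 (outbreaks) and == 0 (extinctions)."""
    outbreaks, extinctions = [], []
    for positive, run in groupby(infected_counts, key=lambda c: c > 0):
        length = sum(1 for _ in run)
        (outbreaks if positive else extinctions).append(length)
    return outbreaks, extinctions

# Toy series of infected counts in community A, one entry per time step.
series = [0, 0, 1, 2, 0, 0, 0, 3, 1, 1, 0]
t_i_samples, t_s_samples = outbreak_and_extinction_spans(series)
print(t_i_samples, t_s_samples)
```

Overlapping outbreaks are merged automatically into one run, which matches the way a single longer outbreak is recorded in the simulations. The first and last runs are truncated by the observation window, so one may wish to drop them before averaging.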
All the theoretical results by the MF approach are smaller than the simulations, since the MF approximation in (1) has ignored the higher-order terms of $y_B$, which leads to an overestimate of $y_B$ and thus causes $\langle T \rangle$ to be underestimated according to (7) and (9). Figure 5 shows the mean time period $\langle T_i \rangle$ of an epidemic outbreak in community A versus the parameters L, $\langle k_B \rangle$, $\langle k_A \rangle$, λ, and γ, respectively. The theoretical prediction of $\langle T_i \rangle$ remains almost unchanged with the increase of L, whereas the simulation result shows a monotonic increase. The reason for this is that the theoretical prediction is based on the half-life calculation (13), which only measures part of the real period of the outbreak and also neglects the effects of inter-community infections during the period of an epidemic outbreak in the sparse community. The more links there are between the two communities, the easier it is for inter-community transmission to happen. Thus, one would expect a larger probability of overlaps between successive outbreaks [see figure 7(a)], which makes the numerical count of $T_i$ more likely to exceed the theoretical prediction. Therefore, it is hard to predict the average time period of an epidemic outbreak if there are a large number of inter-community links. A similar explanation can also be made for the difference between the analysis and simulation in figure 5(b), where the simulation results grow slightly with the increase of $\langle k_B \rangle$, whereas the analytical results are almost unchanged. The higher density of connections inside community B leads to a larger fraction of infected nodes in this community in the steady state, which suggests a greater probability of overlaps between epidemic outbreaks [see figure 7(b)]. However, the inter-community transmissions are heavily restricted since the number of inter-community links is set to L = 4, which can explain why the simulation results of $\langle T_i \rangle$ increase slowly compared to figure 5(a).
Moreover, we find from figure 5(c) and figure 5(d) that both the analytical predictions and the simulation results of $\langle T_i \rangle$ rise with the increase of $\langle k_A \rangle$ and λ, respectively. For smaller values of $\langle k_A \rangle$ and λ, the predictions are smaller than the simulations. This is due to the increase of $\langle k_A \rangle$ and λ promoting the disease spread and thus enhancing the possibility of overlapping outbreaks, as seen in figure 7(c) and figure 7(d). However, after the values of $\langle k_A \rangle$ and λ increase to a certain point, we see the reverse case, i.e., the theoretical results are larger than the numerical results. This stems from the simplification of (11) by discarding the second-order term in $y_A$. This approximation generally produces relatively tiny errors if the second-order term is far smaller than the first-order term with respect to $y_A$. On the other hand, when the values of $\langle k_A \rangle$ and λ become large enough, $(\gamma - \lambda \langle k_A \rangle)$ approaches zero and hence the first term becomes comparable to the second term in (11) even for a very low level of $y_A$. In this case, neglecting the higher-order term in $y_A$ will considerably underestimate the decaying speed of $y_A$ and hence greatly overestimate the outbreak duration. In addition, as shown in figure 5(e), both analytically and numerically the average time period $\langle T_i \rangle$ decreases with the recovery rate γ. With a higher recovery rate, the infected nodes will recover faster before they are able to spread the disease to susceptible nodes; therefore the epidemic outbreak will last for a shorter time. The theoretical prediction is smaller than the simulation result because the half-life approximation only captures part of the real outbreak period. [...] the theoretical estimates of $\langle T_s \rangle$ given by (16) are smaller than the simulation results.
This is to be expected since (16) covers both the positive part $\sum_{T > T_i} (T - T_i) P(T - T_i)$ and the negative part $\sum_{T < T_i} (T - T_i) P(T - T_i)$, whereas the simulation counting considers only the positive part by ignoring those situations which include an overlap between successive epidemic outbreaks. The negative part accounts for a larger proportion if the probability of overlapping outbreaks is larger, which will greatly reduce the accuracy of the theoretical estimates of $\langle T_s \rangle$.

Relative errors of T s
Based on the above discussion, the deviation between the theoretical and simulation results of $\langle T_s \rangle$ decreases with the probability $P(T > T_i)$ that the time interval between two successive outbreaks is larger than the time duration of the former outbreak (with $1 - P(T > T_i)$ being the probability of overlapping outbreaks). To confirm this, we plot in figure 7 the relative error $\Delta T_s$ of the mean time duration of disease extinction $\langle T_s \rangle$, which is defined as
$$\Delta T_s = \frac{\left| \langle T_s \rangle_{\mathrm{theory}} - \langle T_s \rangle_{\mathrm{sim}} \right|}{\langle T_s \rangle_{\mathrm{sim}}}$$
and can help examine the accuracy of the theoretical prediction of $\langle T_s \rangle$. A smaller $\Delta T_s$ denotes a more accurate prediction of $\langle T_s \rangle$. As demonstrated in figure 7, the accuracy of the theoretical estimates of $\langle T_s \rangle$ decreases with the increase of L, $\langle k_B \rangle$, $\langle k_A \rangle$, and λ, respectively. This is attributed to a larger number of inter-community links (or a larger average node degree within each community, or a larger transmission rate) leading to a larger likelihood of overlapping outbreaks. In particular, for large values of $\langle k_A \rangle$ [see figure 7(c)] and λ [see figure 7(d)] the accuracy of $\langle T_s \rangle$ becomes worse. When $\langle k_A \rangle$ or λ is large enough, the theoretical value of $\langle T_i \rangle$ is much larger than its simulated value, causing $\langle T \rangle - \langle T_i \rangle$ to decline rapidly and become negative. Conversely, the theoretical prediction of $\langle T_s \rangle$ is more accurate for a larger value of γ, as shown in figure 7(e). In addition, we observe an interesting fact: for all the parameters L, $\langle k_B \rangle$, $\langle k_A \rangle$, λ, and γ, the simulation results of $\langle T_s \rangle$ are very close to those of $\langle T \rangle$. As an example, we compare the simulation results of $\langle T_s \rangle$ and $\langle T \rangle$ as L varies in figure 8(a), and the numbers of events for both the inter-community infections and disease extinctions are reported in figure 8(b). For a smaller value of L, the numbers of both events are closer, meaning that the frequency of outbreak overlaps is smaller; thus the difference between $\langle T \rangle$ and $\langle T_s \rangle$ approaches $\langle T_i \rangle$, which is significantly smaller than $\langle T \rangle$ at small L.
On the other hand, for a larger L, the more frequent overlaps of outbreaks offset the difference between T and T s by counting the small T that satisfies T < T i . So, the simulation results of T and T s are close for the whole range of L. Similar results are also found in the simulations for the other parameters.

Distributions of the epidemic timescales
It is clear from figure 9(a) that both the theoretical and numerical results of $P(T)$ follow an exponential distribution. Note that the simulation results have been counted for the frequency distribution of T and plotted in a histogram with equal bins of width 200 time steps. The theoretical results depicted by the red line are obtained after multiplying the integral of (17) over each bin by the total event count (13812 in this example). The theoretical results decay faster, with a steeper slope (−q), than the simulation. As mentioned before, the MF approximation in (1) overestimates the value of $y_B$ and hence, according to (7), the theoretical value of q is larger than the numerical one. As shown in figure 9(b), both the theory and the simulation exhibit an exponential frequency distribution of the time duration $T_i$ of an epidemic outbreak. Analogous to figure 9(a), the simulation results in figure 9(b) are plotted as a histogram, where the theoretical points on the red line are given by integrating (19) over each bin of width 200 time steps and then multiplying the outcomes by the total event count (8439 in this example). For small values of $T_i$ [see the range $0 < T_i < 400$ in figure 9(b)], the simulation results of $P(T_i)$ decrease faster than the analytical prediction by (19), because the approximation used in (12) neglects the second-order term with regard to $y_A$ in (11) and thus underestimates the decaying speed compared with the simulation results. On the other hand, for larger values of $T_i$, the simulation results decay more slowly than the theory, because the frequent overlaps between consecutive epidemic outbreaks extend the outbreak duration recorded in the simulation. Since both $P(T)$ and $P(T_i)$ are exponential, the relationship given by (10) allows us to expect an exponential frequency distribution for $T_s$, which is confirmed by the simulation results shown in the histogram of figure 9(c).
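The conversion from an exponential density to expected histogram counts, as used for the theoretical curves, can be sketched as follows. The sketch assumes a density of the form $r e^{-r t}$, whose integral over a bin $[a, b]$ is $e^{-ra} - e^{-rb}$. The bin width (200) and the event count (13812) are taken from the text, while the decay rate used in the example is a hypothetical value.

```python
import math

def expected_bin_counts(rate, bin_width, num_bins, total_events):
    """Expected counts per bin for an exponential density rate * exp(-rate * t)."""
    counts = []
    for b in range(num_bins):
        lo, hi = b * bin_width, (b + 1) * bin_width
        # integral of the density over [lo, hi], scaled by the number of events
        counts.append(total_events * (math.exp(-rate * lo) - math.exp(-rate * hi)))
    return counts

# rate is hypothetical; bin width and event count follow the example in the text.
rate, bin_width, num_bins, total_events = 0.0089, 200, 10, 13812
counts = expected_bin_counts(rate, bin_width, num_bins, total_events)
for b, c in enumerate(counts[:3]):
    print(f"bin [{b * bin_width}, {(b + 1) * bin_width}): {c:.1f}")
```

On a semi-logarithmic plot these expected counts fall on a straight line whose slope is minus the decay rate, which is how the theoretical and simulated slopes can be compared.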

Distributions of the peak heights
Motivated by the intermittent occurrence of epidemic outbreaks in the sparse community A, we further investigate numerically the probability distribution P (H) of the "height" H of an epidemic outbreak, defined as the maximal number of infected nodes during the outbreak. The simulation results of P (H) for various values of each of the parameters L, k B , k A , λ, and γ are shown in figure 10, and all decay exponentially. This is because the frequency distribution P (T i ) of the outbreak duration decays exponentially and, the longer the outbreak duration T i , the larger the probability that the outbreak reaches a large height H. Figure 10(a) reveals that the more inter-community links there are, the more likely an epidemic outbreak is to climb to a large height. According to the simulation results in figure 5(a), a larger L implies a longer outbreak duration T i , since overlaps between successive outbreaks become more likely; a large H is therefore easier to reach. From figure 10(b) we observe that, for a fixed value of H, the value of P (H) increases very slightly as k B increases from 12 to 18 and further to 24. This is because increasing k B produces only a slight growth of T i in simulations, since the relatively small number (L = 4) of inter-community links restricts the inter-community infections, as explained for figure 5(b). Based on (19), we expect a higher frequency P (T i ) for larger values of k A and λ or for a smaller value of γ. Thus, one can expect P (H) to increase as k A and λ increase or as γ decreases, both of which are supported by the simulation results [see figure 10(c-e)].
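As an illustration of how the outbreak duration T i and the height H might be extracted from simulation output, the sketch below scans a time series of the number of infected nodes in community A. It assumes that the series is sampled once per time step and that an outbreak is a maximal run of consecutive steps with at least one infected node; the function name and the sample data are hypothetical and are not taken from the paper.

```python
def outbreak_stats(infected_counts):
    """Return a (duration, height) pair for each outbreak in the series.

    An outbreak is taken to be a maximal run of consecutive time steps
    with a nonzero number of infected nodes (an assumption of this sketch).
    """
    outbreaks = []
    duration = 0
    height = 0
    for n in infected_counts:
        if n > 0:
            duration += 1
            height = max(height, n)
        elif duration > 0:
            # The run has just ended: record it and reset the counters.
            outbreaks.append((duration, height))
            duration = 0
            height = 0
    if duration > 0:
        # Record an outbreak still in progress when the series ends.
        outbreaks.append((duration, height))
    return outbreaks

# Made-up series of infected-node counts in community A.
series = [0, 1, 3, 2, 0, 0, 2, 5, 4, 1, 0]
print(outbreak_stats(series))  # [(3, 3), (4, 5)]
```

Collecting the heights from many such outbreaks and binning them gives an empirical estimate of P (H). Overlapping outbreaks are merged into a single run by this definition, which matches the way overlaps lengthen T i in the simulations.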

Conclusion
We have studied the SIS model in a random network composed of a dense community B and a sparse community A, where the disease persists throughout the dense community while in the sparse community it alternates between temporary epidemic outbreaks and extinctions. The model is particularly relevant for disease transmission from a reservoir population with intermittent outbreaks in a secondary group, as is the case with many zoonotic or emerging infectious diseases. We have developed a theoretical framework and performed extensive computational simulations to understand the features of the epidemic dynamics in the sparse community. In particular, we have explored the expected values for the time durations of outbreak and extinction, the time interval between two successive inter-community infections, and their frequency distributions. The results demonstrate how these timescales depend on the parameters, including the number L of inter-community links, the average node degree k B within community B and k A within community A, the transmission rate λ, and the recovery rate γ. The theoretical results are in good agreement with simulations except when overlaps between successive epidemic outbreaks are too frequent, since such overlaps extend the outbreak duration and make the simulation results deviate from the theory. We have also found an exponential decay for the frequency distributions of these timescales, as well as for the frequency distribution of the maximal number of infections during an epidemic outbreak. These results may provide helpful insights for understanding the temporal patterns of an epidemic in a local community and help in drawing up effective time-based preventive strategies. For example, the temporal pattern of inter-community infections may suggest implementing a periodic vaccination plan on the inter-community nodes in the sparse community after each time period T.
More effectively, given the exponential distribution of the time interval T between successive inter-community infections, we may also consider an exponentially decaying vaccination force ν ∝ exp(−qT ) on the inter-community nodes in the sparse community, as a function of the waiting time T since the first appearance of infection in the dense community.
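A minimal sketch of such a time-dependent vaccination force is given below. It assumes that the interval distribution has the exponential form P (T ) = q exp(−qT ); the proportionality constant nu0 and both function names are hypothetical and are introduced only for illustration.

```python
import math

def vaccination_force(T, q, nu0=1.0):
    """Vaccination force nu0*exp(-q*T) at waiting time T (nu0 is hypothetical)."""
    return nu0 * math.exp(-q * T)

def fraction_of_infections_by(T, q):
    """Fraction of inter-community infections expected within waiting time T,
    assuming the interval density is q*exp(-q*T)."""
    return 1.0 - math.exp(-q * T)
```

Under these assumptions, half of the inter-community infections are expected within a waiting time of ln 2 / q, by which point the vaccination force has fallen to half of its initial value nu0.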
This paper considers only the simplest network topology, i.e., the ER random graph, which is an important example of a homogeneous network. It is therefore of great interest to extend the present work to heterogeneous networks, which requires a deeper study since the heterogeneity in node degrees makes it more difficult to predict the epidemic timescales accurately. It is worth noting that the MF theory provides a good approximation for the average value of T and its distribution P (T ). This largely benefits from the fact that, in the dense ER community, every node has a high degree and so do the nearest neighbours of any given node, a setting in which the MF theory generally yields good approximations to the underlying dynamical process [29]. However, as shown in the present work, neglecting higher-order correlations with respect to the infectious density in the MF equation can lead to a small deviation from the simulations.