Correlations between stochastic endemic infection in multiple interacting subpopulations

Heterogeneity plays an important role in the emergence, persistence and control of infectious diseases. Metapopulation models are often used to describe spatial heterogeneity, and the transition from random-to heterogeneous-mixing is made by incorporating the interaction, or coupling, within and between subpopulations. However, such couplings are difficult to measure explicitly; instead, their action through the correlations between subpopulations is often all that can be observed. We use moment-closure methods to investigate how the coupling and resulting correlation are related, considering systems of multiple identical interacting populations on highly symmetric complex networks: the complete network, the k-regular tree network, and the star network. We show that the correlation between the prevalence of infection takes a relatively simple form and can be written in terms of the coupling, network parameters and epidemiological parameters only. These results provide insight into the effect of metapopulation network structure on endemic disease dynamics, and suggest that detailed case-reporting data alone may be sufficient to infer the strength of between population interaction and hence lead to more accurate mathematical descriptions of infectious disease behaviour.


28
Heterogeneity is an increasingly important feature of epidemiological models, with spatial does not change the basic reproductive ratio, but instead determines the distribution of 129 secondary cases between the P subpopulations. 130 We let S i (t), I i (t), R i (t) ∈ {0, 1, 2, . . .} denote the number of susceptible, infected and 131 recovered individuals, respectively, in population i = 1, 2, . . . , P at time t ≥ 0. As the popu-132 lation size N is constant then S i (t)+I i (t)+R i (t) = N, ∀t ≥ 0, i = 1, 2, . . . , P . The transition 133 rates for the resulting 2P -dimensional Markov chain from state (s 1 , i 1 , s 2 , i 2 , . . . , s P , i P ) at 134 time t are summarised in Table 1. 135 The metapopulation structure can be described by a weighted network G = (V, E) 136 with vertex set V = {1, 2, . . . , P } and edge set E, where edge e = ij has weight σ ij : the 137 coupling matrix Σ therefore represents the weighted adjacency matrix for the graph G. 138 For mathematical tractability we restrict our analysis to networks for which we can derive 139 analytic results, namely graphs that are highly symmetric; a discussion of the effect of 140 relaxing this assumption is provided in the Supplementary Information. In the following 141 analysis we consider the complete network, the k-regular tree network and the star network. 142 In addition, we assume that σ ij = σ, ∀ij ∈ E. We note that for k-regular tree network and 143 the star network, the weighted adjacency matrix Σ is sparse, that is, most of the elements 144 are zero. Transition Rate j = 1, 2, . . . , P Infection s j → s j − 1, i j → i j + 1 βs j l σ jl i l /N + s j Recovery i j → i j − 1, r j → r j + 1 γi j Death of infected s j → s j + 1, i 1 → i j − 1 µi j Death of recovered s j → s j + 1, r j → r j − 1 µ(N − s j − i j ) Table 1. A summary of the transition rates of the 2P -dimensional Markov chain endemic infection model {(S j (t), I j (t)) P j=1 : t ≥ 0} from state (s 1 , i 1 , s 2 , i 2 , . . . , s P , i P ) with birth/death rate µ > 0, contact rate β > 0, external import rate > 0, recovery rate γ > 0 and coupling matrix Σ. infectious disease modelling Lloyd, 2004).

153
Due to the nonlinearity of the infection term in the model, the ODE for an nth-order 154 moment will depend on one or more (n+1)th order moments: to fully describe the stochas-155 tic process would therefore require an infinite set of ODEs. To circumvent this problem, 156 we use a moment closure approximation, which truncates the set of ODEs at some order.

157
Throughout this paper, we use a second-order moment closure approximation, which as-

158
sumes that third-and higher-order cumulants are equal to zero. As a result, third-and 159 higher-order moments can be written in terms of the means, variances and covariances 160 only.

161
Throughout this paper we will use the following notation for the first-and second-order 162 central moments: For a metapopulation network on P populations, the set of ODEs approximating the 164 stochastic process has at most 3P 2 + 2P equations: P for each of the two first order 165 moments and P 2 for each of the three covariances. However, for the networks that we 166 consider in this paper, symmetries in the structure of the network mean that the number 167 of ODEs is considerably fewer. In some cases we will simplify the notation: we outline 168 simplifications to the notation at the start of the results section for each network.
We derive an approximate equation for the correlation ρ ij by considering the ODE 176 for the covarianceĈ I i I j at endemic equilibrium. We then evaluate our approximation 177 numerically, for which we need to define a set of base parameters. We utilise parameters  and between any pair of populations. In our notation, we can therefore drop dependency 198 on the population and simplify it to the following:X = E[X j ], C XY = Cov(X j , Y j ) and Using the second-order moment closure approximation, and with these simplifications, 201 the stochastic process on the complete network can be approximated by a set of eight ODEs: 202 five for the within-population moments, and three for the between-population moments.

203
These can be found in the Supplementary Information. We use these equations in both 204 the analytic and the numerical results. and show that this is equal to and 211 ∆ = we can further simplify the approximation for the correlation to the following expression: We can also use an alternative approximate expression for ξ that is independent of 216S * , which eliminates the need to find the equilibrium of the 8-dimensional ODE model. 217 Meakin and  show that by ignoring the effects of imports and correlations 218 and taking the large population limit, then Given the simpler form of Equation (6) compared to the original expression for ξ given by   Next, we consider infinitely many populations on a k-regular tree network, where each 257 subpopulation has k neighbours: a visualisation of the k-regular tree network for k = 2 258 and k = 4 neighbours is given in Figure 4. The coupling matrix Σ = (σ ij ) is defined as otherwise.  . Comparing analytic and numerical correlation between any pair of populations from P = 3, 5 populations arranged on the complete network. We compare the analytic correlation ρ and our approximation σ/(ξ + σ), ξ = 0.0625, to stochastic simulations for a measles-like endemic disease in the UK (N = 10 5 , µ = 5.5 × 10 −5 , R 0 = 17, γ −1 = 13 and = 5.5 × 10 −5 ). Each population is coupled to the k = P − 1 other populations. The between-population coupling is fixed as σ ∈ [0, 1/k] and within-population coupling is therefore 1 − kσ. We generate 1000 realisations of the process for each value of σ and calculate the correlation as a time-weighted Pearson correlation coefficient for 50 ≤ t ≤ 200; error bars represent ±2 standard deviations. As with the complete network, all subpopulations in the k-regular tree network are epidemiologically and spatially identical, so the expected behaviour is the same within all 262 subpopulations. In addition, in a tree network, there is a unique path between any pair of 263 subpopulations, and so we can define the distance d ij ∈ N between subpopulations i and 264 j to be the length of the path between the subpopulations. For the notation for within-265 population moments we can again drop dependency on the subpopulation: For the between-population moments, we only need to denote the 267 distance d between the subpopulations:Ĉ Finite subgraph approximation of the k-regular tree network We cannot per- ODEs that approximate the stochastic process on the network, but this system comprises 272 infinitely many equations: five equations for the within-population moments, and infinitely 273 many equations for the between-population moments (3 for each d ≥ 1).

274
To overcome these problems, we consider a finite subgraph of the k-regular tree net- We can also write down a finite set of ODEs that approximate the stochastic process on 281 the D-truncated k-regular tree network. If D is sufficiently large, then we can make some 282 further simplifying assumptions. First, we can assume thatĈ and We derive this result from the moment equation forĈ where ρ 0 = 1 and lim d→∞ ρ d = 0. Since |ρ d |≤ 1 then the solution is given by We note two things: firstly, since ρ 1 ≤ 1 then it is trivial that ρ d → 0 as d → ∞.
We note that the MVN correlation and stochastic simulations have to be performed on 307 the D-truncated k-regular tree network, as it is not possible to use the full k-regular tree 308 network. If D is sufficiently large, then these correlations will be approximately the same as 309 in the full k-regular tree network: we show that for D sufficiently large then the correlation 310 converges ( Figure S2, Supplementary Information). 311 We first numerically evaluate the effect of the number of neighbouring subpopulations Information, Figure S2). There is also little difference between the MVN correlation and 325 our approximation (that is, ∆ (1) is small) and so approximating the MVN correlation by 326 Equation (13) is reasonable.
(14) only one neighbour. However, we can still make some simplifications to the notation: the 338 expected behaviour of the infection process is the same within any leaf subpopulation, 339 or between any pair of leaf subpopulations, or between a leaf subpopulation and the hub 340 subpopulation. We can therefore simplify our notation to distinguish between hub and leaf 341 subpopulations. For the within-population moments, the superscript indicates whether the 342 subpopulation is a hub (H) or a leaf (L) subpopulation: For the between-population moments, the superscript indicates whether one of the subpop-344 ulation is a hub (H) or if they are both leaf subpopulations (L); forĈ S i I j we distinguish 345 betweenĈ S 1 I i andĈ S i I 1 : Using the second-order moment closure approximation, the stochastic process on the 347 star network for P subpopulations can be approximated by a set of seventeen ODEs: 348 ten equations for the within-population moments, and seven equations for the between-  We can show that ρ H and ρ L are solution to the following pair of simultaneous equations: where and (20) We derive this result by taking the moment equation forĈ H II andĈ L II at equilibrium; full

365
We first numerically evaluate the effect of the number of leaf subpopulations k on the 366 correlations ρ H and ρ L (Figure 8). Firstly, we note that, as with the complete and tree 367 network, both ρ H and ρ L exhibit a sigmoidal shape, increasing from zero correlation from 368 very low coupling. Secondly, the correlation between two leaf nodes is lower than between 369 the hub and a leaf node; this is to be expected, as the leaf nodes are not directly connected  populations, the k-regular tree network, and the star network with P = k + 1 populations. 383 We observe that the correlation is highest in the complete network and lowest in the tree 384 network. Moreover, the difference between the approximations increases as k increases. 385 We attribute this behaviour to the total number of neighbour subpopulations that the 386 two focal subpopulations have, how many of those neighbours are common neighbours, 387 and whether these common neighbours interact. As the total number of neighbours of 388 each member of the focal pair increases then the correlation decreases; for a given total 389 number of neighbours the correlation is higher when more of these neighbours are common 390 between the two focal subpopulations, and is higher yet when these common neighbours 391 also interact with each other.
392 Figure 9. Comparing the analytic correlation, ρ H and ρ L , and our approximation to stochastic simulations for a measles-like endemic disease in the UK in P +1 populations arranged on the star network (N = 10 5 , µ = 5.5 × 10 −5 , β = 17/13, = 5.5 × 10 −5 , γ = 1/13). The between-population coupling is fixed as σ ∈ [0, 1] and within-population coupling is therefore 1 − σ in the hub population and 1 − σ in any leaf population. The stochastic process is simulated over a 200 year period using the Gillespie algorithm, with a burn-in period of 50 years, and generate 1000 realisations of the process for each value of σ. The correlation is calculated as a time-weighted Pearson correlation coefficient for 50 ≤ t ≤ 200; error bars represent ±2 standard deviations.