Improving pairwise approximations for network models with susceptible-infected-susceptible dynamics

Network models of disease spread play an important role in elucidating the impact of long-lasting infectious contacts on the dynamics of epidemics. Moment-closure approximation is a common method of generating low-dimensional deterministic models of epidemics on networks, which has found particular success for diseases with susceptible-infected-recovered (SIR) dynamics. However, the effect of network structure is arguably more important for sexually transmitted infections, where epidemiologically relevant contacts are comparatively rare and longstanding, and which are in general modelled via the susceptible-infected-susceptible (SIS)-paradigm. In this paper, we introduce an improvement to the standard pairwise approximation for network models with SIS-dynamics for two different network structures: the isolated open triple (three connected individuals in a line) and the k-regular network. This improvement is achieved by tracking the rate of change of errors between triple values and their standard pairwise approximation. For the isolated open triple, this improved pairwise model is exact, while for k-regular networks a closure is made at the level of triples to obtain a closed set of equations. This improved pairwise approximation provides an insight into the errors introduced by the standard pairwise approximation, and more closely matches both higher-order moment-closure approximations and explicit stochastic simulations with only a modest increase in dimensionality to the standard pairwise approximation.


Introduction
The spread of any epidemic can be conceptualised as a process on a network, where individuals are represented as nodes and epidemiologically relevant contacts as edges between nodes. An abundance of different network-based approaches to disease spread have been developed over the years, varying in scope, application, and sophistication. These range from, at one extreme, Markovian state-based models, where the probability of a system being in a certain state is given exactly by its master equations (see Kiss et al., 2017 for an introduction to such methods), to explicit stochastic simulations of epidemics on networks (see Goodreau et al., 2017 andWhittles et al., 2019 for recent examples) at the other. Both approaches have limitations. The exponentially increasing state-space with network size for state-based models mean these exact descriptions are computationally unfeasible for most networks of real-world interest; and while stochastic simulations can deal with networks of these sizes, such methods offer lit-tle or no analytical tractability, making sensitivity to network structure hard to quantify and the causal determinants of the resulting dynamics hard to identify.
One network approach that aims to bridge this gap is momentclosure approximation. In a population, the rate of change of the number of infected individuals will depend upon how many susceptible-infected pairs there are. The rate of change of these pairs, in turn, depends upon the number of triples, and so on up to the full size of the population. Moment-closure approximation methods obtain a closed set of ordinary differential equations (ODEs) for the disease dynamics by approximating the dynamics of higher-order moments (e.g. triples) in terms of lower-order moments (e.g. singletons and pairs). By doing so, one obtains a relatively simple ODE model that retains much of the tractability of mean-field approximation models (the standard approach to modelling the spread of infectious diseases) but that also explicitly accounts for some aspects of network structure. Hence, there has been much interest and research into such methods, and into the errors such approximations introduce into a model (Keeling et al., 2016;Pellis et al., 2015;Sharkey, 2011;Taylor et al., 2012).
There has been considerable progress in this moment-closure method for diseases that can be modelled via the susceptibleinfected-recovered (SIR) paradigm: the determinants of errors in such methods are detailed by Sharkey (2011); the exactness of a closure at the level of triples for tree-like networks is proven by Sharkey et al. (2015); this framework is extended by Kiss et al. (2015) to more realistic network structures that include loops; (Trapman, 2007) defines a reproduction number for pairwise approximation; (House, 2015) provides an algebraic momentclosure for such diseases based on Lie algebraic methods; while (Pellis et al., 2015) explore the exactness of closures when infective periods are of a constant duration.
By comparison, progress has been modest for diseases with susceptible-infected-susceptible (SIS) dynamics, equivalent to the network-based contact process (Liggett, 2013), where recovery from infection does not lead to immunity. Despite its lower dimensionality than the SIR model, the possibility of reinfection can cause correlations between indirectly connected individuals to accrue over time. Consequently, moment-closure approximations on networks with SIS-dynamics are in general not exact, and their analytical tractability is limited. Of the progress that has been made: important formal results on their derivability from exact state-based models have been achieved by Taylor et al. (2012), Taylor and Kiss (2014) and Keeling et al. (2016) compare three systematic momentclosure approximations against stochastic simulations; (House et al., 2009) develop a motif-based approach that outperforms simpler methods for particular network topologies; while  develop a compact pairwise approximation that agrees well with ODE models of a much higher dimensionality.
Capturing network structure is at its most important when edges between nodes are sparse but relatively long lasting. This, alongside the more well-defined nature of epidemiologically relevant contacts, means that moment-closure methods are potentially most valuable for understanding the spread of sexually transmitted infections (STIs). However, most STIs are modelled using the SIS-paradigm (though notably not HIV). Thus, both understanding the errors introduced by moment-closure approximations for diseases with SIS-dynamics, and improving upon these approximations, is vital for the successful application of such methods to public-health problems.
In this paper, we introduce improvements to the standard pairwise approximation for diseases with SIS-dynamics. In particular, we do this for the isolated open triple and for k-regular networks, by explicitly obtaining equations for the rates of change of the errors between triples and their standard pairwise approximation. By applying a closure to these equations, we obtain a closed set of equations that better approximate the true dynamics of infection, with only a modest increase in dimensionality. In the case of the isolated open triple, such a model is exact, while for k-regular networks, closures at the level of order-four structures have to be applied. Specifically, in Section 2 we discuss the isolated open triple, obtaining exact expressions for the appropriate errors and their rates of change, thus obtaining an exact set of equations describing the disease dynamics on this network topology. In Section 3, we use the results from the isolated open triple to inform our improved approximation on k-regular networks, i.e. networks with no loops and where each individual has k neighbours. We consider both higher-order moment-closure approximations and explicit stochastic simulations for this type of network, to act as benchmarks for our improved pairwise approximation. In Section 4, we compare this improved approximation to the standard pairwise approximation, to higher-order approximation models, and to stochastic simulations. In Section 5, we discuss some of the limitations to such an approach, and highlight some potential areas where we believe further research could be fruitful.

The isolated open triple
In this section, we consider the errors introduced by performing pairwise approximation on isolated open triples for a disease with SIS-dynamics. We define an isolated open triple as a central individual c connected to two neighbouring individuals x and y, where x and y remain unconnected, as illustrated in Fig. 1. By investigating this topology, the errors introduced by a pairwise approximation are not obfuscated by errors introduced from any external source, and exact results using the master equation approach (Kiss et al., 2017) can be generated.
We consider a diseases with SIS-dynamics, that is, upon recovery from infection (I) an individuals returns to the susceptible (S) class. We can described this process on the 3-network in terms of its states, of which there are eight -corresponding to whether each individual belongs to the S or I class -so a particular state A 2 fS; Ig 3 . We denote the probability of being in a certain state Pðx ¼ X; c ¼ C; y ¼ YÞ as ½X x C c Y y , where X; C; Y 2 fS; Ig. If we consider recovery from infection, c, and transmission across partnerships, s, to be Poisson processes, then the above situation is a continuous-time Markov process, and can be fully described by its Master equations (see Kiss et al., 2017, Chapter 2 for an introduction to this approach).
We set initial probabilities of each state by assuming random initial conditions, i.e. by taking I 0 $ Uð0; 1Þ and setting ½I x I c I y 0 ¼ I 0 Â I 0 Â I 0 and so on. Note, under this assumption, we have the symmetries ½S x S c I y ¼ ½I x S c S y and ½S x I c I y ¼ ½I x I c S y . Thus, the dynamics of the isolated open triple are fully and exactly described by the following six ODEs: _ ½I x S c I y ¼ cð½I x I c I y À 2½I x S c I y Þ À 2s½I x S c I y ð 5Þ _ ½I x I c I y ¼ À3c½I x I c I y þ 2sð½S x I c I y þ ½I x S c I y Þ Note that the disease-free state ½S x S c S y is absorbing, and so given long enough this system will always evolve to this state. Hence, without an external source of infection, a disease cannot persist indefinitely with an isolated open triple (or indeed, within any isolated graph of finite topology). If we wish to consider initial conditions that do not assume random mixing, e.g. pure initial conditions, eight equations are required. These are given in full in Appendix A.

The pairwise approximation for the isolated open triple
We now introduce the pairwise approximation for the open triple. It is important to note that we are considering a local momentclosure approximation, i.e. we are tracking the dynamics and errors introduced for a particular subgraph, as opposed to a global moment-closure approximation, where we apply closures at a population level.
We begin by considering equations for the probability of individuals (nodes of the open triple) being in a certain state A 2 fS; Ig, where we denote Pða ¼ AÞ as ½A a . ODEs describing the rate of change of these states can be obtained by summing the rates of change from the appropriate triples, e.g.
We observe that the state of an individual depends on the probability of pairs of individuals being in certain states: we denote Pða ¼ A; b ¼ BÞ as ½A a B b and also obtain these by summing the appropriate triples. We arrive at the following equations: where ½I a ¼ 1 À ½S a and ½I a I b ¼ ½I b À ½S a I b ¼ ½I a À ½I a S b . Thus, we see that the rate of change of the probability of the infection status of individuals depends on the infection status of certain pairs, which themselves depend on the infection status of certain triples. This set of equations is unclosed, as we do not have expressions representing the time evolution of the disease status of these triples. Typically, studies have obtained a closed set of equations by assuming that the infection status of individuals x and y are conditionally independent given the infection status of individual c (Sharkey, 2008;Sharkey, 2011;Pellis et al., 2015). That is, we make the following assumption: Observing that ½S x S c ¼ ½S x À ½S x I c , and that ½S c ¼ ½S x S c À ½I x S c , we obtain a closed set of three equations, which we refer to as the pairwise approximation for the isolated open triple, given in full below: Model 2 -The pairwise approximation for the isolated open triple

Quantifying errors
We can now compare the pairwise approximation model  to the exact model (Eqs. 1-6). The approximate model captures the dynamics of the system at low values of the transmission rate s, but if s is sufficiently high, the approximate model behaves qualitatively different to the exact model -there is no absorbing state, and we have a non-zero stationary probability of individuals being infected (Fig. 2). While in Model 1 ½S x S c S y never decreases, in Model 2 its approximation ½S x S c ½S c S y =½S c can decrease. This decrease occurs because of the rate of change of ½I c to ½S c . In Model 1, this transition only affects ½S x S c S y from the state ½S x I c S y , which only ever increases the probability of ½S x S c S y . However, in Model 2 the decoupling of the two pairs and single means that this transition, with certain within pair correlations, can lead to a decrease in ½S x S c ½S c S y =½S c .
Comparing the exact value for triples with their approximation at any given time, we observe this approximation underestimates the probability of the state ½I x S c I y , and overestimates the probability of the state ½S x S c I y . Indeed, the underestimate of ½I x S c I y is exactly the overestimate of ½S x S c I y (Fig. 3).
To understand why, consider the quantities a ½SxSc Iy :¼ ½S x S c I y ½S c À ½S x S c ½S c I y and a ½IxSc Iy :¼ ½I x S c I y ½S c À ½I x S c 2 , borrowing notation from Sharkey et al. (2015), which quantify the difference between triples and their approximations. By expanding ½S c ¼ ½S x S c S y þ 2½S x S c I y þ ½I x S c I y ; ½S x S c ¼ ½S x S c S y þ ½S x S c I y ; and ½S c I y ¼ ½S x S c I y þ ½I x S c I y and cancelling the appropriate terms, we arrive at the fact that both quantities are equal but opposite in sign, and thus we now define a S as: a S ¼ a ½IxSc Iy ¼ Àa ½SxSc Iy ¼ ½S x S c S y ½I x S c I y À ½S x S c I y 2 Noting further that a ½SxSc Sy ¼ a S , while clearly a ½IxSc Sy ¼ a ½SxSc Iy ¼ Àa S , we observe that the difference between true and approximate triple values for all triples with susceptible central individuals depends upon one quantity a S . Similarly, the difference between true and  approximate triple values of all triples with infected central individuals depends only on one quantity, which we denote a I :

Improving the pairwise approximation
If instead of using the approximations from Eq. (11), we let Eq.
(9), and let ½I x S c I y ¼ ð½I x S c 2 þ a S Þ=S c in Eq. (10), we obtain the rates of change of pairs in terms of singletons, pairs, and a S . To obtain a closed set of equations, we must consider _ a S , where the rates of change of triples can be obtained from the exact model.
where / S ¼ ½S x S c S y ½I x I c I y þ ½S x I c S y ½I x S c I y À 2½S x S c I y ½S x I c I y ð19Þ Thus, the rate of change of a S depends in turn on the rate of change of a I , which is given by: where We insist that / S and / I are 0 if either ½S c ¼ 0 or ½I c ¼ 0. Using the above equations, we arrive at a closed set of equations that describes exactly the disease dynamics of the open triple, without any reference to the particular states of triples themselves, by tracking the error terms a S and a I . Model 3 below describes in full this improved pairwise model, with / S and / I described as above: _ ð 28Þ _ a I ¼ À4ca I þ 2sð/ I À a I Þ; with / I as in Eq:ð24Þ ð 29Þ By including a S and a I and their time-evolution in Model 3, we obtain a system of ODEs that describes exactly the dynamics of the open triple. However, it is worth noting that this new model is of no lower dimensionality than Model 1. Despite this, we believe this is still a valuable model to have obtained explicitly. There are two principal reasons for this: firstly, by creating a system where errors a S and a I are tracked explicitly, we can obtain results and gain an understanding about the ways in which the standard pairwise approximation (which ignores the action of a S and a I ) fails to capture the disease dynamics of the isolated open triple; and secondly, the derivation of this model informs our strategy of how to derive an improved pairwise approximation for k-regular networks, where there is a significant reduction in dimensionality.
Upon numerical evaluation, interesting results about the error terms a S and a I arise. When considering the whole state space, both error terms can be either negative or positive (a S ; a I 2 ½À1=4 1=4). However, this is not the case when starting from either random or pure initial conditions; in both scenarios, a S P 0 and a I 6 0. This is numerically demonstrated in Appendix B. Consequently, assuming random or pure initial conditions, we arrive at the following bounds: Of these, the bound ½I x I c 2 =½I c P ½I x I c I y is of particular interest. In previous moment-closure studies, it has been suggested heuristically that moment-closure models underestimate the probability of ½I x I c I y triples (Taylor et al., 2012). This does hold if the system is closed at the level of individuals, i.e. if we assume that the infection status of neighbours are independent. The above result demonstrates that the opposite is true if the system is closed at the level of pairs: For random initial conditions, a S and a I appear to be uniquely defined by the pairs ½S x S c and ½S c I y , in other words a S and a I appear to be functions of ½S x S c and ½S c I y . In theory, given values of ½S x S c and ½S c I y , one could determine the values of a S and a I exactly, consequently reducing the dimensionality of Model 3, as equations for their time evolution would no longer be necessary. As a S and a I appear to be functions of two variables, they can be represented visually as surfaces, with ½S x S c and ½S c I y as x and y-axes, and with a S or a I as the z-axis. Included in supplementary material are animations of the evolution of the shape of these surfaces as we increase s. These animations confirm the above bounds. overestimates the probability of ½SxIcSy. In both cases, the overestimate of one is equal to the underestimate of the other. In both plots we set s ¼ 1; c ¼ 1.

k-regular networks
In Section 2, we considered the accuracy of the standard pairwise approximation on the isolated open triple, and derived a closed exact set of equations describing the errors such an approximation makes. We could do so because we could compute exactly the probability of the states of the open triple (Model 1), and working backwards we could derive expressions for _ a S and _ a I solely in terms of ½S x S c , ½S c I y , a S , and a I -i.e. solely in terms of pairs and errors terms. Informed by these results, we move on to consider pairwise approximations for k-regular networks. k-regular networks are defined as networks in which each individual has k neighbours. Here, we consider k-regular networks which are infinite and contain no loops 1 . Being infinite, the disease dynamics on such a network cannot be described exactly by a closed set of ODEs, unless a closure at some level is exact, as in Sharkey et al. (2015) for diseases with SIR-dynamics. As stated previously, the possibility of reinfection induces correlations between distantly connected individuals, meaning the method used by Sharkey et al. (2015) is not successful for diseases with SIS-dynamics. However, one can close the system at a higher level than pairs and by doing so, we can obtain expressions for _ a S and _ a I solely in terms of pairs and error terms. While these are still approximations to the true disease dynamics on a k-regular network, doing so makes a considerable improvement on the standard pairwise approximation. This is the strategy we employ in this section. While these k-regular networks are clearly idealisations far removed from any real-world sexual network, we believe that they are a useful example to study for a number of reasons. The impact of a small number of contacts, and the resulting dynamical correlations between non-adjacent individuals, is still relatively poorly understood (Keeling et al., 2016). In these idealised networks, the errors such correlations introduce into moment-closure approximations are at their most pronounced, and are not muddied by errors introduced from other sources, such as clustering or heterogeneity. While heterogeneity in the number of contacts individuals have is apparent in any real-world sexual network, and is important to capture when modelling STIs, the effect of heterogeneity has been studied extensively (Eames et al., 2002;Simon and Kiss, 2015), and can oftentimes be modelled by introducing multiple risk-groups into a mean-field approximation model (e.g. Edwards et al., 2010). Additionally, in the case of an infinite network, each individual has exactly the same properties, allowing us to bridge the gap from local to global moment-closure approximation.
In this section, we define global moment-closures for k-regular networks. That is, we define a closure in terms of populationlevel quantities rather than for the probabilities of particular individuals being in certain states. Accordingly, we use the notation ½S to represent the proportion of individuals who are susceptible, ½SI to represent the proportion of pairs where one individual is susceptible and one individual is infected, and so on. While it is standard within the moment-closure literature to refer to numbers of these quantities, we find that dealing with proportions avoids much of the combinatorial rigmarole involved, and has a more obvious correspondence with the methods described in Section 2. The following results hold true whether referring to proportions or numbersin Appendix C, we provide a conversion table to transform the results from this section to numbers, and provide the model derived in this section in terms of numbers.
While the derivation of this moment-closure is independent to that of the previous section, and can be treated as a separate mod-elling exercise, we will observe that there are clear analogies between the two. This correspondence occurs because k-regular networks are isotropic -number of partnerships, as well as transmission and recovery rates, are homogeneous across the population. An alternative conceptualisation is that if we were to randomly sample one individual (or a higher-order motif) from a k-regular network, the probability of it being in a given state is directly equal to the proportion of the population in that state. Conversely, if we consider a population of infinitely many isolated open triples from Section 2, then the proportion in a given state is equal to the probability of one triple being in that state. Therefore while Section 2 is formulated in terms of probabilities and Section 3 is formulated in terms of proportions, we are effectively modelling interchangeable quantities.

Mean-field and pairwise approximations for k-regular networks
The following equation describes the rate of change of ½S for any network (Simon et al., 2011): In the case of k-regular networks, k ¼ ks. By assuming the disease status of constituent individuals in pairs are uncorrelated, i.e. ½SI % ½S½I, we arrive at the mean-field approximation for the k-regular network, which is equivalent to the standard SIS-model: Model 4 -The mean-field approximation for k-regular networks If instead we want to close the system at a higher-order moment, we must consider the rate of change of ½SI: _ ½SI ¼ cð½II À ½SIÞ À s½SI þ ðk À 1Þs½SSI À ðk À 1Þs½ISI ð 32Þ To close this system of equations, we must approximate the proportion of triples ½SSI and ½ISI. We use the standard pairwise approximation of Rand (1999) and Keeling, 1999, commonly attributed to Kirkwood (1935). Using straight line brackets to denote numbers of individuals, etc. this is expressed as: When terms are expressed in terms of numbers this must be scaled by the factor ðk À 1Þ=k; this scaling factor disappears for k-regular networks when expressed in terms of proportions. This can be shown by converting either formulation of the approximation to the other using the conversion table provided in Appendix C. Using this approximation, we obtain: Model 5 -The pairwise approximation for k-regular networks where ½S ¼ ½SS þ ½SI, ½I ¼ 1 À ½S, ½IS ¼ ½SI, and ½II ¼ 1 À ½SS À 2½SI.

Improving pairwise approximations for k-regular networks
Once again, we can look to improve the pairwise approximation by considering the rate of change of triples. Reintroducing subscripts (the position of individuals is illustrated in Fig. 4), the state of x À c À y triples depend upon topologies consisting of four connected individuals: line graphs of length 4 ½A a X x C c Y y and ½X x C c Y y B b , capturing the external force of infection acting upon 1 diseases with SIS-dynamics on k-regular networks have been studied before are referred to in the theoretical literature as the contact process on the homogeneous tree T kÀ1 (Liggett, 2013).
individuals on the periphery of the triple, and star graphs with three outer individuals, ½X x C c Y y Z z , capturing the external force of infection upon the central individual.
The rates of change for the triples in an infinite k-regular network, derived from House et al. (2009), are given in Appendix D.
As before, we define a values as the difference between triple values and their standard pairwise approximation. Once again, the following relations hold: Thus, as for the isolated open triple, the difference between triple values and pairwise approximations depend only upon two quantities: a S and a I , which are as defined in Eqs. (15) and (16). We can use the triple equations from Appendix D to obtain expressions for _ a S and _ a I for this type of network: À ðk À 2Þð2½S x S c I y I z ½SII À ½I x S c I y I z ½SIS À ½S x S c S y I z ½IIIÞ Despite being calculated for triples within a k-regular network, we find that U S ¼ / S and U I ¼ / I as previously defined for the isolated open triple in Eqs. (20) and (24), and so use the / S and / I terms henceforth. We therefore obtain a closed set of equations by once again setting But now we must also make some approximation for order four terms. We do this by making the following closures: Thus, we can again express _ a S and _ a I as (complicated) functions of ½SS; ½SI; a S and a I . Using this, we arrive at a system of four ODEs, which we call the improved pairwise approximation for k-regular networks: Model 6 -The improved pairwise approximation for kregular networks _ ½SI ¼ cð½II À ½SIÞ À s½SI þ ðk À 1Þs ½SS½SI À a S ½S ð49Þ À ðk À 1Þs where / S and / I are defined as in Section 2.

Higher-order moment-closure approximations
To assess the accuracy gained by modelling the error terms a S and a I , we compare our model to higher-order moment-closures. The first of these we refer to as a neighbourhood closure, previously described by Lindquist et al. (2011) and Keeling et al. (2016), where we model a central individual and their number of infected neighbours explicitly. This system is described by 2 Â ðk þ 1Þ ODEs. The second of these we refer to as an extended triple closure, where we explicitly model a central triple and every neighbour of this triple. This system is described by 2 3kÀ1 equations (though its dimensionality can be reduced by accounting for symmetries). In both cases, we approximate the external force of infection on outer individuals by exploiting the symmetry of the topology of the k-regular network. While each model is still an approximation towards the true dynamics of a k-regular network, in virtue of closing the system at a higher order, these models are expected to have a greater accuracy. From these higher-order models, we can also obtain estimates of the terms a S and a I , with which we can compare the a terms obtained from the improved pairwise model for the k-regular network (Model 6).

The neighbourhood closure
For the neighbourhood closure, we model a central individual and their number of infected neighbours explicitly. Visually then, we are modelling a star topology. The rate of change of state of the 'star' will depend upon both the internal configurations and the immediate neighbours of the star. We show this visually in Fig. 5. To close this system of equations, we make the assumption that the configuration of two overlapping 'stars' are conditionally independent given the infection status of the two shared individuals of the combined configuration. As we only need to consider the effect of an external force of infection if the relevant neighbour is susceptible (S), there are only two quantities relevant to the external force of infection on that individual, depending on the infection status of the original central individual (S or I), which denote k S and k I accordingly. These terms are constructed by summing all configurations of the external neighbours including an infected individuals, multiplied by the number of infected external neighbours in that configuration, divided by the sum of all possible configurations of external neighbours. Denoting a central individual in state A 2 fS; Ig with i 2 f0; 1; . . . :; kg infected neighbours as ½A i , the neighbourhood model can thus be described by the following set of equations: Model 7 -The neighbourhood approximation model for kregular networks where k S and k I are given by: To obtain estimates for a S (and a I ) from this model, we must derive the proportion of triples implied by the assumptions of the neighbourhood model. This can be calculated as follows. For a given triple ½XCY, we let l indicate whether X and Y are infected.
3.3.2. The extended triple closure For the extended triple closure, we model a triple and each of its neighbours explicitly, requiring 2 3kÀ1 equations. The state of this system will depend on the infection status of the neighbours of these neighbours, i.e. the rate of change of states in the extended triple depend upon order 4k À 2 configurations (illustrated in Appendix E). We approximate these external forces on the extended triple by assuming that the state of these higher-order structures amount to overlapping extended triple topologies conditionally independent given the state of their shared individuals, of which there are 2k. A detailed explanation of the extended triple closure model is provided in Appendix E.

Stochastic simulations
We use explicit stochastic simulations as our final benchmark for the accuracy of our approximate models. It is not computationally possible to construct infinite loopless networks for simulations. Instead, large random graphs where each individual has k neighbours can be constructed using the Molloy-Reed algorithm (Molloy et al., 1995), which should behave similarly for very large network sizes. We use the methods outlined by Keeling et al., 2016 to remove short loops and to efficiently calculate the quasiequilibrium prevalence of infection.  Here we illustrate the external force of infection on a neighbourhood in the neighbourhood approximation for k-regular networks, for the example of k ¼ 3. Shaded blue is our triple of interest, shaded in orange are any additional individuals that are modelled explicitly, while shaded in white are individuals not explicitly modelled who exert a force of infection on the explicitly modelled neighbourhood. In this approximation, we model a central individual c, and the number of infected neighbours c as (here shown by x, y, and z). The external force of infection on the explicitly modelled neighbourhood will depend upon order-six structures: ½XxCcYyZzX0 x0 X1 x1 , ½XxCcYyZzY0 y 0 Y1 y 1 , and ½XxCcYyZzZ0 z0 Z1 z1 . To close the system, we make the approximation that, e.g. ½XxCcYyZzX0 x0 X1 x1 % ð½XxCcYyZz Â ½XxCcX0 x0 X1 x1 Þ=½XxCc. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Comparing models
In this section, we compare the previous described k-regular network models; in order of dimensionality, these are: the meanfield approximation model (Model 4), the pairwise approximation model (Model 5), the improved pairwise approximation model (Model 6), the neighbourhood approximation model (Model 7), and the extended triple approximation model. As we are considering a disease with SIS-dynamics, the models evolve to an endemic prevalence of infection (given a sufficiently high transmission rate) -we use this as the primary metric for model comparison. All of these models are approximations of the true system, where there are infinitely many individuals, but we expect as we increase the dimensionality of approximation we also increase the accuracy of the model. We compare all approximate models to explicit stochastic simulations on networks of 10,000 individuals.
In Fig. 6, we compare the endemic prevalence generated by the four models that do not explicitly model a to stochastic simulations -the improved pairwise approximation (which utilises the dynamics of a) is considered in Figs. 7 and 8. While we notice large differences between mean-field and pairwise models, the difference in prevalence between models decreases as we increase the dimensionality of the model. For k ¼ 3, there is little difference between the neighbourhood and extended triple approximation models, and there is excellent agreement between the extended triple model and stochastic simulation. For k ¼ 4 and k ¼ 10, the extended triple model is omitted, as the neighbourhood approximation models match closely to stochastic simulations. This indi-cates that including further complexity into a model may be unnecessary, or may not be worth the increasing complexity or computational expense. For k ¼ 2, there is still a significant difference between simulation and the extended triple model. However, this is unsurprising, as previous research (Keeling et al., 2016) has shown that errors persist when much larger neighbourhoods are modelled explicitly. Fig. 6 also illustrates that as we increase k, models tend towards the mean-field approximation (which can be considered the k ! 1 limit). In Appendix F, we provide a proof of this for the pairwise model, and outline how this would be proved in the general case. We also see that as we increase k, the difference between pairwise and neighbourhood approximation models decreases, although the pairwise model consistently predicts higher endemic prevalences. Now, we turn our attention to the improved pairwise approximation (Model 6), which tracks the errors a S and a I explicitly. Here we focus on the examples k ¼ 2 and k ¼ 3, though comparable results are found for all higher values of k. The error in our pairwise model depends on only one term: a S . This term captures the error between the 'true' value of triples and the standard pairwise approximation of their values. We can obtain estimates for a S from each of our higher-order models, noting that the improved pairwise approximation (Model 6) is based on consideration of four connected nodes. Comparing a S between models allows us to assess the extent to which the improved pairwise approximation is successful in capturing the errors introduced to the pairwise approximation induced by dynamics of higher-order structures.
By plotting a S as a function of the pairs ½SS and ½SI we obtain surfaces; their shape informing our intuition of the behaviour of a S as we move through ð½SS; ½SIÞ-space. Firstly, we observe that the numerical result a S P 0that was true for the isolated open triple also holds true for each of these models (numerically demonstrated in Appendix G). Hence, the bounds obtained for triples ½S x S c S y ; ½I x S c I y ; and ½S x S c I y in Section 2 for the isolated open triple also hold for k-regular networks. Secondly, we observe the similarity between a S surfaces obtained from the improved pairwise and neighbourhood approximation models. We do, however, see these are smaller than a S from the extended triple. In other words, models that include higher-order correlations, such as the extended triple, have higher values of a S than are obtained from the improved pairwise model. Comparing the prevalence of infection obtained from these models, we observe only a minor difference between improved pairwise and neighbourhood approximations. By including just two more equations (for a S and a I ), we arrive at a model with an endemic prevalence much closer to results obtained from stochastic simulation, with only a marginal increase in dimensionality.
Unlike the isolated open triple, a I can be positive when k > 2 in each of the approximate models. However, this only occurs at very high transmission rates -typically when endemic prevalence I Ã > 0:8 (Appendix G).
In an attempt to further improve the accuracy, and to reduce the dimensionality, of the model, we consider the effect of ignoring a I on the shape of a S in the improved approximation; noting that the values of a S from the extended triple approximation are consistently larger than from the other lower-order approximations (Fig. 7). We do this by setting a I ¼ 0, which is equivalent to using the standard pairwise approximation for triples with infected central individuals. This is in part justified by the fact that values of a I are typically much smaller in magnitude than a S (Appendix H). This assumption further reduces the dimensionality of the system, as we have one less variable. Moreover, as a I is typically 6 0, ignoring it will increase _ a S , meaning we will generate higher values of a S . (Positive values of a I can only occur at very high values of s; at such values, the disease dynamics on the k-regular network are already well approximated by the standard pairwise approximation). Indeed, comparing shapes of a S (Fig. 7), we see this assumption provides a closer match to the values from the extended triple. In Fig. 8, we compare the endemic prevalence obtained using this a I ¼ 0 assumption against the extended triple approximation, as well as against the improved pairwise approximation where a I is a dynamic variable. Ignoring a I provides an estimate closer to the extended triple approximation than accounting for a I explicitly, Fig. 7. Exploring the shape of aS for different approximate models. Here we compare the shape of the error term aS as a function of ½SS and ½SI for improved pairwise models ((a) and (d)), and as a function of ½SxSc and ½IxSc for neighbourhood and extended triple approximations ((b) and (c)), for the example k ¼ 3. We observe that aS in the improved pairwise (a) and the neighbourhood (b) approximation models are extremely similar, but that the improved pairwise approximation model underestimates this error compared to the extended triple approximation (c). By assuming aI ¼ 0 and _ aI ¼ 0 (d), the resulting aS surface more closely resembles that of the extended triple model. In all plots, we set s ¼ 1; c ¼ 1.
which in the case of k ¼ 3 matches stochastic simulations closely. In Appendix I we consider the time evolution of these models.

Discussion
Whenever detailed information on underlying network structure is available, detailed stochastic simulation of an epidemic on a network is always the 'gold standard' for any real-world application. In the absence of such information, moment-closure approximation methods for the spread of infections promise relatively simple models that allow us to understand the effect of network structure on the dynamics of an epidemic. The success of such a method, however, depends upon understanding the errors introduced by moment-closure approximations, and upon refinements that minimise such errors. While this approach has been successfully applied to diseases with SIR-dynamics, the dynamic buildup of correlations between distant individuals for diseases with SIS-dynamics means success for infections with this natural history has been more limited. However, as the dynamics of most STIs can be well approximated by the SIS-paradigm, and given the importance of network structure in this case, further research into this area is paramount. Indeed, there is already a considerable body of literature concerning moment-closure approximations for SISdynamics (Taylor et al., 2012;Taylor and Kiss, 2014;Keeling et al., 2016;House et al., 2009;Simon and Kiss, 2015), as well as other network approaches to diseases of this type (Floyd et al., 2012;Lee et al., 2013;Wilkinson and Sharkey, 2013), demonstrating this as an active research area.
This study improves upon the standard pairwise approximation by explicitly tracking the errors between the 'true' value of triples and their estimate from this approximation. We show that these errors are fully described by the quantity a S for triples with susceptible central individuals, and by the quantity a I for triples with infected central individuals. By tracking the time-evolution of these error terms, we improve upon the standard pairwise approximation by incorporating these terms into the modelling framework. For the isolated open triple (just three individuals connected in a line), both _ a S and _ a I are exactly described as functions of ½S x S c ; ½I x S c ; a S and a I ; hence, in this case, the improved pairwise model is itself exact. For k-regular networks, _ a S and _ a I depend upon order-four structures. However, by approximating the prevalence of these structures via higher order moment-closures, we obtain expressions for _ a S and _ a I solely in terms of pairs, a S and a I . While such a model is not exact, explicitly modelling the time-evolution of these errors markedly improves upon the standard pairwise approximation for k-regular networks, obtaining prevalence estimates comparable both to models closed at even higher orders and to explicit stochastic simulations.
The findings of this paper contribute towards understanding the shape and direction of errors introduced by pairwise approximations. We show that the errors between triples and their standard approximation are quantified by just two values: a S and a I . Interestingly, we find numerically that a S P 0 and a I 6 0, which inform us as to whether the standard pairwise approximation underestimates or overestimates the proportion of certain triples. While both bounds hold for the isolated open triple, only a S P 0 holds in general for k-regular networks. This result also appears to hold for the constituent triples of all other investigated topologies (line graphs up to length 10, star graphs with up to 10 neighbouring individuals, the extended triple with no external force of infection), while the result a I 6 0 only appears to apply when central individuals in a triple have no other connections outside of the triple. We hence believe that an analytical exploration of such bounds could be fruitful, and would make an important contribution to this research area if such bounds could be proven generally. A deeper understanding of the shape, direction, and magnitude of such error terms is not only of interest to those concerned with using the improved pairwise approximation model described in this paper, but to any researcher interested in applying the standard pairwise approximation to a network model of a disease where recovery from infection does not lead to immunity.
In this paper, we compare approximations to the dynamics of kregular networks closed at increasingly higher levels of complexity -from individual, to pair, to neighbourhood, to an extended neighbourhood. As we increase the dimensionality of a model, we expect to obtain more accurate results. On the other hand, models of high dimensionality are difficult to understand intuitively and are much more computationally expensive. Whether including such complexity is worthwhile depends on the task at hand. We believe that our improved pairwise approximation provides a reasonable compromise between intuition and complexity -this model is still described by a small number of ODEs, and has dynamics closely resembling those from the model closed at the level of neighbourhoods, more closely matching prevalence estimates obtained from Fig. 8. Comparing improved pairwise approximations against higher-order approximations for k ¼ 2 and k ¼ 3-regular networks. We compare endemic prevalence I Ã obtained from improved pairwise model (full in orange; aI ¼ 0 in purple) against neighbourhood (blue) and extended triple (green) approximations, as well as against explicit stochastic simulations (points), as we vary k ¼ sk, for (a) k ¼ 2 and (b) k ¼ 3-regular networks. In both (a) and b) I Ã obtained from the improved pairwise approximation is very similar to I Ã obtained from the neighbourhood approximation. By assuming aI ¼ 0 and _ aI ¼ 0, the dynamics of the improved pairwise approximation are closer to those of the extended triple approximation, and match I Ã from stochastic simulations well for k ¼ 3. For all models we set c ¼ 1. For stochastic simulations, each I Ã point is calculated as the average of 150 runs, and error bars indicate 95% confidence intervals. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) stochastic simulations. An unexpected result is that by ignoring a I , i.e. using the standard pairwise approximation for triples with infected central individuals, one appears to obtain a better approximation to the true dynamics. It is important to establish if such a result holds generally, and if so why, or whether this result is a spurious convenience for k-regular networks.
The results here consider the two most ideal networks: the isolated open triple is the simplest possible network topology including three individuals, while in k-regular networks each individual has exactly k neighbours and there are no closed loops within the network. We consider these idealisations as it is in these networks that network structure is most dominant and the errors introduced by moment-closure approximations are most pronounced. But this means there is fertile ground for further exploration on both local and global scales. On a local scale, a taxonomy of the errors that occur for a variety of different small topologies, as has been done by Pellis et al. (2015) for diseases with SIR-dynamics, would be useful contribution to understanding the impact of local moment-closures for diseases with SIS-dynamics. On a global scale, understanding whether tracking the dynamics of error terms explicitly would be worthwhile in heterogeneous networks (building upon the work of Simon and Kiss (2015)), and assessing whether the same techniques can be applied in the presence of clustering, are important next steps.
This paper makes three assumptions common to the literature on the mathematics on epidemics on networks: first, that epidemiologically relevant contacts (the edges between nodes) are fixed throughout the epidemic and not dynamic; second, that these contacts are identical in kind, such that probability of infection for an individual from any partner of theirs is equal to any other partner; third, that individuals have exponentially distributed periods of infection (the Markovian assumption). Each of these are in some senses unrealistic: people's sexual partnerships change over time (it is a question of theoretical importance the extent to which the dynamics of epidemics on dynamics networks can by approximated by the dynamics of epidemics on static networks, which has begun to be explored (Volz and Meyers, 2007;Bansal et al., 2010)); for individuals in more than one partnership, the frequency of sexual contact will be different for each partnership, hence the probability of transmission across partnerships will also be different; whilst periods of infection may be better modelled as having a constant duration. For SIR-dynamics, a variety of dynamic network models incorporating moment-closure approximations, or other low-dimensional ODE models have been developed (Ball and Neal, 2008;Volz, 2008). So too are there a variety of dynamic network models for SIS-dynamics (e.g. Bauch and Rand, 2000;Leng and Keeling, 2018). Incorporating improved moment-closure approximations into such models, and exploring how the introduction of partnership formation and dissolution effects the errors introduced, are important next steps. While studies into the contribution of steady and casual partnerships to the spread of STIs has been explored (Xiridou et al., 2003;Hansson et al., 2019), heterogeneity in edge type is an underexplored topic for momentclosure approximations, even for diseases with SIR-dynamics. Assuming constant periods of infection, instead of making a Markovian assumption, can make closures exact for different network topologies in the case of SIR-dynamics (Pellis et al., 2015).
Exploring this alternative assumption and its effect on errors a S and a I may prove interesting avenues of research.
With regards to modelling the spread of STIs, it is clear that research should continue to develop more realistic and more sophisticated stochastic simulations. However, we believe that approximate methods have an important role to play, in both developing an intuitive understanding of the effect of network structure on the fate of the spread of STIs, and as a benchmark to compare such simulations against. It is in this context that improving the accuracy of such approximate methods is paramount, and it is in this context that we believe we make a valuable contribution to the literature.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments
We gratefully acknowledge the Engineering and Physical Sciences Research Council and the Medical Research Council for their funding through the MathSys CDT (grant EP/L015374/1). We also thank the anonymous reviewers of this paper for their careful and considered comments that have led to a much improved manuscript.

A.1. Appendix A -An exact model for the open triple
Not assuming random mixing at initial conditions, eight ODEs are required to describe the dynamics of the open triple exactlythese are given below.
_ ½S x S c I y ¼cð½S x I c I y þ ½I x S c I y À ½S x S c I y Þ À s½S x S c I y ð 59Þ _ ½I x S c S y ¼cð½I x I c S y þ ½I x S c I y À ½I x S c S y À s½I x S c S y ð 60Þ _ ½S x I c S y ¼cð½S x I c I y þ ½I x I c S y À ½S x I c S y Þ À 2s½S x I c S y ð 61Þ _ ½S x I c I y ¼cð½I x I c I y À 2½S x I c I y Þ þ sð½S x S c I y þ ½S x I c S y À ½S x I c I y Þ ð62Þ _ ½I x I c S y ¼cð½I x I c I y À 2½I x I c S y Þ þ sð½I x S c S y þ ½S x I c S y À ½I x I c S y Þ ð63Þ _ ½I x S c I y ¼cð½I x I c I y À 2½I x S c I y Þ À 2s½I x S c I y ð 64Þ _ ½I x I c I y ¼ À 3c½I x I c I y þ sð½S x I c I y þ ½I x I c S y þ 2½I x S c I y Þ

A.3. Appendix C -Converting the improved pairwise model from proportions to numbers
While we find that considering proportions is a more convenient way to express the results from Section 3, we appreciate that others may prefer to use our results under the convention of terms referring to numbers of motifs. In this appendix we provide a conversion table to transform the terms from this section from proportions to numbers, and derive the improved pairwise model in terms of numbers. We stress that the improved pairwise model presented here is equivalent to the model presented in Section 3.
First, to express the quantities in Section 3 in terms of numbers, we must be able to count the number of motifs relative to every individual. In a k-regular network, for every individual there are k pairs, for every pair there are ðk À 1Þ triples, and for every triple there are ðk À 1Þ line graphs of length 4 and ðk À 2Þ 4-stars. Using straight line brackets jXj to denote the number of individuals in state X etc. Table 1 below outlines equivalent terms: Using these conversions for example on Eqs. (30) and (32), we obtain the formally derived equations obtained by Taylor et al. (2012) (Theorem 1). We can also use these to convert our closures from proportions to numbers. Applying these, we obtain the unintuitive result that the closure ½XCY % ½XC½CY=½C in terms of proportions is equivalent to the closure jXYZj % ðk À 1Þ=kÂjXCjjCYj=jCj in terms of numbers. Applying these conversions to this and to Eqns. (46) and (47)  To obtain the improved pairwise approximation in terms of numbers, we again consider the term a between a triple and its approximation. Below we consider a jISIj : As before, we find a jISIj ¼ a jSSSj ¼ Àa jSSIj ¼ a jSj =ðk À 1Þ. Defining a jIj as jIIIjjSISj À jSIIj 2 , we similarly find a jSISj ¼ a jIIIj ¼ Àa jSIIj ¼ a jIj =ðk À 1Þ. By applying the conversions from the table to the equations from Appendix D, we can obtain expressions for the rate of change of a jSj and a jIj : and where b jSj ¼ 2ðk À 1ÞðjI a S x S c I y jjSSj À jI a S x S c S y jjSIjÞ þ 2jS x S c I y I z jjSSIj À jS x S c S y I z jjISIj À jI x S c I y I z jjSSSj ð 74Þ _ a jIj ¼ À4ca jIj þ sðb jIj þ 2/ jIj À 2a jIj Þ ð 75Þ where / jIj ¼ 2jSISjjISIj À 2jSSIjjSIIj ð 76Þ and where b jIj ¼ 2ðk À 1ÞðjI a S x I c I y jjSIj À jI a S x I c S y jjIIjÞ À 2jS x S c I y I z jjSIIj þ jI x S c I y I z jjSISj þ jS x S c S y I z jjIIIj ð 77Þ Finally, rearranging Eq. (68) and its analogues, substituting in a jXj we can obtain the closure for triples in the improved pairwise approximation: Thus we arrive at the improved pairwise approximation for kregular networks, expressed in terms of numbers rather than proportions: Model 6 in terms of numbers -The improved pairwise approximation for k-regular networks _ jSSj ¼2cjSIj À 2s ðk À 1Þ 2 jSSjjSIj À a jSj kðk À 1ÞjSj A.4. Appendix D -The rate of change of triples in a k-regular network The state of triples in a k-regular network depend upon the state of order-four network structures: line-graphs of length four (½A a X x C c Y y ; ½X x C c Y y B b ) and star graphs with three outer individuals (½X x C c Y y Z z ). Assuming random initial conditions, by the symmetry of the system ½X x C c Y y B b ¼ ½B a Y x C c X y , meaning only one length four line-graph term is needed in the equations below. Given that ½SSI ¼ ½ISS and ½SII ¼ ½IIS, the rates of change of these triples are described by six ODEs, which can be derived from the system of Eqs. (12) described by House et al. (2009) by omitting terms that include closed loops and by converting the equations from numbers to proportions via the table in Appendix C.
_ ½SSS ¼ cð2½SSIþ½SISÞ À sðk À 2Þ½S x S c S y I z À 2sðk À 1Þ½I a S x S c S y ð 83Þ _ ½SSI ¼ cð½SIIþ½ISIÀ½SSIÞ À s½SSI À sðk À 2Þ½S x S c I y I z þ sðk À 1Þð½I a S x S c S y À ½I a S x S c I y Þ ð84Þ _ ½ISI ¼ cð½IIIÀ2½ISIÞ À 2s½ISI À sðk À 2Þ½I x S c I y I z þ 2sðk À 1Þ½I a S x S c I y ð 85Þ _ ½SIS ¼ cð2½SIIÀ½SISÞ À 2s½SIS þ sðk À 2Þ½S x S c S y I z À 2sðk À 1Þ½I a S x I c S y ð86Þ _ ½SII ¼ cð½IIIÀ2½SIIÞ þ sð½SISþ½SSIÀ½SIIÞ þ sðk À 2Þ½S x S c I y I z Þ þ sðk À 1Þð½I a S x I c S y À ½I a S x I c I y Þ ð87Þ _ ½III ¼ À3c½III þ sð2½SIIþ2½ISIÞ þ sðk À 2Þ½I x S c I y I z þ 2sðk À 1Þ½I a S x I c I y A.5. Appendix E -The extended triple model For this model, we model a triple and each of its neighbours explicitly. Thus, for a k-regular network, 3k À 1 individuals are modelled explicitly, meaning 2 3kÀ1 equations are required to describe this model. By accounting for symmetries in the extended triple topology, one could reduce the dimensionality of this system. However, the method constructing the set of ODEs algorithmically described below models each state explicitly. Writing an algorithm that accounts for such symmetries, while possible, would be somewhat cumbersome, and as such we did not decide to pursue this. We approximate the external forces on this topology by assuming that the higher-order structures that the rate of change of states depend on can be approximated by conjoined extended triple topologies conditionally independent on the state of shared individuals.
We construct the extended triple model in two steps. Firstly, we construct a model with SIS-dynamics on the finite topology of the extended triple. To construct this model, we provide an algorithm for constructing SIS-models on graphs with any arbitrary finite topology. Secondly, we add on external force of infection to this model, which we achieve via relabelling.

E.1. An algorithm for constructing SIS-models on graphs with arbitrary topology
In this section we outline an algorithm for constructing a model with SIS-dynamics on networks of arbitrary topology. We can rewrite the full equations for the open triple in matrix form as follows: if we let x ¼ f½SSS; ½SSI; ½SIS; ½SII; ½ISS; ½ISI; ½IIS; ½IIIg T . Then dx dt States are ordered in this way so that they are interpreted as a binary string (e.g. ½SSS as 000). For the open triple, these are given by: of ½S x S c S y ; S x 0 I x 1 I c 0 I y 0 I y 1 will depend upon some order 10 terms: ½S x S c S y ; S x 0 I x 1 I c 0 I y 0 I y 1 ; I x 00 S x 01 , ½S x S c S y ; S x 0 I x 1 I c 0 I y 0 I y 1 ; S x 00 I x 01 , and ½S x S c S y ; S x 0 I x 1 I c 0 I y 0 I y 1 ; I x 00 I x 01 . We make a closure at this level by assuming, to take the first of these as an example: ½S x S c S y ; S x 0 I x 1 I c 0 I y 0 I y 1 ;I x 00 S x 01 % ½S x S c S y ;S x 0 I x 1 I c 0 I y 0 I y 1 Â ½S x 0 S x S c ; I x 00 S x 01 I x 1 I c 0 S y ½S x S c S y ; S x 0 I c 0 However, as we have not modelled x 00 and x 01 explicitly, the probability of state ½S x 0 S x S c ; I x 00 S x 01 I x 1 I c 0 S y remains undefined. However, as we start from random initial conditions, and given that a kregular is isotropic, all extended triples within a k-regular network are equivalent. Because of this, we have: Thus, we obtain an expression for this state by taking into account the symmetry of a k-regular network, and by relabelling individuals so that states containing individuals not explicitly modelled are defined in terms of explicitly modelled individuals exclusively. We can now arrive at an expression for the external force of infection acting upon state A (k A ), which is given by: where 1 P¼I and 1 Q ¼I are indicator functions.E.3. Relabelling generally The particular relabelling depends upon the particular state of the external triple, and upon the particular neighbouring individuals whose external force of infection you are considering. The requires labellings for the k ¼ 3 case are given in Table 2, and the required relabellings for a general k is given in Table 3. The header row gives the neighbouring individual whose external force of infection we are considering, while the leftmost column gives the new positions of states in a given column now occupy. External nodes that contribute to the external force of infection always occupy the relabelled x i positions.E.4. Constructing the extended triple model To make the extended triple model, we begin by constructing the model for the relevant finite topology with SIS-dynamics, as outlined previously in this section. To construct a model approximating a k-regular network, we must add an external force of infection to individuals neighbouring the central triple. The procedure is as follows:  Table 3 Relabelling for general k.
Individual 1. Construct ODEs for the SIS-dynamics for a graph of the relevant topology, with the central triple as the first three rows of the adjacency matrix. 2. Express each state N as a binary b (of length l ¼ 3k À 1). 3. For each vector b, loop through entries i 2 f4; . . . ; lg. If bðiÞ ¼ 1 calculate the external force of infection on this node, I ext , by relabelling.
5. Let e be the decimal number obtained by changing bðiÞ from 1 to 0, and let E be the state corresponding to this number. Add on the sNI ext to this ODE (i.e. _ E ¼ _ E þ sNI ext Þ.
A.6. Appendix F -Convergence to the mean-field approximation as k ! 1 We believe that as k ! 1, all models converge to the meanfield approximation. In this section, we show this is true for both the pairwise and improved pairwise approximation models, and outline how this would be approached in the general case.
For all models Eqs. (30) ( _ ½S) and (32) ( _ ½SI) hold exactly -only beginning to differ at the level of triples. Our contention is that as k ! 1, ½SI ! ½S½I. First, we note that because ½SI ¼ ½S À ½SS, ½SI ¼ ½S½I () ½SS ¼ ½S 2 . We consider _ ½SS, Fig. 11. Numerical exploration of aS and aI for different approximate models of the k-regular network. We consider how minðaX Þ and maxðaX Þ, X 2 fS; Ig vary with sk for different approximate models of the k-regular network: Improved pairwise (left column), neighbourhood (centre column), extended triple (right column). These plots demonstrate the bound aS P 0 holds for all approximations of the k-regular network, but that aI 6 0 only holds for the case k ¼ 2. For k > 2, maxðaIÞ > 0 given s is sufficiently high. These transmission rates correspond to high endemic prevalences -in all cases I Ã > 0:8. In all plots we set c ¼ 1. Now, we introduce k ¼ sk, which remains constant as k increases. We make the assumption that ½SS ¼ ½S 2 initially and consider their time evolution: These equations are equal, and therefore the relationship ½SS ¼ ½S 2 continues to hold, conditional on ½SSI ¼ ½S 2 ½I. In general we need to show that the relationship ½SSI ¼ ½S 2 ½I continues, given that it holds initially.F.1 -Convergence for the pairwise approximation model Under the standard pairwise model, ½SSI ¼ ½SS½SI=½S. Assuming that ½SS ¼ ½S 2 , it is clear that ½SSI ¼ ½S 2 ½I. Given that at t ¼ 0; ½SS ¼ ½S 2 , and that ½SS ¼ ½S 2 ) _ ½SS 2 the convergence of the standard pairwise model is proved by induction.F.2 -Convergence for the improved pairwise approximation model Under this model ½SSI ¼ ð½SS½SI À a S Þ=½S, i.e. ½SSI ¼ ½S 2 ½I () ð½SS ¼ ½S 2 ; a S ¼ 0Þ. Let us assume that ½SS ¼ ½S 2 , a S ¼ 0, and a I ¼ 0. Then ½SSI ¼ ½S 2 ½I and by examining Eqns. (40) and (43), we find that _ a S ¼ 0 and _ a I ¼ 0. Given that at t ¼ 0; ½SS ¼ ½S 2 ; a S ¼ 0; a I ¼ 0, and that ½SS ¼ ½S 2 ; a S ¼ 0; a I ¼ 0 ) _ ½SS ¼ _ ½S; _ a S ¼ 0; _ a I ¼ 0, the convergence of the improved pairwise model is proved by induction.F.3 -Convergence in the general case More generally, we believe that as k ! 1, spatial correlation at a particular level is only introduced by spatial correlations at a higher level. For example, correlations only enter the pairwise model if there are correlations at the level of pairs, correlations only enter the improved pairwise model if there are correlations at the level of pairs and triples (a terms), etc. Given that by assumption we start with no spatial correlation at any level, it follows that correlations are never introduced. However, we believe that the proof of this more general claim is beyond the remit of this paper.
A.7. Appendix G -Exploring a S and a I for different approximate models of the k-regular network Fig. 10,11 A.8. Appendix H -Exploring the shape of a I for different approximate models Fig. 12 Fig. 12. Exploring the shape of aI for different approximate models. Here we compare the shape of the error term ÀaI as a function of ½SS and ½SI for the improved pairwise model (a), and as a function of ½SxSc and ½IxSc for neighbourhood and extended triple approximations ((b) and (c)), for the example k ¼ 3. We observe that aI surfaces in all three models are very similar, and that their magnitude is much smaller than their corresponding aS surfaces (Fig. 8). In all plots, we set s ¼ 1; c ¼ 1.