An epidemic model for an evolving pathogen with strain-dependent immunity

Between pandemics, the influenza virus exhibits periods of incremental evolution via a process known as antigenic drift. This process gives rise to a sequence of strains of the pathogen that are continuously replaced by newer strains, preventing a build up of immunity in the host population. In this paper, a parsimonious epidemic model is defined that attempts to capture the dynamics of evolving strains within a host population. The `evolving strains' epidemic model has many properties that lie in-between the Susceptible-Infected-Susceptible and the Susceptible-Infected-Removed epidemic models, due to the fact that individuals can only be infected by each strain once, but remain susceptible to reinfection by newly emerged strains. Coupling results are used to identify key properties, such as the time to extinction. A range of reproduction numbers are explored to characterize the model, including a novel quasi-stationary reproduction number that can be used to describe the re-emergence of the pathogen into a population with `average' levels of strain immunity, analogous to the beginning of the winter peak in influenza. Finally the quasi-stationary distribution of the evolving strains model is explored via simulation.


Introduction
Epidemic models have become important tools for understanding, predicting and developing mitigation strategies for public health planners dealing with infectious diseases. Recent advances in genetic epidemiology have greatly accelerated our understanding of the complex interactions between host immunity and pathogen evolution, and emphasised the important role that pathogen evolution can have on the dynamics of infection. However, it remains extremely challenging to combine together these two interacting processes within the same mathematical framework [1]. In this paper we develop a parsimonious epidemic model that describes the transmission dynamics of a multi-strain pathogen with evolutionary dynamics similar to the influenza A virus evolving via antigenic drift.
Multi-strain models have become increasingly popular due to the rise in availability of pathogen genetic analyses. Many models have been based on ordinary differential equations (ODE), despite the fact that stochastic effects play an important role in mutation [see 1, for a review]. Bichara et al. [2] develop an epidemic model with competition between finitely many pathogen strains, and include vertical transmission and immunity from maternal antibodies in the infection dynamics. Meehan et al. [3] analyse multi-strain epidemic models with mutation between strains within an ODE framework. However since their focus is on drug-resistance, they do not consider the effect of immunity. In the multi-strain models discussed in Gog et al. [4], there is assumed to be a finite number of possible strains, and each individual may be infected with one or more of such strains. Evolution was been modelled by a random jump process on a finite strain space using a nearest neighbour jump process. Models involving a countable number of infectious statuses have been discussed in the past [5], but these typically only use the previously mentioned nearest-neighbour evolution.
In [5] this is expressed as a model for parasitic infections where the "type" of an individual is defined by the quantity of parasites in a host. Despite the many modelling papers on multi-strain epidemics, the methodology required to fit these models to data is only just emerging [6].
Between pandemics, the 4 main sub-types of the influenza virus evolve according to a process called antigenic drift [7]. Antigenic drift arises due to the fact that infection with a particular strain of influenza provides the host with a long-lasting immunity to future infection by the same strain. Once immunity to a particular strain has built up in the population, there is a selection advantage to strains that do not elicit the same immune response. To capture within a mathematical model the complex processes driving the evolution of the influenza virus is extremely challenging due to the interactions between host immunity and viral evolution [8]. Nonetheless, simple models can give rise to surprisingly complex dynamics [9,10]. The H3N2 subtype of influenza A, in particular, exhibits a narrow spread in its evolutionary tree, with all strains a short genetic distance from a single branch [11,12]. Each strain persists for a relatively short amount of time before being replaced.
In this paper we define a novel epidemic model with countably infinite, evolving strains that sits between the traditional susceptible-infected-susceptible (SIS) and susceptible-infected-removed (SIR) epidemic models, in that each individual may be infected many times with the pathogen, but only once by a strain. The model is designed to reflect the linear pattern of evolution observed in pathogens undergoing antigenic drift, such as seasonal influenza. By introducing an equivalence relation on the state space, we are able to describe the equilibrium behaviour of the model prior to elimination of the pathogen. In Section 5, coupling arguments are used to make precise the relationship between our new model and the traditional SIS and SIR models and to explore the large population limit. In Section 6 we discuss three reproduction numbers for the novel model. Finally in Section 7 we explore simulations from the quasi-equilibrium distributions.

SIRS with evolving strains (E-SIRS)
Consider a closed population of N individuals which are classified as susceptible, infective or removed. For each time t, we denote the number of susceptibles by S(t), the number of infectives by I(t), and the number of removed individuals by R(t). An infective remains in this class for a random period of time known as their infectious period, after which they become removed. Similarly, removed individuals become susceptible again after their immune period, during which they cannot be infected by any strain (even a new one). We assume the durations of infectious periods are i.i.d. draws from L I ∼ Exp(γ) and immune periods are i.i.d draws from L R ∼ Exp(δ).
To capture dynamics of competing and evolving strains, every individual has a strain index k ∈ Z which denotes the most recent strain with which an individual was (or is currently) infected. We denote the number of susceptibles, infectives and removed individuals respectively with strain index k by S k (t), I k (t) and R k (t). Finally we denote by K * (t) the largest strain index observed up to time t and use K * := K * (t − ) where the time is clear from context. As in the standard SIRS model [13], we assume homogeneous mixing of individuals, and so each pair of individuals makes contact at the points of a Poisson process with rate β N > 0. New strains are introduced into the population in the following way. Each time an infective makes contact with a susceptible individual, we assume that with some probability θ ∈ [0, 1], there is a successful infection of the susceptible with a previously unseen strain, which is given strain index K * (t − ) + 1. With probability 1 − θ, the original strain in the infective attempts to infect the susceptible; the success of this infection depends on the strain index of the susceptible. For simplicity, we assume that immunity is cumulative: a susceptible with strain index k is immune to all strains with index j ≤ k. Removed and susceptible individuals retain the strain index of the strain they have most recently recovered from. All contact processes, mutation events, infectious periods and immune periods are assumed to be independent from each other.
To summarise, the epidemic proceeds according to the following events.
• Infection without mutation: • Infection with mutation: • Recovery: • Loss of global immunity: The state space of the E-SIRS model is given by Ω = {(s, i, r) : k∈Z (s k + i k + r k ) = N }, with s, i, r being infinite sequences taking values in {0, 1, . . . , N }.
A natural initial condition might be I 1 (0) = 1, I k (0) = 0 for k = 1, S 0 (0) = N − 1, S k (0) = 0 for k = 0, and R k (0) = 0 for all k. This equates to a single currently infective individual infected with a strain to which all other individuals are susceptible, and to which no-one is currently recovering.

SIS with evolving strains (E-SIS)
Consider a second model where following an infectious period, an individual becomes immediately susceptible, corresponding to the E-SIRS model where δ = ∞. In this model there are no periods of immunity and so recovery events generate susceptibles: • Recovery: plus infection transitions as above. The E-SIS model evolves over the subspace {(s, i) : k∈Z (s k + i k ) = N }. We will refer to both spaces by Ω , the meaning will always be clear from context.

Link to single-strain models
Consider the E-SIS model with θ = 1. All contacts are mutation contacts and hence successful, and so (S(t), I(t)), the total numbers of susceptibles and infectives, follow a traditional single-strain SIS model as defined in [14]. We can also perform a similar identification between (S(t), I(t), R(t)), the number of susceptibles, infectives and immune individuals in the E-SIRS model and the single-strain SIRS model as defined in [13].
On the other hand, consider the E-SIS model with θ = 0. Since no contacts are mutations, no individual can be infected more than once. If the population starts with strain index 0 except for the initial infectives with strain 1, (S 0 (t), I 1 (t), S 1 (t)) behaves as a traditional single-strain SIR model as defined in [14].

Equivalence relation
We wish to study the long-term average behaviour of characteristics such as the levels of immunity and pathogen diversity, however the constant emergence and extinction of strains means that the evolving epidemic process has no steady-state. To counter this we introduce an equivalence relation to fix the process against the most recently emerged strain. Definition 1. The active strain set of a state (s, i, r) ∈ Ω is given by K = {k ∈ Z : i k > 0}. Let elements of this set be indexed from 1 to |K| in ascending order, so for k a , k b ∈ K, we have k a < k b whenever a < b.  In order to easily refer to the equivalence classes, we define the following representative of each equivalence class. is given by: If K = ∅ then i * k = 0 for all k ∈ Z and s * 0 = j∈Z s j and s * k = 0 for k = 0, and similarly for r. Let {0} denote the set of all these absorbing states.
In the rest of this paper, the process of representatives on the space of equivalence classes (states described with starred states as in Definition 3) will be referred to as the normalised process, and will be denoted by (S * , I * , R * ) or (S * , I * ) as appropriate. Definitions 2 and 3 remove all strains with no infective individuals, and give index 0 to the most recent strain to have emerged and have infectives. All susceptibles and removed individuals are given the strain index one less than the nearest infective above them in strain order. Any individuals immune to all existing strains are given strain 0, as though they just recovered from the most recently emerged strain. Notation. Recall that without the equivalence relation, the state space of the epidemic process was Ω . We denote the state space of the normalised process over the set of equivalence class representatives by Ω.
We will use x = (s, i, r) ∈ Ω with, for example, s = (s k ) k ∈ {0, . . . , N } Z to denote a typical element of the state space. We will also use, for example, |s| = k∈Z s k to denote the total number of susceptibles. A random variable written in calligraphic type, e.g. R k (t), refers to a process without the equivalence relation. The corresponding variable written in roman type, e.g. R k (t), refers to the normalised process evolving over representatives from the equivalence classes.

Quasi-stationarity and absorbing states
Like many infectious disease models, the E-SIRS and E-SIS models defined in Section 2 have an absorbing class of states that corresponds to the population containing no infected individuals, i = 0. For finite population models, the absorbing state is reached with certainty in finite time, and so the limiting distribution is degenerate with no mass in non-absorbing states. However, like the single strain SIS model, these processes may not go extinct for a long time (individuals can be reinfected) and the transient quasi-stable behaviour is of interest. The quasi-stationary distribution and limiting conditional distribution conditioned on the epidemic not going extinct, represent the long-term average behaviour for an endemic disease.

Properties of quasi-stationary distributions for epidemics
In this section and the rest of this paper, Definition 5. Let X = (X(t)) t≥0 be a Markov process on a countable state space Ω with absorbing state 0 from which it cannot escape. Then a distribution processes where S is a single communicating class, every QSD u is a u-LCD and every LCD is a QSD.
Related to the QSD on irreducible state spaces is the notion of the decay parameter which describes the rate of decay of the transition probabilities. Definition 6. Let X = (X(t)) t≥0 be an irreducible Markov process on a countable state space Ω with absorbing state 0. Let u be a QSD associated to X.
Then the decay parameter α is given by The absorption parameter α 0 is given by where T is the extinction time of X starting from state i. Note that for irreducible processes, α is independent of i, j and α 0 is independent of i.
According to Theorem 6 of [15], a necessary condition for the existence of a QSD is that α > 0.
Theorem 7. Conditional on non-absorption, the following hold.  3. The QSD for the number of susceptibles and infectives in the single-strain SIRS model exists uniquely, and gives weight to all non-absorbing states.
Proof. Theorem 1 of [16] states that QSDs exist and are unique on finite irreducible state spaces, and so there is a unique QSD for the SIS model and for the SIRS model conditional on {I > 0}, and non-zero weight is given to all nonabsorbing states. For reducible processes, Theorem 8 of [16] states that QSDs will give full weight to the communicating class with the longest expect time to leave and any states accessible from this "slowest" communicating class. This characterises the QSD for the SIR model.
Further work on characterising the QSD for the standard SIS model can be seen in [17,18,19] making use of recurrent processes and normal approximations.

Existence and uniqueness
Here we will summarise the existence and uniqueness results for the E-SIS and E-SIRS processes. Proof. For θ ∈ (0, 1], we obtain existence and uniqueness by proving that S = Ω\{0} is a single finite communicating class, which immediately gives existence, uniqueness and equality of the QSD and LCD by Theorem 3 from [15].  communicating class also follows as in Theorem 8. For θ = 0, we see that each state that can be reached is a transient communicating class; there is no way to return to a state once left. As such, we need to consider the decay parameter on leaving each state, which equals the exponential rate of leaving such a state: The decay parameter for the process is therefore the minimal such value across all non-absorbing states. Therefore s −1 = 0, i 0 = 1 and r 0 = 0 minimise the decay parameter. According to Theorem 1 of [16] this forces the QSD to have full mass on this state where i 0 = 1, s 0 = N − 1, since the only state accessible from this state is an absorbing one.

Sampling the quasi-stationary distribution
Samples from quasi-stationary distributions can be produced using the sequential Monte Carlo (SMC) sampler with stopping time resampling methods developed in [20]. In brief, M realisations of the model (referred to as particles) are simulated forward in time. Absorbed particles (with no infected individuals) are given weight zero and non-absorbed particles are given weight 1 initially.
The distribution of weights converges to the limiting conditional distribution.
Once the total weight drops below a proportion λ of the initial weight, the particles are replenished via a resampling step. Combine-split resampling [20] was used, which prevents any occupied states from being lost and has the advantage that the distribution of weights remains unchanged after resampling.
This resampling method combines particles in the same location together, draws new particle locations uniformly from the existing locations and equalizes the weight on particles in the same location. In our implementation, after a burn-in

Limiting behaviour
One aspect of interest is how the E-SIS and E-SIRS processes relate to those without mutation. To this end, we consider the limits of the times to extinction of the processes as θ tends to 0 or 1, and the limit, for fixed θ, of the time to extinction as the population size N tends to infinity. Large population limits can be used to justify the use of infinite population models as approximations for, for example, the decay parameters for the relevant processes which we cannot obtain analytically.  Proof. We make use of a coupling of (T θ : 0 < θ < 1) and T 1 . Firstly, let X θ (t) = (S θ (t), I θ (t)) be the E-SIS model. We assume the process to be defined over a population indexed by n = 1, . . . , N .  > θ. Since we must have a finite number of such events occurring in [0, T 1 ), we can find θ * such that U (n,n ) i ≤ θ * for all such U (n,n ) i corresponding to potentially unsuccessful events. This means that for θ ≥ θ * we must have T θ * (ω) = T 1 (ω). So for every ω ∈ E, there exists θ * ∈ (0, 1) such that T θ (ω) = T 1 (ω) for all θ ≥ θ * . Hence T θ (ω) → T 1 (ω) as θ → 1 for almost every ω ∈ E, and hence T θ → T 1 in distribution by the Skorohod Dudley theorem from, for example, Theorem 3 of [21].

Limits as mutation probability changes
Theorem 11. Let T 0 be the time to extinction of the standard SIR model. Then Intuitively, one can think of identifying the S 1 -class for the E-SIS model and the R-class of the SIR model. As mutation events get rarer, the chance of mutation happening before extinction becomes smaller and smaller, and so the two processes are more likely to coincide under a suitable coupling until extinction.
Proof. This proof follows in a similar fashion to Theorem 10. In this version, the coupling is constructed between the E-SIS and the SIR model. The only differences are that in the SIR model individuals enter the removed category after their infectious period and the Poisson process {A (n,n ) (u)} progresses during any time for which n is infective and n is susceptible in the E-SIS model (as before); but when n is susceptible or removed in the SIR model.
In the E-SIS model infectious contacts between n and n are only successful if the event is a mutation or n is of a strictly lower strain index than n. In the SIR model, only the first infectious contact is successful. This means that the two epidemics must be identical up to the time of the first repeat contact, when one identifies the {S 1 , S 2 , S 3 , . . . } classes in the E-SIS model with the R class of the SIR model. Similar to the proof of Theorem 10, for each ω ∈ E one can find a value of θ * so that T θ * (ω) = T 0 (ω) for all θ ≤ θ * , and so T θ → T 0 almost surely as θ → 0 and hence T θ → T 1 in distribution by the Skorohod Dudley theorem of [21].

Large population limits
In order to obtain some large population limit results, we will consider an "infinite" population model. We will refer to this as an Evolving Birth-Death Process (E-BDP). More precisely we assume that infected individuals are a negligible part of an infinite population of individuals that are not immune to any strains at the start of the epidemic, and so all infections will be successful almost surely. This implies infections from a given strain k and recoveries from that strain behave as a linear birth-death process with birth rate β and death rate γ. Additionally, at the point of each infection, with probability θ ∈ [0, 1], the new infective is infected with a previously unseen strain, and given the next available strain index K * + 1.
After it emerges, each strain behaves according to a linear birth-death process with birth rate β(1 − θ) and death rate γ. The total number of infectives I(t) also behaves according a birth-death process with birth rate β and death rate γ.
The time to extinction of the E-SIS model converges to that of the E-BDP model, noting that under a suitable coupling, the time to extinction of the E-BDP equals the Linear BDP without mutation. This leads us to the following result.
Theorem 12. Let T θ,N be the time to extinction of the E-SIS model, and T the time to extinction of the E-BDP model. Then we have T θ,N → T in distribution as N → ∞ when β < γ. If β ≥ γ, then on the event {T < ∞}, a region of probability 1 − γ/β, we also have T θ,N → T in distribution as N → ∞ Proof. Using Theorems 10 and 11 we can conclude that for any fixed N that T 0,N is the time to extinction for the standard SIR model, and T 1,N is equal to the time to extinction for the standard SIS epidemic model. Furthermore, from these theorems we can construct a coupling of the SIS, E-SIS and SIR models using two sets of Poisson processes and mutation indicator variables such that, for any θ ∈ [0, 1], for almost every ω ∈ E. From [22], we know that if β < γ then T 0,N converges in distribution to T , the time to extinction of a Linear BDP with the same parameters β and γ. From [23] we obtain that the same thing happens for SIS models, i.e. T 1,N → T in distribution as N → ∞. Using the bounds in Equation In the case where β ≥ γ we note that on a set of probability 1−γ/β, the time to extinction of the linear BDP is infinite, as discussed in Chapter 3.2 of [24].
From [23], we know that T 1,N → T almost surely (and hence in distribution) on the event {T < ∞}. From [22] we know that on this event, T 0,N → T in distribution. Therefore we must have that T θ,N → T as N → ∞ for all θ ∈ [0, 1] here too.
It should be noted, that on the event {T = ∞} we don't have T 0,N → ∞. Instead T 0,N converges to an extreme-value distribution as mentioned in Theorem 8.1 [14].
Next we show existence of a QSD for the E-BDP model. Proof. To prove existence, we first show that the state space we are interested in is countable. To do this we use the following construction. Starting with a single infective of strain 0, we can define a method of constructing the state space. By having a birth in strain 0, or a mutation event, one can systematically arrive at any state in the state space. Given these two possible events, one can encode each state according to a finite binary sequence, which corresponds to a unique integer which we can use to enumerate the space. Given that there exists a lower bound l ∈ Z such that I k = 0 for all k ≤ l, we construct the state as follows. Starting with the lowest non-zero strain index l + 1 consider I l+1 strain 0 within-strain-infection events. Then for each higher strain k, we choose a mutation event followed by I k − 1 within-strain infection events. Note that only considering finite sequences gives countability, unlike the uncountability of the infinite paths on this binary tree.
To obtain existence of a QSD, we now introduce a coupling. Let X(t) = (X j (t)) j∈Z be the E-BDP. Let α X be the decay parameter for X(t). Let (Y (t)) t≥0 be the process defined on the same probability space, given by Y (t) = j∈Z X j (t). Since the mutations do not affect whether infections are successful or not, Y (t) can be seen to be a single-strain linear BDP with birth rate β, and death rate γ. As discussed in Example 1 of [25], Y (t) has the decay parameter α Y = γ − β. Let T X be the extinction time of X(t) and T Y for Y (t).
Letting α Y 0 be the absorption parameter for Y (t), and α X 0 for X(t), we also know that α Y 0 = γ − β. Since T X = T Y under the coupling, we use the definition of the decay parameter to deduce that α X 0 = γ − β, and hence α X ≥ α X 0 > 0. Using Theorem 13 of [15] we get existence of a QSD. Moreover, using Theorem 3.3.2 of [26], we must have α X = γ − β since there is only one state from which extinction can occur: one must have 1 infective before extinction, which must be of strain 0 under the equivalence relation. This leads to the uniqueness of the QSD.

Reproduction numbers
To characterise the dynamics of the models, we look to a number of key statistics which are related to the commonly used basic reproduction number, R 0 , that illustrates whether or not an epidemic is likely to infect a large proportion of the population. The basic reproduction number is defined as the number of individuals infected by a single typical infective in a large, otherwise susceptible population [27]. In the E-SIRS model, we still have R 0 = β/γ. One issue with R 0 is that it fails to take into account the likely immunities present in the population, or how much the pathogen evolves during the opening phase of the epidemic.

Modified household reproduction number R *
In [28], an epidemic is considered which spreads through a population grouped into households, such that individuals in the same household make contact at a different rate to individuals in different households. The households reproduction number R * is shown in [28,Section 2.3] to be equal to R * = µR H where µ is the expected number of individuals infected in a single household epidemic (including the initial infective), and R H is the the mean number of contacts an infective individual makes with individuals in other households during a single infectious period.
For the E-SIRS model, we consider each strain as a "household" which has countably many individuals, and mutations are considered contacts between households. In this case µ is the expected total population of a birth death process with birth rate β(1 − θ) and death rate γ, including the initial infective.
One can use the branching property to compute E[Z] = γ/(γ − β(1 − θ)) and note that the between household reproduction rate R H = βθ/γ and so, To recontextualise this in terms of strains and mutations, one can think of R H as the expected number of new strains originating from a single individual during one infectious period, and µ as the expected number of individuals that ever get infected by a specific strain.
One could consider R 0 to be the "intra-strain" reproduction number, and µ to be the "inter-strain" reproduction number. With these we obtain one of three regimes: • If R 0 = β/γ < 1, then the whole population would die out with certainty, and no large epidemic would occur.
• If R 0 ≥ 1 and µ < ∞ then a large epidemic occurs with positive probability, but each individual strain dies out quickly.
• If R 0 ≥ 1 and µ = ∞, then each strain has a positive probability of producing a large outbreak. The trees highlight how only a small number of the strains survive for a long time, particularly in Fig. 2(a). Figure 2 also show similarities to the tree for H3N2 in Extended Data Figure 9(c) of [12], a paper which specifically looks to model influenza.

Quasi-stationary reproduction number R Q
One drawback to the R 0 is that it only usefully describes the initial behaviour of an epidemic in a naive population and doesn't take into account the build up of immunity in the E-SIRS model. One alternative is to consider the effective reproduction number, denoted R t , defined as R t = R 0 N in a population of size N . Much work has been done in trying to evaluate R t for specific infections such as influenza by [29] and Ebola by [30]. However, R t is time-dependent and can therefore be difficult to compute and interpret. At a quasi-stable equilibrium the number of new infections balances the recoveries and so R t ≈ 1, and hence R t is not informative about the disease characteristics. Ideally, we would like a reproduction number that adjusts for the build-up of immunity in the population, but remains informative about the infectivity of a disease.
We offer an alternative reproduction number, based on the QSD, which aims to describe the infectiousness of strains of an endemic disease in a population with 'average' levels of historical immunity. The quasi-stationary reproduction number (R Q ) is the average number of secondary infections caused by a single typical infective introduced into an otherwise uninfected (S status) population with levels of immunity (strain indexes) drawn from the quasi-stationary distribution, so each other individual may or may not be immune to the current strain of the infective. By typical infective, we mean an individual with strain index sampled from the distribution of strain indexes of infectives in the QSD.
Under the E-SIRS model, the total number of infectives is always less than the SIS model without evolving strains, and so R Q ≤ R 0 .
The quasi-stationary reproduction number provides a measure of the ability of a pathogen to re-invade a population from which it has been eradicated. For diseases like seasonal influenza which have greatly reduced incidence during the summer months, R Q measures the reproduction number at the beginning of the next influenza season after accounting for the residual immunity left over from last year.
More precisely, we draw the single infective from the marginal number of infectives in the QSD u I (k): the probability that given an individual is infective, it is of strain index k. For QSD u this is given by Under the equivalence relation described in Section 3, we can have a maximum of N strains in a population of size N , and so the strain index ranges over The susceptible population is drawn from the total strain marginals of the QSD u K (k): the probability that under the QSD that a given individual is of strain index k.
Finally, we require the probability that a randomly chosen individual drawn from the strain marginal will be susceptible to strain k (i.e. will have a strain index lower than k): Since u I and u L are both probability mass functions, 0 ≤ u T L u I ≤ 1 and so we have that R H ≤ R Q ≤ R 0 . As θ → 1 then R Q → β/γ = R 0 , as does R H .

Simulation Study
To further explore the E-SIRS model we use the SMC sampler described • The expected total number of active strains E Q [K] in the QSD where K = |{k : I k > 0}|.
• We also look at how varying the model parameters affects strain diversity in infectives and the whole population.
We will focus on the E-SIS model, but also discuss for each statistic how the addition of an immune period, as in the E-SIRS models, changes the number of infectives and strain diversity. Unless otherwise stated, all expectations over the QSD were produced with the SMC sampler described in 4.3 with M = 100 particles and resampling threshold λ = 0.4.

Expected number of strains
We investigated what happens to the expected number of active strains,  Figure 6 shows the expected number of strains for fixed βθ (mutation contact rate) and β(1 − θ) (non-mutation contact rate), as β and θ vary. Note that for Figure 6(b), both θ and β increase from left to right, whereas, to maintain fixed βθ, β decreases as θ increases. In the case when βθ is high, one might expect E Q [I] and E Q [K] to be closer in value since there is a high probability of mutation leading to a high number of co-circulating strains. This is demonstrated in Figure 6(a), where we see that for fixed βθ, the number of strains is larger (and therefore closer to the number of infectives) for the βθ = 2 line than for the βθ = 0.05 line. In Figure 6(a) there is a maximum point for the number of strains as β increases, after which the number of strains decreases.
As β(1 − θ) increases in Figure 6(b), the number of strains becomes more linear in θ.

Strain diversity
In Figure 7 Figure 7(b) shows the change in strain diversity in θ. As θ increases, the number of strains present increases, so the strain diversity curve flattens out. For high values of θ, a larger lag is observed between the infectives and the whole population, due to the higher diversity. In Figure 7(c), the effect of population size is explored. As N increases, we observe a wider number of strains, as one would expect given E Q [K]'s behaviour. However, unlike the behaviour as β changes, the peak moves away from 0 but the lag between the infectives and the rest of the population appears more consistent. For the E-SIRS model explored in Figure 7(d), the immune period reduces strain diversity by reducing the expected number of infectives.
In applications, one might wish to look further into the lag between the strain distribution of the infectives and the immunity in the population. If  a pathogen has a long lag, then vaccination can be effective in updating the immunity present in the population. However, if the lag is short, then a vaccine based on a recent strain will have little effect in increasing the levels of immunity in the population, as the most immunity profiles in the population will already represent the currently circulating pathogen.

Conclusions
In this paper we defined an epidemic model for a pathogen undergoing genetic drift, that lies between the well-studied SIR and SIS epidemic models. The model appears to capture some qualitative aspects of the evolution of strains of influenza A, despite depending on just 4 parameters. Compared to models used by [12] and [31], which require the storage of a whole antigenic history, our model is much simpler, which makes simulation, computation and inference much easier. Despite these simplifications, the simulated genetic trees in Figure 2 show similarities to the tree for H3N2 in Extended Data Figure 9(c) of [12]. The relative simplicity of our model enables analytical insights into model behaviour, such as the relationship between our models and the SIS and SIR models discussed in Theorems 11 and 10. A simulation study showed that there is a nonlinear tradeoff between mutation and infectivity when trying to estimate the number of co-circulating strains under quasistationarity. The development of a quasistationary reproduction number R Q also allows summary the expected behaviour of an epidemic under quasistationarity, by comparing it to a household epidemic model. Clearly this work could be reduced to a finite state space of strains, but also include more complex strain evolution models, accounting for similarity of strains conferring some amount of partial immunity.