Fundamental bound on epidemic overshoot in the SIR model

We derive an exact upper bound on the epidemic overshoot for the Kermack–McKendrick SIR model. This maximal overshoot value of $0.2984\ldots$ occurs at $R_0^* = 2.151\ldots$. In considering the utility of the notion of overshoot, a rudimentary analysis of data from the first wave of the COVID-19 pandemic in Manaus, Brazil highlights the public health hazard posed by overshoot for epidemics with $R_0$ near 2. Using the general analysis framework presented within, we then consider more complex SIR models that incorporate vaccination.


Introduction
The overshoot of an epidemic is the proportion of the population that becomes infected after the peak of the epidemic has already passed. Formally, it is given as the difference between the fraction of the population that is susceptible at the peak of infection prevalence and at the end of the epidemic. Intuitively, it is the difference between the herd immunity threshold and the total fraction of the population that gets infected [7,4]. As it describes the damage to the population in the declining phase of the epidemic (i.e. when the effective reproduction number is less than 1), one might be tempted to dismiss its relative importance. However, a substantial proportion of the epidemic, and thus a large number of people, may be impacted during this phase of the epidemic dynamics.
A natural question to ask then is how large can the overshoot be and how does the overshoot depend on epidemic parameters, such as transmissibility and recovery rate? Surprisingly, within the framework of the SIR ODE model, this question can be answered exactly. In this paper, we first derive the bound on the overshoot in the Kermack-McKendrick limit of the SIR model [8]. Beyond the basic SIR model, we then see if the bound on overshoot holds if we add additional complexity, such as vaccinations, into the model. We then numerically study what happens when we move to a more realistic structural model of epidemics that incorporates contact heterogeneity.
Figure 1: The overshoot can be calculated in two ways: a) Overshoot is calculated as the difference between the fraction of the population that is susceptible at $t^*$ and at infinite time. b) Overshoot is calculated as the integral of the infection incidence curve from $t^*$ until infinite time. Therefore overshoot corresponds to the area of the region shaded in yellow.

Results
The classical Kermack-McKendrick SIR model is given by the following set of ODEs:

$$\frac{dS}{dt} = -\beta S I \tag{1}$$

$$\frac{dI}{dt} = \beta S I - \gamma I \tag{2}$$

$$\frac{dR}{dt} = \gamma I \tag{3}$$

where $S$, $I$, and $R$ are the fractions of the population in the susceptible, infected, or recovered state respectively, $\beta$ is the transmission rate, and $\gamma$ is the recovery rate. As these are the only possible states within this model, the conservation equation for the whole population is given as

$$S + I + R = 1 \tag{4}$$

Conceptually, the overshoot can be equivalently calculated in two ways. In the first, it is given by the difference in the fraction of susceptible individuals at the peak of infection prevalence ($S_{t^*}$) and at the end of the epidemic ($S_\infty$) (Figure 1a). Alternatively, it can be viewed as the integral of the number of newly infected individuals, which is given by the infection incidence curve ($\frac{d\,\mathrm{Incidence}}{dt} = \beta S I$), from the peak of infection prevalence to the end of the epidemic (Figure 1b). We will make use of the former relationship in the results that follow.
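As a rough numerical illustration of the two equivalent definitions, the following sketch integrates the SIR equations with a hand-written fourth-order Runge-Kutta scheme and reports the overshoot both as a difference in susceptible fractions and as incidence accumulated after the prevalence peak. The function names, the initial infected fraction `eps`, the step size `dt`, and the horizon `t_max` are illustrative assumptions, not values taken from the paper.

```python
def sir_rhs(state, beta, gamma):
    """Right-hand side for (S, I, cumulative incidence)."""
    s, i, _ = state
    new_infections = beta * s * i          # infection incidence, beta*S*I
    return (-new_infections, new_infections - gamma * i, new_infections)


def rk4_step(state, dt, beta, gamma):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(x + h * dx for x, dx in zip(state, k))

    k1 = sir_rhs(state, beta, gamma)
    k2 = sir_rhs(shifted(k1, dt / 2), beta, gamma)
    k3 = sir_rhs(shifted(k2, dt / 2), beta, gamma)
    k4 = sir_rhs(shifted(k3, dt), beta, gamma)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def overshoot_two_ways(beta, gamma, eps=1e-4, dt=0.01, t_max=100.0):
    """Return (S at peak minus S at end, incidence accumulated after the peak)."""
    state = (1.0 - eps, eps, 0.0)          # assumed initial condition
    at_peak = state
    for _ in range(int(t_max / dt)):
        state = rk4_step(state, dt, beta, gamma)
        if state[1] > at_peak[1]:          # track the prevalence peak on the grid
            at_peak = state
    by_susceptibles = at_peak[0] - state[0]
    by_incidence = state[2] - at_peak[2]
    return by_susceptibles, by_incidence


if __name__ == "__main__":
    a, b = overshoot_two_ways(beta=2.0, gamma=1.0)
    print(f"S difference: {a:.4f}, integrated incidence: {b:.4f}")
```

Because the peak is located on a discrete time grid, both numbers carry an error of the order of the step size; they should nevertheless agree with each other to rounding error, since the accumulated incidence is exactly the loss of susceptibles.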
The only two parameters of this model are $\beta$ and $\gamma$. A key parameter in epidemic modeling combines these two into a single parameter by taking their ratio, $R_0 = \beta/\gamma$, which is known as the basic reproduction number. Although $\beta$ and $\gamma$ enter the model as separate parameters, the behavior of the overshoot can be parameterized in terms of the single parameter $R_0$ (Figure S1). Plotting the dependency of overshoot on $R_0$ (Figure 2), we observe a peak in the curve at $(R_0^*, \text{Overshoot}^*)$ that sets an upper bound on the overshoot. From a public health perspective, diseases with estimated values of $R_0$ near this peak region in Figure 2 include COVID-19 (ancestral strain) [2], SARS [16], diphtheria [14], monkeypox [6], and Ebola [15]. This peak phenomenon in the overshoot was first numerically observed by [17], though not explained. We will now derive the solution for this maximum point analytically.
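One way to see why only the ratio of the two parameters matters is to rescale time by the recovery rate. The rescaled time $\tau$ below is a symbol introduced here purely for illustration.

```latex
\begin{aligned}
\tau &= \gamma t, \qquad R_0 = \frac{\beta}{\gamma}, \\
\frac{dS}{d\tau} &= -R_0\, S I, \\
\frac{dI}{d\tau} &= R_0\, S I - I, \\
\frac{dR}{d\tau} &= I .
\end{aligned}
```

In these variables $\gamma$ only sets the time scale, so for fixed initial conditions the trajectory in the $(S, I)$ plane, and hence any difference of susceptible fractions such as the overshoot, depends on $\beta$ and $\gamma$ only through $R_0$.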

Deriving the Exact Bound on Overshoot in the Kermack-McKendrick SIR Model
Theorem: The maximum possible overshoot in the Kermack-McKendrick SIR model is a fraction $0.2984\ldots$ of the entire population, with a corresponding $R_0^* = 2.151\ldots$.

Proof.
Let $t^*$ be the time at the peak of the infection prevalence curve. The herd immunity threshold is the difference in the fractions of the population that are susceptible at zero time and at $t^*$. Define overshoot as the difference in the fractions of the population that are susceptible at $t^*$ and at infinite time. This is equivalent to defining overshoot as the cumulative fraction of the population that gets infected after $t^*$.
$$\text{Overshoot} = S_{t^*} - S_\infty \tag{5}$$

where $S_{t^*}$ and $S_\infty$ are the susceptible fractions at $t^*$ and infinite time respectively. We will use $S_{t^*} = \frac{1}{R_0}$ [10], which can be obtained by setting (2) to zero and solving for that critical $S$. We will use the notation $X_t$ to indicate the value of compartment $X$ at time $t$.
Since we would like to compute the maximal overshoot, we can differentiate the overshoot equation (5) with respect to $S_\infty$ to find the extremum. We will eliminate $R_0$ from the overshoot equation so that we have an equation only in terms of $S_\infty$.
To find an expression for $R_0$, we start by deriving the standard final size relation for the SIR model [1,3]. We solve for the rate of change of $I$ as a function of $S$ using (1)-(2) to obtain

$$\frac{dI}{dS} = -1 + \frac{\gamma}{\beta S},$$

from which it follows on integration that $S + I - \frac{\gamma}{\beta}\ln S$ is constant along any trajectory. Considering the beginning of the epidemic and the end of the epidemic yields:

$$S_0 + I_0 - \frac{\gamma}{\beta}\ln S_0 = S_\infty + I_\infty - \frac{\gamma}{\beta}\ln S_\infty \tag{6}$$

We now define the initial conditions: $S_0 = 1 - \epsilon$ and $I_0 = \epsilon$, where $\epsilon$ is the (infinitesimally small) fraction of initially infected individuals. We use the standard asymptotic of the SIR model that there are no infected individuals at the end of an SIR epidemic: $I_\infty = 0$. Hence, recalling that $R_0 = \frac{\beta}{\gamma}$, we obtain in the limit $\epsilon \to 0$ that

$$\ln S_\infty = R_0 \left(S_\infty - 1\right) \tag{7}$$

The resulting equation (7) is the final size relation for the Kermack-McKendrick SIR model.
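The final size relation has no closed-form solution for the final susceptible fraction, but it is easy to solve numerically. The sketch below uses plain bisection on $\ln S = R_0 (S - 1)$ over the interval $(0, 1/R_0)$, which contains the nontrivial root when $R_0 > 1$, and then evaluates the overshoot as $1/R_0 - S_\infty$. The function names and the iteration count are illustrative choices.

```python
import math


def final_susceptible(r0, iters=200):
    """Solve ln(S) = R0 * (S - 1) for the root S in (0, 1/R0).

    Assumes R0 > 1 and a vanishing initial infected fraction.
    """
    if r0 <= 1.0:
        raise ValueError("a nontrivial root requires R0 > 1")

    def f(s):
        return math.log(s) - r0 * (s - 1.0)

    # f is increasing on (0, 1/R0), negative near 0 and positive at 1/R0
    # (the lower bracket is adequate for moderate R0, well below ~600).
    lo, hi = 1e-300, 1.0 / r0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def overshoot(r0):
    """Overshoot = S at the prevalence peak (1/R0) minus final S."""
    return 1.0 / r0 - final_susceptible(r0)


if __name__ == "__main__":
    for r0 in (1.5, 2.0, 2.151, 3.0, 4.0):
        print(f"R0 = {r0:5.3f}  S_inf = {final_susceptible(r0):.4f}  "
              f"overshoot = {overshoot(r0):.4f}")
```

Scanning a range of $R_0$ values this way gives a quick check that the overshoot rises and then falls, with the largest values near $R_0 \approx 2.15$.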
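Rearranging for $R_0$ yields the following expression:

$$R_0 = \frac{\ln S_\infty}{S_\infty - 1} \tag{8}$$

We then substitute this $R_0$ expression (8) into the overshoot equation (5):

$$\text{Overshoot} = \frac{S_\infty - 1}{\ln S_\infty} - S_\infty \tag{9}$$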
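Differentiating with respect to $S_\infty$ and setting the derivative to zero to find the maximum overshoot yields:

$$\frac{\ln S_\infty - 1 + \frac{1}{S_\infty}}{\left(\ln S_\infty\right)^2} - 1 = 0,$$

or equivalently $\left(\frac{1}{S_\infty} - 1\right) + \ln S_\infty - \left(\ln S_\infty\right)^2 = 0$, whose solution is

$$S_\infty^* = 0.1664\ldots$$

so that

$$\text{Overshoot}^* = 0.2984\ldots$$

using (9). The corresponding $R_0$ calculated using (8) is

$$R_0^* = 2.151\ldots$$

This concludes the proof. ■

Additionally, finding the total recovered fraction is straightforward. In the asymptotic limit of the SIR model, there are no remaining infected individuals, so

$$R_\infty = 1 - S_\infty^* = 0.8336\ldots$$

In other words, approximately 5 out of every 6 individuals in the population will have experienced infection when overshoot is maximized.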
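The numerical values in the theorem can be checked with a few lines of code. The sketch below finds the root of the stationarity condition $\left(\frac{1}{S_\infty} - 1\right) + \ln S_\infty - (\ln S_\infty)^2 = 0$ by bisection and then evaluates $R_0$ and the overshoot from the expressions derived in the proof. The bracket $[0.01, 0.5]$ is an assumption chosen to exclude the degenerate solution at $S_\infty = 1$; the function names are illustrative.

```python
import math


def stationarity(s):
    """Derivative condition for the overshoot, written as a function of S_inf."""
    log_s = math.log(s)
    return (1.0 / s - 1.0) + log_s - log_s ** 2


def bisect(f, lo, hi, iters=200):
    """Plain bisection; requires a sign change of f on [lo, hi]."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0.0:
        raise ValueError("root is not bracketed")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f_lo * f(mid) <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f(mid)
    return 0.5 * (lo + hi)


def maximal_overshoot():
    """Return (S_inf*, R0*, Overshoot*, R_inf*) for the SIR model."""
    s_star = bisect(stationarity, 0.01, 0.5)
    r0_star = math.log(s_star) / (s_star - 1.0)     # R0 from the final size relation
    overshoot_star = 1.0 / r0_star - s_star         # S at peak minus final S
    return s_star, r0_star, overshoot_star, 1.0 - s_star


if __name__ == "__main__":
    s_star, r0_star, ov_star, r_inf = maximal_overshoot()
    print(f"S_inf* = {s_star:.4f}, R0* = {r0_star:.4f}, "
          f"Overshoot* = {ov_star:.4f}, R_inf* = {r_inf:.4f}")
```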

Upper Bounds on Overshoot in Models that Include Vaccinations
Beyond the Kermack-McKendrick SIR model, one can ask if the bound on overshoot still holds if other complexities are added to the model. First, we will consider the addition of vaccinations.
We will consider three qualitatively different types of curves for the vaccination rate (Figure 3). These correspond to different scenarios that might be modeled. The first model assumes a vaccination rate of zero after the outbreak begins, which implies all vaccinations occur before the outbreak. The second model of vaccination assumes a constant per-capita vaccination rate. This is a situation where all susceptible individuals get vaccinated at the same rate. This assumption yields a vaccination curve for the population that is concave down. The third type of model assumes a risk-driven vaccination rate that depends on the number of infected individuals. This yields a non-monotonic vaccination curve for the population that switches from being initially concave up to being concave down. Depending on the scenario being analyzed, one model might be more appropriate to use than others. Below we discuss each model in further detail by providing the corresponding system of equations, relevant scenarios the model might correspond to in reality, and the corresponding maximal overshoot for each model.

Maximal Overshoot when the Number of Vaccinated Individuals is Constant
The first model of vaccination assumes there are no vaccinations during the outbreak, which implies a fixed number of vaccinated individuals over the course of the epidemic. Such a scenario might be the reintroduction of an infectious disease into a population that has a pre-existing level of immunity.
Since the number of vaccinated individuals is constant, this implies all vaccinations occurred prior to the initial time step. The calculation is then trivial assuming vaccinations provide complete and permanent immunity. In that case, vaccinated individuals can simply be ignored entirely in the dynamics, resulting in the maximal overshoot simply scaling with the unvaccinated fraction.
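The scaling can be made explicit with a short worked rescaling. The symbol $v$ for the pre-vaccinated fraction, the lower-case rescaled variables, and the subscripted overshoot notation are introduced here for illustration only, and the initial infected fraction is taken to be vanishingly small.

```latex
\begin{aligned}
& S_0 = 1 - v, \qquad s = \frac{S}{1 - v}, \qquad i = \frac{I}{1 - v}, \\
& \frac{ds}{dt} = -\beta (1 - v)\, s\, i, \qquad
  \frac{di}{dt} = \beta (1 - v)\, s\, i - \gamma\, i, \\
& \text{Overshoot}_{v}(R_0) = (1 - v)\,\text{Overshoot}_{\mathrm{SIR}}\!\big(R_0 (1 - v)\big)
  \;\le\; (1 - v) \times 0.2984\ldots
\end{aligned}
```

Under these assumptions the rescaled system is an ordinary SIR model with reproduction number $R_0 (1 - v)$, so the bound is attained when $R_0 (1 - v) = 2.151\ldots$.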

Maximal Overshoot Under Addition of Constant Per-Capita Vaccination
We next consider a more typical scenario where the vaccination rate per unvaccinated individual is constant per unit time. Barring any additional information about the population or the epidemic, it is reasonable to assume that all susceptible individuals are vaccinated at the same rate. Consider the following SIRV model:

$$\frac{dS}{dt} = -\beta S I - \lambda S$$

$$\frac{dI}{dt} = \beta S I - \gamma I$$

$$\frac{dR}{dt} = \gamma I$$

$$\frac{dV}{dt} = \lambda S$$

In this case, it is easily shown that there is a conserved quantity, $S + I - \frac{\gamma}{\beta}\ln S + \frac{\lambda}{\beta}\ln I$, which reduces to (6) when the vaccination rate is zero (i.e. $\lambda = 0$). Unfortunately, having the conserved quantity is not sufficient to compute the overshoot, since there does not appear to be a way to separate infected and vaccinated individuals when trying to extend the previous calculation. Therefore, we turn to numerical computation (Figure 4a). We find that the maximal overshoot is bounded above by the value already obtained in the model without vaccinations. As shown in Figure 4, the overshoot has a complicated dependence on the vaccination parameter $\lambda$ and $R_0$.
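A minimal sketch of such a numerical computation is given below; it is not the authors' code. It integrates the SIRV model with $\frac{dV}{dt} = \lambda S$ using a hand-written Runge-Kutta scheme and measures the overshoot as the incidence accumulated after the prevalence peak, which is the second definition in Figure 1. The initial infected fraction `eps`, the step size, the horizon, and the function name are assumptions made for illustration.

```python
def overshoot_sirv_constant(beta, gamma, lam, eps=1e-4, dt=0.01, t_max=100.0):
    """Overshoot for the SIRV model with constant per-capita vaccination.

    State is (S, I, cumulative incidence); V and R are not needed to
    measure the overshoot and are therefore not tracked.
    """
    def rhs(state):
        s, i, _ = state
        new_infections = beta * s * i
        return (-new_infections - lam * s,        # dS/dt
                new_infections - gamma * i,       # dI/dt
                new_infections)                   # d(incidence)/dt

    def shifted(state, k, h):
        return tuple(x + h * dx for x, dx in zip(state, k))

    state = (1.0 - eps, eps, 0.0)                 # assumed initial condition
    at_peak = state
    for _ in range(int(t_max / dt)):
        k1 = rhs(state)
        k2 = rhs(shifted(state, k1, dt / 2))
        k3 = rhs(shifted(state, k2, dt / 2))
        k4 = rhs(shifted(state, k3, dt))
        state = tuple(
            x + dt / 6 * (a + 2 * b + 2 * c + d)
            for x, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
        if state[1] > at_peak[1]:                 # prevalence peak on the grid
            at_peak = state
    return state[2] - at_peak[2]                  # infections after the peak


if __name__ == "__main__":
    for lam in (0.0, 0.02, 0.1):
        for r0 in (1.5, 2.151, 3.0):
            ov = overshoot_sirv_constant(beta=r0, gamma=1.0, lam=lam)
            print(f"lambda = {lam:4.2f}  R0 = {r0:5.3f}  overshoot = {ov:.4f}")
```

Setting `lam=0` recovers the plain SIR model, which gives a convenient consistency check against the analytical bound. With vaccination switched on, the result also depends on the assumed `eps`, because a smaller seed gives vaccination more time to act before the peak.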

Maximal Overshoot Under Addition of a Risk-Driven Vaccination Rate
Lastly, consider a vaccination rate that is proportional to the number of infected individuals. Such risk-driven behavior may arise for a variety of reasons, including initial vaccine hesitancy, a delay in vaccine availability, or a correlation between willingness to get vaccinated and the number of infected individuals. Consider the following SIRV model:

$$\frac{dS}{dt} = -\beta S I - \lambda S I \tag{19}$$

$$\frac{dI}{dt} = \beta S I - \gamma I \tag{20}$$

$$\frac{dR}{dt} = \gamma I \tag{21}$$

$$\frac{dV}{dt} = \lambda S I \tag{22}$$

Since the model now has an additional compartment, $V$, compared with the original SIR model, we must update our definition for overshoot accordingly. Fundamentally, overshoot compares the fraction of people who have not been infected at the epidemic peak and the people who have not been infected at the end of the epidemic. The fraction of people who have not been infected at any particular time, $t$, is $S_t + V_t$. Thus, overshoot can be redefined as follows.

$$\text{Overshoot} = \left(S_{t^*} + V_{t^*}\right) - \left(S_\infty + V_\infty\right)$$
Since the equation for $\frac{dI}{dt}$ remains unchanged, $S_{t^*} = \frac{1}{R_0}$ still applies. Thus, the overshoot equation for models with vaccinated compartments is given by:

$$\text{Overshoot} = \frac{1}{R_0} + V_{t^*} - S_\infty - V_\infty \tag{23}$$

To maximize overshoot, we thus need to find expressions for $R_0$, $V_{t^*}$, and $V_\infty$ in terms of $S_\infty$.
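To find $R_0$ we start by taking the ratio $\frac{dI}{dS}$ and integrating as before. It follows that $I + \frac{\beta}{\beta+\lambda} S - \frac{\gamma}{\beta+\lambda}\ln S$ is constant along any trajectory. Considering the beginning and the end of the epidemic yields:

$$I_0 + \frac{\beta}{\beta+\lambda} S_0 - \frac{\gamma}{\beta+\lambda}\ln S_0 = I_\infty + \frac{\beta}{\beta+\lambda} S_\infty - \frac{\gamma}{\beta+\lambda}\ln S_\infty$$

Using the same initial conditions, asymptotic behavior, and parameter substitution as before ($S_0 = 1 - \epsilon$, $I_0 = \epsilon$, $I_\infty = 0$, $R_0 = \frac{\beta}{\gamma}$) yields, in the limit $\epsilon \to 0$, the following final size relation.

$$\ln S_\infty = R_0\left(S_\infty - 1\right), \quad \text{i.e.} \quad R_0 = \frac{\ln S_\infty}{S_\infty - 1} \tag{24}$$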
Thus, we see that $R_0$ for this SIRV model takes on the same expression as the SIR model (8).
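To find $V_{t^*}$, let us take the ratio of the time derivatives of the $S$ and $V$ compartments (19), (22),

$$\frac{dS}{dV} = -\frac{\beta+\lambda}{\lambda},$$

from which it follows on integration that $S + \left(\frac{\beta+\lambda}{\lambda}\right)V$ is constant along any trajectory. Considering the beginning and the peak of the epidemic yields:

$$S_0 + \left(\frac{\beta+\lambda}{\lambda}\right)V_0 = S_{t^*} + \left(\frac{\beta+\lambda}{\lambda}\right)V_{t^*}$$

Using the initial conditions ($S_0 = 1 - \epsilon$, $I_0 = \epsilon$, $V_0 = 0$) with $\epsilon \to 0$ and recalling that $S_{t^*} = \frac{1}{R_0}$, we obtain the following formula for $V_{t^*}$.

$$V_{t^*} = \frac{\lambda}{\beta+\lambda}\left(1 - \frac{1}{R_0}\right) \tag{25}$$

To find $V_\infty$, recall that $S + \left(\frac{\beta+\lambda}{\lambda}\right)V$ is constant along any trajectory. Considering the peak of the epidemic and the end of the epidemic yields

$$S_{t^*} + \left(\frac{\beta+\lambda}{\lambda}\right)V_{t^*} = S_\infty + \left(\frac{\beta+\lambda}{\lambda}\right)V_\infty$$

Using the equation for $V_{t^*}$ (25) and recalling that $S_{t^*} = \frac{1}{R_0}$, we obtain the following equation for $V_\infty$.

$$V_\infty = \frac{\lambda}{\beta+\lambda}\left(1 - S_\infty\right) \tag{26}$$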
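Substituting the expressions for $R_0$ (24), $V_{t^*}$ (25), and $V_\infty$ (26) into the overshoot equation (23) yields:

$$\text{Overshoot} = \frac{S_\infty - 1}{\ln S_\infty} - S_\infty - \frac{\lambda}{\beta+\lambda}\left(\frac{S_\infty - 1}{\ln S_\infty} - S_\infty\right)$$

We see that this expression for the overshoot is simply the overshoot expression for the original SIR model (9) scaled by a factor $1 - \frac{\lambda}{\beta+\lambda}$.

$$\text{Overshoot} = \left(1 - \frac{\lambda}{\beta+\lambda}\right)\left(\frac{S_\infty - 1}{\ln S_\infty} - S_\infty\right)$$

Since both $\beta, \lambda \geq 0$, the factor $1 - \frac{\lambda}{\beta+\lambda}$ can never be greater than 1. This implies that the bound on maximal overshoot given by the theorem holds, becoming exact in the limit of no vaccinations (i.e. $\lambda = 0$). For this model, the maximal overshoot decreases as a function of $\lambda$ in a nonlinear way and has a nonlinear dependence on $R_0$ (Figure 5).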
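The scaling result lends itself to a direct numerical check. The sketch below simulates the risk-driven SIRV model, measures the overshoot as the incidence accumulated after the prevalence peak, and compares it with the closed-form value $\frac{\beta}{\beta+\lambda}\left(\frac{1}{R_0} - S_\infty\right)$, where $S_\infty$ solves the final size relation. The integrator, the initial infected fraction `eps`, the step size, and all function names are illustrative assumptions.

```python
import math


def final_susceptible(r0, iters=200):
    """Root of ln(S) = R0 * (S - 1) in (0, 1/R0); assumes R0 > 1."""
    lo, hi = 1e-300, 1.0 / r0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if math.log(mid) - r0 * (mid - 1.0) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def closed_form_overshoot(beta, gamma, lam):
    """SIR overshoot scaled by 1 - lam/(beta + lam) = beta/(beta + lam)."""
    r0 = beta / gamma
    return beta / (beta + lam) * (1.0 / r0 - final_susceptible(r0))


def simulated_overshoot(beta, gamma, lam, eps=1e-4, dt=0.01, t_max=100.0):
    """RK4 integration of (S, I, cumulative incidence) with dV/dt = lam*S*I."""
    def rhs(state):
        s, i, _ = state
        return (-(beta + lam) * s * i,        # dS/dt: infection plus vaccination
                beta * s * i - gamma * i,     # dI/dt
                beta * s * i)                 # d(incidence)/dt

    def shifted(state, k, h):
        return tuple(x + h * dx for x, dx in zip(state, k))

    state = (1.0 - eps, eps, 0.0)             # assumed initial condition
    at_peak = state
    for _ in range(int(t_max / dt)):
        k1 = rhs(state)
        k2 = rhs(shifted(state, k1, dt / 2))
        k3 = rhs(shifted(state, k2, dt / 2))
        k4 = rhs(shifted(state, k3, dt))
        state = tuple(
            x + dt / 6 * (a + 2 * b + 2 * c + d)
            for x, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
        if state[1] > at_peak[1]:             # prevalence peak on the grid
            at_peak = state
    return state[2] - at_peak[2]


if __name__ == "__main__":
    for lam in (0.0, 0.5, 1.0, 2.0):
        sim = simulated_overshoot(2.151, 1.0, lam)
        exact = closed_form_overshoot(2.151, 1.0, lam)
        print(f"lambda = {lam:3.1f}  simulated = {sim:.4f}  closed form = {exact:.4f}")
```

Small discrepancies between the two columns are expected, since the simulation uses a finite initial infected fraction and locates the peak on a discrete time grid, whereas the closed form assumes $\epsilon \to 0$.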

Overshoot Behavior Under Addition of Contact Heterogeneity
One of the key assumptions of the Kermack-McKendrick model is a well-mixed population where every infected individual has the same effect on the population. While a mathematically convenient assumption, real-world epidemics happen in more structured populations [13]. This is evidenced by the phenomenon of superspreaders, where some fraction of infected individuals infect a disproportionately large number of susceptibles. This is modeled in the literature through the introduction of contact heterogeneity [9,5].
Here we explore what happens to the overshoot when the contact structure of the population is given by a network graph that is roughly one giant component. While it is possible to construct pathological graphs that produce very complex dynamics, we consider more classical graphs here. Using a parameterization of heterogeneity given by Ozbay et al. [12] (see Methods for details), we simulated epidemics on networks with structure ranging from the homogeneous limit (well-mixed, complete graph) to a heterogeneous limit (heavy-tailed degree distributions). We observed what happens to the overshoot on these different graphs as we changed the transmission probability.
On a network model, where contact structure is made explicit, the homogeneous graph is a complete graph, which recapitulates the well-mixed assumption of the Kermack-McKendrick model. It is not surprising then that the overshoot in the homogeneous graph ($\sigma = 0$) also peaks around 0.3 (Figure 6, green) and has qualitatively the same shape as in the ODE model (Figure 2). We also see that increasing contact heterogeneity suppresses the overshoot peak, lowering both the peak overshoot value and the transmission probability at which it occurs. Furthermore, increased heterogeneity also flattens out the overshoot curve as a function of transmission probability.

Discussion
We have proved that the maximum fraction of the population that can be infected during the overshoot phase of an epidemic in the Kermack-McKendrick SIR model is about 0.3, with a corresponding $R_0 \approx 2.15$. This upper bound on the overshoot seems to hold in other extensions of SIR models as well. In an SIR model with vaccinations, the upper bound stays the same. It also matches the numerical upper bound seen for SIR dynamics on networks of varying heterogeneity. In the two-strain SIR model with vaccination of Zarnitsyna et al. [17], the overshoot depends on both the level of strain dominance and the vaccination rate, but their numerical results show that any amount of vaccination produces an overshoot lower than the bound found here. Different control measures and strategies may reduce the overshoot as compared to the unmitigated case [7], keeping this upper bound intact. It will be interesting to see how general this bound is for SIR models with other types of complexities or for models beyond the SIR type.
The mathematical intuition for why there is a peak in the overshoot as a function of $R_0$ can be seen by inspecting Equation 5. The first term, $\frac{1}{R_0}$, monotonically decreases with increasing $R_0$. The last term, $-S_\infty$, monotonically increases with $R_0$. Thus a trade-off between the two terms results in an intermediate peak. The epidemiological intuition behind a peak in the overshoot is that the total number of individuals infected during the epidemic grows monotonically with increasing $R_0$. However, too high of an $R_0$ leads to a sharp growth in the number

Figure 2: The overshoot as a function of $R_0$ for the Kermack-McKendrick SIR model.

Figure 3: The fraction of population that is vaccinated (V) based on different vaccination rates: a vaccination rate of zero over the course of the epidemic (blue), a constant per-capita vaccination rate (red), and a risk-driven vaccination rate (yellow).

Figure 4: a) Contour plot for the overshoot for the SIRV model with $\frac{dV}{dt} = \lambda S$ as a function of $\lambda$ and $R_0$. b) Vertical cross-section of the contour plot from (a) for $\lambda = 0.02$.

Figure 5: The overshoot for the SIRV model with $\frac{dV}{dt} = \lambda S I$ as a function of $\lambda$ for different levels of $\beta$ (or equivalently $R_0$).

Figure 6: The overshoot for SIR epidemic simulations on networks with varying levels of heterogeneity ($\sigma$) as a function of transmission probability. $N = 1000$, $\lambda = 5$. Points represent the average of $n = 150$ simulation runs.