Epidemic highs and lows: a stochastic diffusion model for active cases

We derive a stochastic epidemic model for the evolving density of infective individuals in a large population. Data shows main features of a typical epidemic consist of low periods interspersed with outbreaks of various intensities and duration. In our stochastic differential model, a novel reproductive term combines a factor expressing the recent notion of ‘attenuated Allee effect’ and a capacity factor is controlling the size of the process. Simulation of this model produces sample paths of the stochastic density of infectives, which behave much like long-time Covid-19 case data of recent years. Writing the process as a stochastic diffusion allows us to derive its stationary distribution, showing the relative time spent in low levels and in outbursts. Much of the behaviour of the density of infectives can be understood in terms of the interacting drift and diffusion coefficient processes, or, alternatively, in terms of the balance between noise level and the attenuation parameter of the Allee effect. Unexpected results involve the effect of increasing overall noise variance on the density of infectives, in particular on its level-crossing function.


Introduction
Let us start by looking at the top of Figure 1, which shows data for Covid-19 active cases in Utah over time [14].Notice that epidemic outbreaks occurred alternating with relatively long periods of low infective levels.Publicly available data suggests that this trend is repeated elsewhere, see lower panels in Figure 1.The data consistently shows the same general features, while the periods of outbreak are different and appear to depend on details not easily captured in an overall deterministic model.We will argue here that a simple, well-motivated, stochastic model can replicate, qualitatively in terms of distribution, the general behaviour observed in the data in Figure 1.We believe this is the first time these features of an epidemic have been captured by such a model.
Let us consider an airborne pathogen that produces a disease in a large population consisting of many susceptible and fewer infected individuals, and where those who become infected are almost immediately infectious.The risk for a susceptible to inhale pathogens from an infectious individual is higher when they are at close physical range [1,15].We will We observe that the number of active cases switches from low to relatively high, then low for a period of time and high again.The scale obscures daily random fluctuations.Data from the Utah Department of Health [14].Centre and bottom: Estimated active cases of Covid-19 in other locations.We observe a similar phenomenon at different scales (centre).At the same scale (bottom), we see the appearance of outbreaks at different times.Pictures taken from [16].
refer to the effective density of active infected individuals, called infectives, as the density of infectious active cases for which the probability of transmitting the disease is positive, or possibly above some threshold.Because of the scale, the values in Figure 1 can be interpreted as densities.The overall goal of health authorities is to curb the disease incidence by establishing policies to reduce the probability of transmission and hence to keep the density of infectives at a low level.Control policies, which we do not model explicitly but are represented by a capacity factor, may include the enforcement of increased distance between individuals in public places and the cancelling of crowded events.
The spread of pathogens and production of new infectives reminds us of the demographic Allee effect in ecology, in which the rate of population growth per individual increases with population density.In our context, an individual becomes infected when the infectious dose is reached, that is, the minimum amount of pathogen needed within the organism to trigger the disease.For this to happen, exposure to one or more infectious individuals is needed.Hence, the pathogen thrives in environments with close proximity between individuals, where the infectious dose is likely to be exceeded.Moreover, it has been observed that the attenuated Allee effect, [7], together with noise, may produce realizations which remain at low densities for exponentially distributed times.Our present model also includes a recovery rate of infectives and a low immigration rate.
In this paper, we assume that the evolution of the density of infectives in an epidemic can be represented as a real-valued, continuous time process, which evolves as a stochastic diffusion with drift and diffusion coefficients arising from the above modelling considerations, as detailed further in Section 4. The assumption that we have a stochastic diffusion entails that the stochastic aspect may be considered as an amalgamation of several small, relatively independent stochastic effects, of which more later.It turns out that stochastic simulations of the discretized process show, qualitatively, the essential features of long-time Covid-19 data from sources such as the data shown in Figure 1.
From the Kolmogorov (or master) equation of our diffusion model, we find the stationary distribution of our process, which indicates the relative time spent at high and low values.We discuss how the dynamics of the model, which combines the attenuated Allee effect, introduced in [7] (see also [12]), stochasticity, and control, produces the observed variable periods of relative remission alternating with outbursts of disease.Our model is intended to produce, in distribution, qualitative aspects of the transmission of airborne diseases, and not to produce quantitative predictions.
In Section 2, we motivate and derive our stochastic model.In Section 3, we show sample paths from simulations at different noise levels, and find that a sufficiently high noise level is required to mimic the data.Derivation and examination of first passage times for our model re-enforce this rather surprising conclusion.In Section 4, we discuss in more detail how our model is motivated by and consistent with observations about government decisions, public behavioural reactions, and evolution of the virus.The qualitative success of the model yields an understanding of how the main features of the long-term data on epidemic outbreaks, consistent across various states and countries, may be created.

Model description
Let us assume that a pathogen spreads in a relatively large population, predominantly constituted of susceptible individuals, within some defined area.The disease is transmitted to a susceptible by close proximity with an infectious one.Infectious individuals, whom we call infectives, are removed after the normal course of infection.Let us denote by I = I(t) the density of infectives, considered as a smooth function of time.
We assume that the relative rate of new infections, I /I, at low densities, I, is proportional to I q with 0 < q 1, where a fixed parameter q allows the gradual change of this rate, from sharp growth for values of I close to zero to moderate growth.This means that the relative rate of new infected individuals is enhanced by the presence of other active infectives that, in our application, spread small respiratory aerosols and droplets carrying the pathogen.The value q = 1 is used in a similar expression for the well-established Allee effect [7].We will see that only values of 0 < q 1 play an important role in the present context.As will be explained in Section 3, the sensitivity to q of the dynamics of the model for small q, leads us to focus on q > 0. Due to control measures, the incidence of infectives will taper off when the number of cases becomes relatively large.We achieve this in the model by introducing a factor G, which controls disease production.The reproductive term, as it depends on I, then has the form where r > 0 and K > 0 are constants.The parameter K plays the role of a 'carrying capacity' and is related to the density of infectives that triggers the declaration of a health infrastructure emergency.The nature of the function G, and various versions of it, has been developed in existing literature, see Section 4. Here, we take G(K, I) = 1 − I/K, which simply expresses the tapering off of new cases in response to a policy of limiting I for practical reasons.It would be expected that the healthcare system is under pressure well before I reaches the value K by timely implementing control measures, including isolation and quarantine of infected individuals.When the density I increases, its rate of growth is gradually depressed by the factor 1 − I/K, showing the reduction of disease production due to control strategies.
We include in the model a constant increase of I due to immigration of infected individuals, h > 0, and a constant removal rate, μ > 0. Combining these with (1) gives us the differential equation The positive immigration rate precludes the extinction of the disease, preventing the origin from being an absorbing equilibrium state.Furthermore, Equation (2) either has one positive (globally) stable equilibrium or three positive equilibria (two stable and one unstable in between), the former case occurring in particular for h > 0 and relative small values of μ, r, and q.If h = 0, then Equation ( 2) is the same as Equation ( 4) in [7], which describes the attenuated Allee effect, with intensity described by the parameter q, and the process I has an attracting equilibrium at zero, made into a reflecting boundary in that model.The deterministic Equation ( 2), with a stable equilibrium, cannot be a model for the sort of data we see in Figure 1.However, it turns out that from a well-motivated stochastic version of the model ( 2), the resulting stochastic dynamics do produce sample paths with the desired features.
We will use the potential function, U(I), associated with Equation ( 2), as an aid in understanding the system dynamics in the next Section.It is obtained by solving the equation dI/dt = −dU/dI, which gives us Potential functions in classical mechanics describe the potential energy levels of a body at different positions with respect to a force field.In biology, the system dynamics plays the role of a force field.We follow the suggestion made in [4] of using the potential function to investigate the time that the population of active infectives, in the deterministic model ( 2), needs to reach a particular value, say Î, starting at a level I 0 and with Î > I 0 .This is given by , where = d/dI.For instance, if the integral is positive and the potential function U is nearly flat between I 0 and Î, then the solution to the differential Equation ( 2) takes a long time to reach the value Î, starting at I 0 .We use this in what follows.

The stochastic model
We will focus on randomness originating from sources external to our system, called environmental noise in ecology.We consider the parameter r in Equation ( 2) to be random, with mean m and variance σ 2 , and the other parameters to be constant.External sources of stochasticity would include erratic changes in human behaviour (instability of behaviour of the population when knowledge of the value of I is available), shifting disease prevention policies, uncertain effectiveness of vaccines, and random, minor but cumulative, variations in pathogen strains.Occasionally, these variations lead to dominant strains emerging at random times.We assume that the combined effect of the sources of stochasticity is that of adding independent small random quantities, with variability proportional to the size of the reproductive term, itself a function of I. Following [7], let us redefine I = I(t) as a diffusion process that satisfies a stochastic differential equation with drift coefficient taken from the deterministic process defined by (2), and the diffusion coefficient given by √ V(I), where the variance Omission of the terms h and μI from V(I) corresponds to the assumption that the variation in h and μ is relatively minor.Accordingly, our diffusion process satisfies the stochastic differential equation where dW represents an increment of a standard Brownian motion.Each increment of the stochastic r-term is represented by the deterministic (drift) increment plus a stochastic increment with zero-mean normal distribution with standard deviation proportional to the instantaneous mean of the term in the drift process (for details, see [8], where this argument is detailed for the SI and SIR epidemic models).We notice when I is small, the diffusion coefficient in ( 6) is small, and so is the drift coefficient.Only the small immigration rate, h, keeps the process 'alive'.Otherwise, zero would be an absorbing state.

Results
Simulations of the process I, defined by a discretization of (6), were obtained by using the Euler-Maruyama scheme, [9], see Figure 2. The parameter value m = 0.1 in the three panels was chosen in agreement with estimates from real data for Covid-19, [12], 1/μ is on the average equal to 10 days.The parameters K = 1000, q = 0.05 and h = 0.1 were chosen arbitrarily, but reasonably in view of the analysis in Section 3. We emphasize that we have not tried to fit our model to any particular data.These parameters are the same for the three panels of Figure 2, but σ , which controls the overall external noise level, has values σ = 0.15 for panels (a) and (b), and σ = 0.06 for panel (c).There is only one positive global stable equilibrium of the deterministic model, located at I * ≈ 243, and shown in dashed lines in all the panels.In the top and centre, we observe that a relatively large level of external noise, σ = 0.15, produces realizations of the process that appear to spend long times at low densities.When we reduce the value of σ to 0.06, the stochastic path moves to a neighbourhood of the attractive equilibrium I * , and oscillates around it, as shown in panel (c).We come back to this point in Section 3.2, where we look at this same effect in terms of expected first passage times.We will see that larger σ produces longer expected first passage times of I(t) across levels of I.In particular, for larger σ , the paths for I(t) spend more time near I = 0.
To help understand the behaviour of the simulated paths, we plot in Figure 3 the drift coefficient function M(I) and the diffusion coefficient function √ V(I), together with the potential function U(I) obtained in Equation ( 3) for the deterministic model with q = 0.05.In the top panel, the drift coefficient M(I) is positive if I < I * and negative if I > I * .On the whole range, the diffusion coefficient √ V(I) increases nearly linearly from zero, for σ = 0.15 and for σ = 0.06.By looking at the behaviour of the paths in Figure 2(a,b) together with Equation ( 6), for small values of I, say below I = 30 on the scale used, we see that the small jumps in both directions due to the noise in the discretized simulation appear to just balance the also small effect of the drift term.After some random time, there is enough of a positive excursion that, arguably, the positive Allee effect, together with the increased random jump size, causes the instigation of the next outburst, in our epidemic context.When the path reaches the upper half-range between zero and I * , the jumps may become large enough to push the path beyond I * , where the curve for M(I) is negative and steeper.This means that the path may reach a peak above I * and then turn down abruptly, as seen in Figure 2.
We can look at the potential function, U(I), to help visualize the dynamics: for small values of I the slope of U(I) is low, meaning that the path will linger in that range, a behaviour that is maintained with the introduction of stochasticity, as described above.If the path is eventually pushed beyond I * , where the slope in the 'bowl' of U is steep towards I * and the I-increments large, see Figure 3, it is likely that the path will have large negative increments.
If the external noise is reduced, σ = 0.06, orange dashed-dot line, then the diffusion coefficient is reduced by more than a half.This means that the increments of the discretized process at the ranges of low I are smaller, allowing the drift, i.e. the deterministic dynamics, to be more dominant.Hence, the path will reach the vicinity of the equilibrium I * faster and will continue fluctuating around it, as observed in Figure 2(c).This result may be counterintuitive, in that we might expect larger σ to produce larger values of the density-of-infectives process, I.

Stationary distributions
In Figure 2, we see that the paths of the process I, for values of q close to zero and a relatively large value of σ , spend relatively long periods of time at low levels and little time close to the equilibrium.Figure 3 has helped us to understand the features of Figure 2. In particular, for small σ , the paths are likely to fluctuate around the equilibrium point, which is not the observed behaviour.Next, we compute the stationary density, denoted by f, for the process I(t).The function f provides us with information about the proportion of time the process would spend in various states, if it were observed for a long period of time.This distribution is obtained from the Kolmogorov forward equation, where M and V are defined in ( 4) and ( 5).The origin acts as a reflecting boundary because of the positive immigration, see [6,13] for detailed treatments.If f exists, it must satisfy the relation obtained by setting the time derivative in the Kolmogorov equation equal to zero, from which (obtained by using Wolfram Mathematica), where C is the normalization constant and 2 F 1 (a, b; c; z) is the ordinary hypergeometric function, which converges for |z| < 1 and if c is not a negative integer.We saw in Section 2 that with the drift and diffusion coefficient functions plotted in Figure 3 and for values of q close to zero and the larger value of σ , the sample paths in Figure 2(a,b) spend relatively long periods of time at low levels and less time close to equilibrium.For small σ , the paths tend to fluctuate around the equilibrium point, I * .To relate these observations to the stationary law f, defined in (7), of the process I(t), we compute and plot it Figure 4, for q = 0.05, Figure 4(a), and q = 0.1, Figure 4(b), for noise levels σ = 0.15 and σ = 0.06.We observe, despite the difference of scales, that for larger q the mean of f is larger, for both values of σ , indicating that the process I(t) spends more time at higher values, as expected from the simulations shown in Figure 2.
The Allee effect, the sharp increase in the relative rate of new infections, for very small q, being modulated as q increases, has its effect on sample paths when I is small, but its reflection in the stationary law would be difficult to guess without the computations and plotting.
Notice in Figure 4(a,b) that the smaller noise causes the normal-looking stationary distribution (orange) to tighten around the equilibrium point (of the deterministic model, (2)).With the larger noise (blue) the stationary density looks exponential-like when q is very small but flattens toward a more uniform density when q is larger.Figure 4(c), showing the potential function U(I) for several values of q, reveals that decreasing q causes the equilibrium, i.e.where dU/dI = 0, to decrease rapidly.This means that the influence of the deterministic part of the model in the dynamics, pushing the paths towards I * , becomes larger as q increases.In Figure 4(d), we show, for completeness, what happens with q = 0 and the other parameter values as in Figure 2(a,b).In this case, the deterministic equilibrium, I * , is severely reduced to about 31 and, correspondingly, the sample path varies among small values of I, and outbursts are unlikely.

First passage times
Here, we show in terms of expected first passage times across levels of the density of infectives, I(t), that our model produces larger first passage times with larger noise coefficient, σ .We saw this rather surprising effect in Figure 2, and understood it in terms of stochastic path dynamics at the beginning of this section with the help of Figure 3.
The first passage time, T x , is the time at which the process I(t) first reaches the value x, starting from I(0) = I 0 .The threshold, x, could be the point at which the density of active cases reaches epidemic status.Thus, T x is defined as Since the model, (6), is stochastic, T x is a random variable.The expected value of T x is known as the mean first passage time.In terms of the initial state I 0 < x, it can be expressed as in [6], where f is the density function of the stationary distribution and V is defined in (5).In Figure 5, we compare the functions E(T x ) for several values of I 0 , for two noise levels, σ = 0.15 and σ = 0.06.We observe that mean passage times are greater, i.e. passages through levels take longer for the higher noise level than for the lower one.This observation is consistent with earlier results of Sections 2 and 3.More noise, i.e. greater σ , causes the process I(t) to spend more time at low levels, as seen in Figures 2 and 4(a,b).

Remaining questions
We believe that this is the first paper to introduce a one-dimensional stochastic model of this type to describe an endemic process with long-term low values interspersed with outbreaks.While the model is, we believe, well-motivated and produces essential features of the data for our suggested numerical parameter values, challenging problems remain.It is not apparent, for example, how numerical values of the interacting parameters K and q might be extracted from data.In fact, the vertical scale of our sample path plots, which is sensitive to K and to q, is in units of 'density of infectives' which can be related to the scale of data only in a particular example.The more specific interpretation of the parameters and ideas for extracting numerical parameter values are remaining questions.Our model does not explicitly include environmental pressure due, for instance, to meteorological or social factors, such as periodicities in weather or activity schedules.The dependence of the results on the parameters K, h and μ in (6) remains to be considered.

Conclusions and discussion
There are many other useful epidemic models which are constructed to answer various specific questions.Here, our question is can we understand the common form of pandemic data in different states and nations, showing numbers of cases as a function of time over long time periods?As an answer, we introduce a novel stochastic epidemic model which, we argue, successfully incorporates many contributing factors.In order to be useful for control or prediction, much work would remain: to formulate specific further questions, to gather specific data, and estimate parameters.For example, if we ask how a period of low infectives can be extended, we could infer from this model, or indeed use 'common sense', to see various answers.
Our model replicates a striking characteristic observed in much data, namely, that the progression in time of active cases can remain at low levels for relatively long periods of time before (and after) relatively large outbreaks.Now let us attempt to explain more explicitly how our stochastic model produces the randomly-timed pattern of outbursts and quiet periods seen in data from real long-term epidemic records, and how the dynamics of the model correspond to what we know about this biological example.
Long periods at low levels of active cases appear in our model, and also, arguably, in our example, because of the interplay of two factors: (1) new active cases are produced at an increasing but moderate rate (small drift), and (2) the diffusion coefficient in (6), which includes I as a factor, produces fluctuations of relatively small size.Although it might be expected that stochasticity introduces the possibility of disease extinction, positive immigration, h > 0, prevents it.Hence, the process fluctuates and remains at low levels for random periods of time.
Next, we describe in more detail how our model is motivated by what we have learned in Covid times about epidemic realities and show how tightly our model follows the known sources of the random highs and lows observed in real data.The deterministic skeleton of the model, (2), adapts ideas from the equations that define an attenuated Allee effect, [7], where the rate of increase is regulated by a relatively small parameter, q.Specifically, in our epidemic context, we assume that the relative rate of change of the density of infectives is proportional to the density of infectives raised to a power q > 0. The justification in the epidemic example, behind this model assumption, is that an increased density of infectives favours the amplification of the disease.In our epidemic example, this could be due to larger quantities of respiratory aerosols and droplets carrying the pathogen in the air [15].As a consequence, occasionally at random times, even when variance is low, a new strain produced by accumulating mutations or other random events leads to increased levels of infectives.Then, increased noise may launch an excursion of the epidemic process past the equilibrium level.The situation can be exacerbated in our example at a low count of active cases by a decreased perception of risk among authorities as well as the general population, which leads to reduced physical proximity.This seems consistent with data collected worldwide for Covid-19, where an uptick in new cases is often observed after control policies are relaxed and when new strains appear.Subsequently, when a new outburst threatens health systems, control measures are increased, the probability of transmission is reduced, and active cases start decreasing.This is an important aspect manifested in the dynamics of model ( 6), namely, when I > I * , the function M(I) is negative and √ V(I) is large, which brings the value of the process quickly downwards.
We point out that the assumption that the coefficient r in (1) is random is consistent with [12], Section 2, where they plot the value of r (for their logistic model) from several data sets.Their plots suggest that r is random with a fairly stable mean and a variance which might be compared with our values of σ 2 .Their generalized model, Equation ( 14) in [12], contains an exponent α which plays the role of our 1 + q.They simulate their main model and compare with data only for cumulative curves, which conceal the details of the dynamics.
We emphasize that the model presented here assumes a large population consisting almost entirely of susceptibles and fewer infectives.This allows us to neglect the reduction in the susceptible class as infections occur, an important consideration in many epidemic models, [5].Our model could also represent the spread of a disease that does not confer immunity after infection.Notice that the rate (1) is reminiscent of the standard incidence term rIS/N, widely used in modelling the spread of infectious diseases, with S denoting the size of the susceptible class out of a total population of N [2].Generalizations of this incidence to rI q S p and rI q S/(1 + αI p ) are studied in [10,11], where it is shown that these forms give rise to rich dynamics of deterministic epidemic models.
In a typical stochastic dynamic model, the stochastics moves the path around the state space between neighbourhoods of the attractive equilibrium points of the corresponding deterministic process.An aspect of the stochastic dynamics in our model is that the process-dependent noise repeatedly moves our sample paths from a (positive) neighbourhood of 0 to a neighbourhood of the single attractive equilibrium, I * , and back again to the neighbourhood of 0. Here, the dynamics is such that 0 is not a stable equilibrium, but never-the-less a positive neighbourhood of 0 acts as a quasi-stable region.
In their Epilogue, Fred Brauer, Carlos Castillo-Chavez and Zhilan Feng state that one of their objectives in their 2019 authoritative book 'Mathematical Models in Epidemiology' [3], is to highlight the communality of modelling approaches in the fields of ecology, epidemiology, and population biology.We would like to express our enthusiasm for this objective and our confidence that this communality extends to stochastic modelling.

Figure 1 .
Figure 1.Top: Estimated active cases of Covid-19 in Utah between 1 March 2020 and 23 March 2022.We observe that the number of active cases switches from low to relatively high, then low for a period of time and high again.The scale obscures daily random fluctuations.Data from the Utah Department of Health[14].Centre and bottom: Estimated active cases of Covid-19 in other locations.We observe a similar phenomenon at different scales (centre).At the same scale (bottom), we see the appearance of outbreaks at different times.Pictures taken from[16].

Figure 2 .
Figure 2. Panels (a) and (b) are realizations obtained from simulating Equation (6) (blue) with σ = 0.15; panel (c) shows a realization with σ = 0.06.The smoothed simulated data appears in red.With the parameter values for m, K, μ, q, h used in these simulations there is only one equilibrium point for the deterministic model (2), I * = 243.34,which is positive and stable (dashed lines).When σ = 0.15, the path spends considerable time at low levels of active infectives, occasionally rising to an 'epidemic' level above I * .In panel (c), in contrast, for the relatively small value σ = 0.06, the path shown raises to and fluctuates around the equilibrium.The parameter values are m = 0.1, K = 1000, μ = 0.1, q = 0.05 and h = 0.1.

Figure 3 .
Figure 3.The top panel shows the drift coefficient function, M(I), in blue and the diffusion coefficient function, √ V(I), in orange, for σ = 0.15 (continuous) and σ = 0.06 (dashed-dot); other parameter values are as in Figure 2, in particular q = 0.05.The bottom panel shows the potential function obtained from the deterministic part, Equation (3).The dashed lines show the position of the equilibrium point, I * ≈ 243, for Equation (2).The drift is positive for I < I * and negative for I > I * .The potential function, U(I), is nearly flat near the origin.

Figure 4 .
Figure 4. Plots of the stationary law, f (I) of (7) for combinations of q and σ .In (a) q = 0.05, in (b) q = 0.1.In both, σ = 0.15 solid curves (blue) and σ = 0.06 dashed curves (orange).The other parameters are as in Figure2.When σ decreases, the probability mass moves right for both values of q, accumulating around the equilibrium point.When σ = 0.15, and when q increases, we see that the probability mass moves to the right, to higher values of I. (c) We see the potential function, U(I), for different values of q.In the region between I = and I * , U(I) has a larger negative slope as q increases.(d) A realization of the process, with σ = 0.15 and q = 0.The equilibrium is located relatively low, I * = 31.6,with the dynamics likely remaining at low levels, see the corresponding U(I) in (c).

Figure 5 .
Figure 5. Mean first passage times for the process I(t) with different initial values, I 0 .The values for m, K, μ, h, and q are the same as in Figure2.In (a), σ = 0.15, in (b), σ = 0.06.Comparing the two panels, we observe that with higher levels of noise, (a), the process stays, in the mean, for longer times at low densities, whereas for the lower level of noise (b), the process reaches larger values faster.