Individual risk-aversion responses tune epidemics to critical transmissibility (R = 1)

Changes in human behaviour are a major determinant of epidemic dynamics. Collective activity can be modified through imposed control measures, but spontaneous changes can also arise as a result of uncoordinated individual responses to the perceived risk of contagion. Here, we introduce a stochastic epidemic model implementing population responses driven by individual time-varying risk aversion. The model reveals an emergent mechanism for the generation of multiple infection waves of decreasing amplitude that progressively tune the effective reproduction number to its critical value R = 1. In successive waves, individuals with gradually lower risk propensity are infected. The overall mechanism shapes well-defined risk-aversion profiles over the whole population as the epidemic progresses. We conclude that uncoordinated changes in human behaviour can by themselves explain major qualitative and quantitative features of the epidemic process, like the emergence of multiple waves and the tendency to remain around R = 1 observed worldwide after the first few waves of COVID-19.


Introduction
Self-initiated behaviours of individuals aware of an external risk are major determinants of the course of epidemic outbreaks [1]. Historical examples abound where epidemic spread has been inhibited without the explicit adoption of imposed institutional measures [2]. For instance, the total incidence of the Ebola outbreak in West Africa in 2014 was well below the predictions of models which included a variety of containment measures [3] but which lacked the effect of uncoordinated changes in public behaviour [4]. More recently, some models have attempted to predict the progression of COVID-19 with little success [5,6]: R (removed). Contagion does not necessarily require direct contact between individuals (e.g. as in airborne diseases). RTP is assigned to each individual i in the form of a time-dependent probability p i (t),i ¼ 1, . . ., N, that controls the frequency at which the individual becomes exposed to the risk of contagion. Also, each individual is assigned a fixed basal RTP, p 0 i , which stands for the maximum value p i (t) can reach.
In general terms, RTP is a complex, subjective and multi-factorial personal attribute. In the present model, we associate this variable with the individual's reaction to having recognized a risky epidemiological situation, determining the frequency of exposure to risk. RTP results from a combination of voluntary actions and unavoidable facts. It effectively encompasses deliberate participation of social activity, immersion in populous environments, as well as the effect of protective behaviours like wearing a mask, or non-pharmaceutical measures like avoiding physical contact and crowded closed spaces. There is also a small but unavoidable probability of becoming exposed to risk due to factors that cannot be under strict personal control, such as sharing common spaces with or receiving help from other individuals. This latter effect keeps the probability of exposure above a small but positive value. Overall, p i (t) can be interpreted as a quantity proportional to the individual's effective number of social contacts at a given time, resulting from a wide variety of concurring factors.
In our model, the evolution of individual RTP is driven by two mechanisms, which act in opposite directions. Individual RTP decreases with increasing epidemic prevalence and grows when the individual is exposed but not infected. Growth of p i (t) when exposure has not caused infection represents a reinforcement of the individual's confidence on the level of safety during exposure, leading to the relaxation of protective behaviour. This perception strengthens the tendency to risk taking, in the same way that previous gains in repeated gambling have been shown to increase the likelihood of betting [36,37]. Generically, prior success reinforces the feeling of a higher chance of subsequent success [38]. We implement the evolution of the individual time-dependent RTP p i (t) as described by the following rules: (i) First, as a response to risk perception, the RTP of susceptible individuals is uniformly inhibited, and p i (t) decreases to p 0 is the fraction of infected population (number of infected individuals divided by total population size). The positive parameter ω weights how fast RTP is inhibited, and thus quantifies the risk-aversion response to the epidemic. If the RTP drops below 0, it is reset to p 0 i (t) = 0.025ξ i (t), with ξ i (t) a random number drawn from a uniform distribution in (0, 1). We thus have (ii) Second, with a probability given by the new RTP value p 0 i (t), each susceptible individual is effectively exposed to the risk of infection. Then, (a) with probability q(t) contagion does occur and the individual moves from the susceptible to the infected class; (b) with the complementary probability, 1 − q(t), the individual does not become infected and remains in the susceptible class. In this case, p 0 i (t) increases to p i (t + 1) = p 0 i (t) + ρη i (t). The positive parameter ρ controls the speed at which risk aversion subsides, and η i (t) is a random number drawn from a uniform distribution in (0, 1). This random increase accounts for the multiple factors that make p i (t) relax back to the basal RTP. If, due to this relaxation, p i (t + 1) overcomes p 0 i , we reset p i ðt þ 1Þ ¼ p 0 i , so that the basal RTP is an upper bound for p i . In other words, Note that the infection probability per time step upon exposure has been chosen to coincide with q(t), and not just proportional to it, exploiting the fact that any proportionality constant could be absorbed by a redefinition of the time step duration or, in other words, of time units. A different choice would just entail a rescaling of all the probabilities involved in the algorithm. We stress that contagion is here not royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 211667 due to a contact event between agents, but is an individual stochastic process governed by the momentary infection prevalence all over the population.
For a disease with a typical infectious period λ −1 , the fraction of infected population q(t) is approximately equal to the average number of individuals who became infected during an interval of λ −1 time steps. In our model, since the downward variations of p i (t) are proportional to q(t), the value of RTP after a period of decrease results from an integral of the infection incidence along that period or, in other words, from the time average of q(t). This quantity is much like the measures that seem to represent better the current state of the COVID-19 pandemic, namely, weekly or biweekly running averages of incidence. After considering several other variables (such as the number of new cases, excess mortality, number of hospitalizations or fraction of tested population), biweekly averages have emerged as a stable and representative measure [39], thus supporting our choice to couple individual reactions to q(t).
As for the initial conditions, at t = 0, the number of susceptible, infected and removed individuals in the population is N − 1, 1 and 0, respectively. Basal risk-taking propensities p 0 i are drawn from a uniform distribution in (0, 1), and we take p i ð0Þ ¼ p 0 i . Moreover, unless otherwise stated, λ = 0.05. The resulting average duration of the individual infection is λ −1 = 20 time steps. Figure 1 illustrates the characteristic dynamics of p i (t), and the transitions between epidemiological classes, which correspond to those of an SIR model. The stochastic transition from S to I depends on p i (t), while the transition from I to R occurs with constant probability λ, as in traditional SIR dynamics.

Individual response inhibits epidemic propagation and induces infection waves
Under the rules described above, a collective response of the population emerges. We define PðtÞ ¼ 1 SðtÞ  i . At each time step, individuals inhibit their RTP in an amount proportional to the level of external risk, which is given by the fraction of infected population, weighted by a parameter ω. If, upon exposure to the risk of contagion, they do not become infected, their RTP relaxes back to its basal value, controlled by a parameter ρ. (b) Exposure to risk of susceptible individuals is stochastic, so that the transition from susceptible to infected occurs with variable probability. Infected individuals are removed with probability λ.
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 211667 For ω = ρ = 0, the individuals do not respond to the progress of the infection, and maintain their basal RTP, regardless of the threat. The epidemic spreads, preferentially affecting individuals with higher p 0 i , until it reaches a peak and its incidence begins to decline. A single epidemic wave, comparable to that of a standard SIR model, results. However, when the response of the individuals is turned on (ω, ρ > 0), the generalized decrease in RTP causes a lower peak, as shown in figure 2. Consequently, the fraction s(t) of susceptible individuals when the outbreak finishes is higher than without population response. Moreover, the value of P(t) at the end of the epidemic is also higher, suggesting that a transient inhibition of RTP protects individuals with higher basal RTP.
As a function of ω and ρ, one, two or more infection waves can occur. The mechanism behind multiple wave generation can be understood by means of a mean-field approximated representation of the stochastic model, discussed in a later section. Figure 3 illustrates two situations with three (ω = 0.05, ρ = 0.25) and nine waves (ω = 0.05, ρ = 1). In fact, other parameters being equal, infection waves become more numerous as ρ grows. The fraction of susceptible individuals s(t) decreases through a series of steps produced by the succession of infection waves. The amplitude of waves becomes progressively smaller, and their period grows monotonically. Depending on the parameters, as the number of waves grows, the final decrease of q(t) slows down, and a shoulder or plateau of variable length develops.
Concomitantly with the infection waves, the average RTP P(t) oscillates in a series of inhibitionrelaxation cycles. After an abrupt drop caused by the first wave, P(t) returns to higher values as the remaining susceptible individuals increase their RTP towards the basal levels. Superimposed to the royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 211667 waves, this relaxation occurs over an emerging slow time scale (of the order of a few thousand time steps in figure 3c). Finally, P(t) stabilizes at a fixed level until all infected individuals are removed and the epidemic becomes extinct at a time t ext , which grows with ρ. Our numerical simulations show that the average RTP and the fraction of susceptible population at the time of extinction, P ext and s ext , do not vary appreciably with ρ when the number of infection waves is two or larger (ρ > 0.1). This is illustrated in the insets of figure 3.

Multiple infection waves lead to R = 1
The effective reproduction number of an epidemic, R(t), is defined as the average number of secondary infections caused by a single infected individual over the course of the infectious period [40]. In our simulations, considering that the infectious period is given by λ −1 , R(t) can be given a step-by-step estimation as the ratio between the fraction q n (t + 1) of newly infected individuals at time t + 1 and the fraction of infected population at time t, times the average duration of the infection, that is The dynamics of our numerical estimation for R(t) in an illustrative example are displayed in figure 4a. Oscillations around R ≈ 1, which turn out to be synchronous with those of P(t) and q(t), have a large amplitude at short times due to the influence of the initial condition. At later times, R(t) displays increasingly stronger, seemingly random fluctuations, when the number of infected  royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 211667 individuals declines (compare figure 4a with figure 3b). However, the time average of R(t) along the whole dynamics (from t = 1 to the time of extinction t ext ), has a definite value which approaches one as ρ increases and infection waves become more numerous. This trend is shown in figure 4b,c. For small ρ (to the left of the dashed vertical line), the infection ends after a single wave, and the value of R varies strongly between different realizations. As more waves develop, on the other hand, R is less disperse and, at the same time, it approaches one for increasing ρ. This robust behaviour towards R = 1 does not hold if the population ignores the external risk. If ω = ρ = 0, in fact, there is always a single infection wave that ends once R(t) has dropped below one and the number of new infections is not able to compensate the removal of infected individuals. In this case, the final value of R strongly depends on the recovery rate λ.

Epidemic shapes the collective risk-taking propensity profile
The heterogeneity of RTP over the population manifests itself along the epidemic outbreak. The probability that an individual becomes infected depends on basal RTP and on momentary risk exposure. The first infection wave preferentially affects those individuals with a high value of p 0 i . The rapid progress of the outbreak causes a collective inhibitory response, but it still generates a high number of contagions and also affects individuals with RTP well below average. royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 211667 Figure 5a shows the average RTP of individuals infected during each wave for an epidemic with many waves. The quantities P and P 0 plotted in the figure are, respectively, averages of p i (t) and p 0 i over all the individuals which, during the interval elapsed between two consecutive minima of q(t), have undergone the transition from susceptible to infected. The momentary RTP p i (t) taken into account to compute the average was recorded at the moment of contagion. The resulting values of P and P 0 are plotted at the time of minimal q(t) just after the corresponding interval. Overall, the average basal RTP P 0 of newly infected individuals decreases with each subsequent infection wave. On the other hand, P grows during the first few waves, and then decreases much as P 0 . The initial growth of P is a direct consequence of the fact that, at the very beginning of the infection, individual RTPs drop abruptly because of the fast growth in the fraction of infected population. Although individuals with higher RTP are more prone to undergo contagion, on the average, the population that becomes infected in the first few waves has a relatively low value of P. In successive waves, this effect weakens and the average RTP of the susceptible population grows, which leads to contagion with higher values of P. Progressively, however, individuals with high basal RTP are removed from the susceptible population and the momentary RTP diminishes accordingly. The average P thus starts decreasing, and eventually mirrors the behaviour of P 0 .
The preferential infection of individuals with high RTP leads to a collective process of selforganization. This result is demonstrated by the distribution royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 211667 susceptible population, once the epidemic becomes extinct. Our simulations reveal that Hðp 0 i Þ depends on the value of ρ. Figure 5b shows the numerical estimation of this distribution for epidemics with an increasing number of waves. The selection of individuals with small values of p 0 i in detriment of those with high basal RTP is apparent. The profiles of Hðp 0 i Þ contrast sharply with the result for ω = ρ = 0. In this limit case, it can analytically be shown that Hðp 0 i Þ decays exponentially, as illustrated in figure 5c.

Mean-field model
The dynamical interplay of epidemiological variables and RTP that gives rise to the infection waves observed in the stochastic dynamics can be understood in terms of a simplified continuous-time mean-field model, in the form of a system of ordinary differential equations which extend the standard SIR mean-field model. These equations read _ sðtÞ ¼ ÀPðtÞsðtÞqðtÞ, _ qðtÞ ¼ PðtÞsðtÞqðtÞ À qðtÞ and _ PðtÞ ¼ ÀwPðtÞqðtÞ þ rPðtÞ½1 À qðtÞu½P 0 À PðtÞ, where dots indicate time derivatives. As before, s(t) and q(t) are here the fractions of susceptible and infected individuals, respectively, while P(t) represents the average RTP of the susceptible population.
In the first two equations, we recognize an SIR model with contagion rate P(t). The infection frequency of the standard SIR model, namely, a constant coefficient which would multiply the product s(t)q(t) on the right-hand side of the two first equations, has been absorbed by P(t). Hence, in contrast with the stochastic model, P(t) is here not necessarily limited to the interval (0, 1). Moreover, a rescaling of time allows us to fix the removal frequency, corresponding to the probability λ of the stochastic model, to unity. We stress that equations (4.1) are not a continuous-time limit of our stochastic evolution rules. They have been proposed on the basis of 'macroscopic' arguments similar to those invoked in the mean-field formulation of the standard SIR model, with the aim of capturing qualitatively the same underlying mechanisms as the stochastic dynamics. The first term on the right-hand side of the third of equations (4.1) stands for the decrease of P(t) weighted by the fraction of infected population. Instead of the random threshold which avoids that the individual propensities reach non-positive values in the stochastic model, in the mean-field version the decrease of the average RTP is proportional to P(t) itself, which ensures that P(t) remains always positive. In the second term, which represents the RTP growth, the Heaviside step function θ[P 0 − P(t)] describes the saturation of P(t) to an upper basal average value P 0 . The frequencies w and r respectively control the rates at which P(t) decreases and increases. They are the continuous-time counterparts to the parameters ω and ρ of the stochastic model.
In the mean-field model, the product R(t) = P(t)s(t) is the epidemiological effective reproduction number. In fact, it is clear from the second of equations (4.1) that the infected population grows (_ q . 0) or shrinks (_ q , 0) depending on whether the product P(t)s(t) is respectively greater or less than the critical value R c = 1. When R(t) = R c , we have _ q ¼ 0, and q(t) attains a maximum or a minimum. When P(t) < P 0 , likewise, there is a critical value of q(t), at which _ P ¼ 0. For q(t) greater or less than q c , P(t) respectively decreases or increases. This coaction between the fraction of the infected population and RTP is the reciprocal mechanism that induces the oscillatory behaviour of our system, as we explain in the following. The whole process is illustrated along a few infection waves in figure 6, where roman numerals indicate the key events in an infection cycle.
Suppose that, as in the stochastic model, the initial state of the population mostly consists of susceptible individuals, with a small infected fraction. We also assume that, initially, P(t) coincides with the RTP basal value, P(0) = P 0 . If the initial reproduction number R(0) = P 0 s(0) exceeds one, the infection spreads and q(t) begins growing. At the same time, if q(0) < q c , P(t) would tend to increase but, as it is already at its saturation value P 0 , it initially remains constant. However, if the susceptible fraction is large enough, q(t) will eventually reach and overcome q c , at which point P(t) begins to decrease. Since s(t) is steadily decreasing as well, the effective reproduction number R(t) progressively approaches R c from above. At the moment when R(t) drops below R c , the infected fraction attains a maximum and begins to shrink, while P(t) and s(t) keep decreasing. This continues until q(t) attains q c again, now from above, and consequently P(t) reaches a minimum and begins to grow. If the royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 211667 susceptible fraction is still large enough, R(t) can overcome R c again. The infected fraction begins growing and the cycle restarts.
This explanation emphasizes the fact that the susceptible population is the 'fuel' that keeps the infection cycles going. Indeed, a sufficiently large number of susceptible individuals is necessary to make both the infected fraction and the effective reproduction number overcome their respective critical values q c and R c . Oscillations terminate when the susceptible fraction drops below a level such that either _ q or _ P, or both, cannot change their sign anymore. From then on, the infection proceeds monotonically towards extinction.
Although the final (asymptotic) effective reproduction number in the mean-field model is generally not equal to one, successive oscillations of decreasing amplitude around its critical value make R(t) become progressively tuned near R c . This is shown in figure 6 for just three infection waves but, as the number of oscillations increases, the final value of R(t) is expected to progressively approach one. This behaviour is in close agreement with our results for the stochastic model, as illustrated by figure 4a.

Discussion
The temporal variation of the effective reproduction number of an epidemic emerges from the interplay of multiple factors affecting the spread of the disease, such as the progressive depletion of susceptible individuals, seasonal changes in transmissibility, imposed modifications in the contact patterns among  . Vertical dashed lines with Roman numerals indicate, across the second infection wave, the events that control the occurrence of oscillations. I: q(t) crosses the critical value q c upwards, and P(t) begins to decrease from P 0 ; II: R(t) crosses R c = 1 downwards, and q(t) attains a maximum; III: q(t) crosses q c downwards, and P(t) attains a minimum; and IV: R(t) crosses R c upwards, and q(t) attains a minimum. In this particular example, after the third infection wave (t ≈ 75), the susceptible fraction is not large enough to allow R(t) to cross R c again, and the infection proceeds to extinction without further oscillations.
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 211667 individuals, and non-pharmaceutical control measures, among others [41]. The added effect of unsupervised responses to the perception of epidemiological risk can also play a main role in restraining epidemic progression. However, the quantitative relevance of the mechanisms that underlie individual RTP is not yet fully understood: it is believed that emotions may drive risk perceptions, sometimes more than factual information [7], and that behavioural responses to pandemics are primarily shaped by risk attitudes, and not so much by actual incidence or mortality [42]. Behavioural sciences have much to add to epidemiology, not only to improve public communication and to understand social reactions to information campaigns [7], but also to sort out the main determinants of risky behaviours. Regardless of the fundamental drivers behind individual responses to perceived risk, our results specifically hint at the possibility that such responses tune the effective reproduction number around its critical value R = 1. In our model, single infection waves are characterized by a unique transition from R > 1 at the beginning of the epidemic to R < 1 at later stages, as observed in many epidemic outbreaks [43]. However, as the number of infection waves increases, the tuning to the critical value becomes more effective. The number of waves increases when the relaxation of the RTP towards its basal (highest) value becomes faster, while the average incidence decreases. Independently of the final outcome of the epidemics, lower incidence levels should permit a more efficient management during its course, for instance, by avoiding saturation of the health system.
The emerging stabilization of R around its critical value is in good agreement with empirical observations of the progression of COVID-19 [44,45] (see some examples in figure 7). Despite local idiosyncrasies, such as the different timing at which institutional control measures are applied and lifted (or even in the absence of such measures), this stabilization holds broad and wide after a few epidemic waves. We hypothesize that it is the unsupervised reaction of individuals to the current epidemic state, after a short learning period and in a way analogous to the inhibition-relaxation response implemented in our model, that overall compensates for other mechanisms affecting epidemic spread.
It could be further argued that it is at the point when R ≈ 1 that non-pharmaceutical measures have to be enforced if the aim is to fully inhibit the epidemic, given that the population may spontaneously enter a relaxation period, thus halting the decrease of R. Actually, both self-organized responses and imposed inhibitory measures are typically relaxed as incidence waves are in their decreasing phase. Our model shows that these mutually compensating mechanisms (i.e. the decrease of incidence and the concomitant relaxation of protective measures, and vice versa) drive the epidemic to an intermediate royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 9: 211667 stage of low incidence, that will eventually give rise to a new wave if the fraction of remaining susceptible individuals is high enough. This result is in agreement with other models that identify infection waves as fragile states of transient collective immunity [25] that degrade as soon as incidence decreases and individuals recover their basal risk exposure levels. Epidemic models have a limited predictive ability due to the impossibility of implementing all nuances of the real world, but also to incomplete data that enhances their sensitivity to parameters [9]. Despite these issues, models do have an important explanatory ability, by clarifying the individual effect of each involved mechanism. For example, the simple addition of heterogeneity in the individual susceptibility to infection quantitatively changes the herd immunity threshold in the unique wave of the classical SIR model [46]. In this sense, our model has to be understood as a proofof-concept: it shows that the collective response of a population with heterogeneous and time-varying risk propensities can reproduce sustained infection waves and the empirical trend towards R = 1. Several variations of the model might be worth exploring, including the possible effect of local versus global information, the dependence of individual responses on different epidemic cues, or the role played by other sources of heterogeneity, as in the number of contacts. In this respect, an important open question is whether different models that point at human behaviour as the main driver of epidemic dynamics could be synthesized into a small number of generic mechanisms, and if emerging epidemic properties, independent of specific details, do actually exist. At present, such dynamical properties have been ascribed to heterogeneity in individual activity [12], to a delayed, awarenessdriven population response [11], or to changes in momentary RTP, as here suggested, among several other proposals that, however, demand an external regulation of response strategies.
Epidemics have profoundly changed human societies. How they affect social habits, and thus longterm cultural features, is well documented [47]. On the evolutionary time scale, epidemics should also impact basal RTP, considering that risk aversion emerges by natural selection if reproductive risk is correlated across individuals in a given generation [48]-a situation brought about by epidemics. The model introduced here may serve as a first quantitative approach to establish a relationship between epidemic characteristics and the strength of selection of risk-aversion profiles.
All authors gave final approval for publication and agreed to be held accountable for the work performed therein.
Competing interests. We declare we have no competing interests.