Memory-induced complex contagion in epidemic spreading

Albeit epidemic models have evolved into powerful predictive tools, most assume memoryless agents and independent transmission channels. We develop an infection mechanism that is endowed with memory of past exposures and simultaneously incorporates the joint effect of multiple infectious sources. Analytic equations and simulations of the susceptible-infected-susceptible model in unstructured substrates reveal the emergence of an additional phase that separates the usual healthy and endemic ones. This intermediate phase shows fundamentally distinct characteristics, and the system exhibits either excitability or an exotic variant of bistability. Moreover, the transition to endemicity presents hybrid aspects. These features are the product of an intricate balance between two memory modes and indicate that non-Markovian effects significantly alter the properties of spreading processes.

Albeit epidemic models have evolved into powerful predictive tools, most assume memoryless agents and independent transmission channels. We develop an infection mechanism that is endowed with memory of past exposures and simultaneously incorporates the joint effect of multiple infectious sources. Analytic equations and simulations of the susceptible-infected-susceptible model in unstructured substrates reveal the emergence of an additional phase that separates the usual healthy and endemic ones. This intermediate phase shows fundamentally distinct characteristics, and the system exhibits either excitability or an exotic variant of bistability. Moreover, the transition to endemicity presents hybrid aspects. These features are the product of an intricate balance between two memory modes and indicate that non-Markovian effects significantly alter the properties of spreading processes.

I. INTRODUCTION
Epidemic modeling has proven to be a powerful tool for the study of contagion phenomena in biological, social, and technological systems [1][2][3]. Variations of the benchmark susceptible-infected-susceptible (SIS) and susceptible-infected-recovered (SIR) models have provided valuable insights into the nature of spreading mechanisms, the dynamics of outbreaks, and the viability of containment protocols [4][5][6]. The inclusion of real-life contact and mobility patterns has yielded astonishingly accurate results, prompting the use of epidemic models as real-time predictive tools [7][8][9].
The canonical modeling scheme for contact-based contagion [1,6] assumes Markov processes and isolated transmissions. Markov processes are memoryless, which translates into exponentially distributed interevent times and enhances the mathematical tractability. The inappropriateness of this approximation, however, is widely supported by empirical evidence, such as the peaked distributions of infection periods of numerous diseases [10,11] or the bursty human activity patterns in social networks, well described by heavy-tailed distributions [12,13]. On the other hand, assuming isolated transmissions leads to infection channels that are not influenced by their local environment. Consequently, the infection likelihood can be written as the sum of statistically independent exposures. Nevertheless, there is evidence supporting the existence of more complex, nondyadic mechanisms, e.g., in fungal and bacterial pathogen colonization [14,15], and social contagion [16][17][18].
In recent years, an important amount of research has focused on overcoming these modeling limitations. Regarding memory, a wide array of modifications has been analyzed, such as two-step infection models [19,20], nonexponential distributions [21][22][23] and time-varying transmission probabilities [24,25]. Conversely, a plethora of * xrhoffmann@gmail.com † marian.boguna@ub.edu complex contagion schemes has been proposed to mediate the assumption of independent transmissions. Examples include correlated, nonlinear transmission channels [26,27], extended neighborhood effects [28,29], and deterministic threshold models [30,31]. So far, not much research has focused on tackling both assumptions simultaneously, and little is know about how these two features interact. Such a combined approach is of particular interest for contagion phenomena that include a social component, such as awareness and vaccination campaigns [32,33], or the spread of noncommunicable diseases (e.g., obesity, anxiety, and substance abuse) [34,35].
In this work, we develop an infection mechanism that is equipped with memory of past exposures to multiple infectious sources. A notion of social reinforcement/inhibition arises organically, and the concepts of non-Markovian dynamics and complex contagion become intrinsically coupled. We obtain analytic results for the SIS model and perform extensive stochastic simulations in random degree-regular networks. Our analysis reveals a sophisticated interplay between two memory modes, displayed by a collective memory loss and the dislocation of the critical point into two phase transitions. An intermediate region emerges where the system is either excitable or bistable, exhibiting fundamentally distinct behaviors compared to the typical healthy and endemic phases. Additionally, the transition to the endemic phase becomes hybrid, showing both continuous and discontinuous properties.

II. MICCSIS MODEL
Epidemic-like models are employed for a variety of dynamics, such as opinion formation, rumor spreading, and innovation adoption. Although we use the original disease-specific terminology throughout this work, the scope of our analysis extends to all of these fields.
The memory-induced complex contagion susceptibleinfected-susceptible (miccSIS) model describes a population of agents that can be either susceptible (healthy) or infected (infectious), and is embedded on an undirected, unweighted network. The contact network is encoded in the adjacency matrix, which is nonspatial (it carries no information about the agents' physical position) and static (it remains fixed over time).
Infected nodes have a constant infectivity rate, υ, and continuously spread doses of contagion towards their entire neighborhood. They target all of their neighbors equally, transmitting pathogen along each edge at constant rate υ. Susceptible nodes collect these toxins from all their neighbors, amassing a total viral load κ, and transition to the infected state with probability ψ * inf (κ)dκ, where ψ * inf (κ) is the infection probability density. Infected nodes are unaffected by the toxins (their viral load does not increase) and recover spontaneously after a random time t, with interevent time distribution ψ rec (t). At recovery, their viral load is completely erased and they become susceptible once again.
As in the standard SIS model, susceptible nodes whose nearest neighborhood is completely healthy cannot become infected. Since no active processes are associated to their state, they are irrelevant for the immediate evolution of the system. However, these inactive nodes play a crucial role in the long-term dynamics of the miccSIS model, therefore we assign them to an additional compartment, which we call dormant. A dormant node transitions to susceptible as soon as one of its neighbors becomes infected. Conversely, when the last infected neighbor of a susceptible node recovers, the latter transitions to the dormant state. At this point, the viral load it had previously amassed starts to deteriorate with relaxation time ζ, modeling its long-term memory. This feature mimics the restoring of an individual's immune system, or the gradual loss of interest of an opinion, idea, or trend (see Appendix A for a schematic overview). Overall, the system's evolution is determined by a set of discrete, stochastic, statistically independent processes (an infection for each susceptible node and a recovery for each infected node), which enables the use of the generalized non-Markovian Gillespie algorithm [27]. A key ingredient of this algorithm is the instantaneous hazard rate of an active process, ω(t), obtained from its interevent time distribution, ψ(t), and corresponding survival probability, Ψ(t) = ∞ t ψ(t )dt , as ω(t) = ψ(t)/Ψ(t). In short, ω(t) measures the probability per unit of time that the corresponding event takes place between t and t + dt [36]. For a Poisson point process, the interevent time distribution is exponential and the hazard rate is, therefore, constant. In general, interevent time distributions decaying slower (respectively, faster) than exponential lead to asymptotically decreasing (increasing) hazard rates (see Appendix B for more details).
In general, the infectivity rate, υ, the relaxation time, ζ, the infection probability density, ψ * inf , and the recovery interevent time distribution, ψ rec , may vary from node to node. For example, one could model distinct age groups by segregating the population and assigning different values of the parameters to each subpopulation. Notwithstanding, in order to eliminate the effects of node hetero-geneities, in the present work we use the same υ, ζ, ψ * inf , and ψ rec for all nodes. For this same reason, we limit our analysis to random degree-regular networks, particularly with degree k = 4.
Infections are governed by the versatile Weibull distribution, with shape parameter α and scale parameter µ, For α > 1 it presents a peak, resembling a bell curve, α = 1 corresponds to a Poisson distribution, and for α < 1 it has power-law-like fat tails. The corresponding instantaneous hazard rate is with z(t) the number of infected neighbors at time t (see Appendix B for a detailed derivation). With α > 1 (respectively, α < 1), ω inf (t) increases (decreases) monotonically with κ(t). Notice that when α = 1 we recover the customary expression of the standard SIS model, ω inf (t) ∼ z(t). For α = 1, however, Eq. (2) cannot be written as a linear superposition of independent transmission channels. Hence, the agents' memory induces a complex contagion scheme, even though the model does not explicitly incorporate any social reinforcement/inhibition mechanisms (see Appendix C for a detailed discussion). To further isolate the effects of the modified infection mechanism, we treat recoveries as Poisson processes with rate η, i.e., with constant hazard rate ω rec (t) = η. We define the effective spreading ratio, λ, as the average time required to recover over the average viral load needed to become infected, nondimensionalized by the infectivity rate, λ = υ t rec / κ inf . With the selected distributions we find an expression for the scale parameter, µ = ηλ(αυ) −1 Γ(α −1 ), with Γ the gamma function. Notice that, once again, α = 1 recovers the customary expression of the standard SIS model, λ = υµ/η (with infectious rate υµ per infected neighbor). As for the relaxation time of dormant nodes' viral load, we consider the limit cases of instantaneous decay, ζ = 0, and perpetual accumulation, ζ = ∞.
For illustrative purposes, consider the system depicted in Fig. 1(a), where all nodes are initially healthy except for D. Node C becomes infected at time t 0 and subsequently infects B at t 1 . During the interval t ∈ [t 0 , t 1 ], node A's viral load, κ A , grows with rate υ, but from t 1 onwards it will increase with rate 2υ. At t 2 , node C recovers and κ A reduces its accumulation rate back to υ, and when B recovers at t 3 , κ A starts to decay with relaxation time ζ. Finally, C becomes infected once again at t 4 and κ A resumes its growth at rate υ. Figs. 1(b-e) show the evolution of κ A and ω A for various ζ and α.
Hereafter we use temporal units such that η = 1 and, without loss of generality, set υ = 1.

III. SHORT-TERM MEMORY
We begin our analysis for ζ = 0, with dormant nodes instantly erasing their viral load. When the outbreak reenters their neighborhood, they become susceptible starting afresh (with κ = 0). Hence, the only memory effect present is during the infection period, which we interpret as a short-term memory mode.
Each node is uniquely defined by its state, infected or healthy, and its accumulated viral load. The state of a node only changes with the transitions i) infected to healthy, and ii) susceptible to infected. When a node transitions between susceptible and dormant, its state remains unaltered. On the other hand, the viral load i) is erased instantly when an infected node recovers, ii) increases proportionally to the number of infected neighbors while a node is susceptible, and iii) is erased instantly when a susceptible node becomes dormant.
In the Supplemental Material (SM), we derive exact stochastic evolution equations both for the state and accumulated viral load of each node. Applying a mean-field approximation, these equations can be reduced to a single expression for the late-time prevalence, ρ, i.e., the average fraction of nodes that are infected in the steadystate (see SM for a detailed derivation). If all nodes have the same degree k, this equation reads with a > 0. Roughly speaking, the first term corresponds to the recovery of infected nodes, the second term to the infection of susceptible nodes, and the third is related to susceptible nodes becoming dormant. The shape factor of the infection probability, α, and the effective spreading ratio, λ, are encapsulated in a and b, which become constant parameters when ρ ≈ 0. Linear stability analysis reveals that the healthy phase, ρ = 0, becomes unstable at a − b − 1/k = 0, yielding a phase transition at the critical point a c = b + 1/k. Moreover, the nature of the transition changes at a − bk = 0, yielding a tricritical point at a tc = 1/(k − 1), b tc = 1/k(k − 1). Then the steady state is endemic for a > a c , and the phase transition is continuous for b < b tc and discontinuous for b > b tc . Finally, the prevalence of the endemic phase scales as ρ ∝ (a − a c ) β , with β c = 1 when the transition is continuous, and β tc = 1/2 at the tricritical point. Figure 2 shows the phase diagram in terms of the original parameters α and λ. We observe that the critical point initially increases monotonically with α, but afterwards saturates for α → ∞. This result is consistent with the possibly largest epidemic threshold reported in [37]. On the other hand, the transition to endemicity is discontinuous for α > α tc ≈ 2.348. Recall that the infection probability density presents a peak for α > 1, and, in fact, tends towards a delta function in the limit α → ∞. Thus, a node requires a quasi deterministic amount of viral load to become infected, mimicking a threshold model, which commonly exhibits a discontinuous phase transition [38].
In order to verify these analytic findings, we perform stochastic simulations. We explore the position of the critical point, λ c , that separates an absorbing (healthy) phase (λ < λ c ) from an active (endemic) one (λ > λ c ). The simulations start well into the active phase with a fully infected population, and quasistatically decrease the control parameter, λ, until finite-size fluctuations trap the system in the absorbing state. We sample the latetime prevalence, ρ st = lim t→∞ N I (t)/N , of 10 4 states, time-averaged over various trajectories (see SM for simulation details). As shown in Fig. 3(a), λ c indeed increases with α, and saturates for large values of α. Moreover, for α < 1 the approach to the critical point is very similar to the standard SIS (α = 1), consistent with a continuous phase transition with exponent β = 1. On the other hand, for α = 4 the curves terminate at a remarkably high prevalence, consistent with a discontinuous phase transition. Finally, the curves for α = 2 also terminate at a rather high prevalence, which deviates from the analytic prediction. Nonetheless, since the apparent discontinuity decreases with the system size, this first observation is most likely related to finite-size effects. Additionally, α = 2 is close to tricritical point, thus a cross-over effect towards the exponent β tc is expected, explaining the rather steep approach towards the critical point. Overall, these analytic and simulated results indicate that the system's macroscopic properties are drastically affected by the microscopic details of the infection mechanism.

IV. LONG-TERM MEMORY
Next, we consider the case ζ = ∞, where a dormant node's viral load remains frozen until the outbreak revisits its neighborhood. Besides the short-term memory present during the infection period, nodes now possess an additional long-term memory mode that is capable of connecting very distant temporal points, causing the system to evolve in a highly nonlinear manner.
As before, the state of a node changes with the transitions i) infected to healthy, and ii) susceptible to infected. On the other hand, the viral load i) is instantly erased when an infected node recovers, ii) increases proportionally to the number of infected neighbors while a node is susceptible, and iii) remains constant while the node is dormant. We can write an equation for the state of node i (n i = 1 if its infected, n i = 0 if its healthy), and obtain the expected value in the steady-state (see SM for a detailed derivation). The resulting equation reads with z i = j a ij n j the number of infected neighbors, and a ij the elements of the adjacency matrix. Note that this equation is derived without implementing a mean-field approximation. Surprisingly, Eq. (4) is identical to the first-order equation of the standard SIS model [6] and, therefore, both the miccSIS and SIS models have the same mean field description. Thus, whenever the standard SIS dynamics is well described by its mean field approximation, as is the case for random degree-regular networks, the micc-SIS dynamics is also well described by the same mean field approximation. This equivalence demonstrates that the late-time prevalence is independent of α, and equivalent to the Markovian model. This result is verified by stochastic simulations, shown in Fig. 3(b). Indeed we observe that the late-time prevalence curves coincide for all values of α. A small deviation occurs in the critical region, caused by finite size fluctuations.
Compared to the short-term memory mode, for α > 1 (respectively, α < 1) the endemic phase is enlarged (shrunken) by the long-term mode. This phenomenon is explained by the monotonically increasing (decreasing) infection rate. When the outbreak revisits a dormant node's neighborhood, its previously accumulated viral load facilitates (hinders) reinfection, enabling (preventing) the outbreak to remain active in a wider range of λ. These results reveal that the additional long-term memory completely suppresses the effects of the short-term mode. Specifically, it causes individuals with virtually infinite memory to behave, on the aggregate, as if they had no memory at all. This collective memory loss consequently renders the system's macroscopic state unable to distinguish between agents' microscopic properties.
In order to elucidate these findings, we proceed with the analysis of patient zero scenarios: the arrival of an infected agent in a previously unaffected population. We employ the lifespan method [39], which simulates outbreaks starting from a single infected node. Each singleseed realization is characterized by its coverage, K, defined as the number of distinct nodes that have become infected at least once, and can be either finite or endemic. While the former return to the absorbing state, the latter evolve towards an active steady state. In finite systems, we introduce a coverage threshold, K th = c th N , with 0 < c th < 1. A realization is declared endemic whenever its coverage reaches the threshold, and those that terminate without surpassing it are considered finite. As reported in [39], the value of c th does not modify the qualitative results. Hereafter, we use c th = 0.75 For a fixed value of λ we run 10 4 realizations, each starting with a single, randomly chosen infected node, and a system cleared of all viral load. We measure the average coverage fraction,c = K /N , and the probability that a realization surpasses the coverage threshold, P 1 , which serves as a proxy for the true endemic probability, P ∞ , the probability, in the thermodynamic limit, that an outbreak becomes endemic (see SM for simulation details).
The results for peaked infection distributions (α > 1) are shown in Figs. 4(a,b), which include the previously computed ρ st , and the late-time prevalence of single-seed outbreaks that are able to become endemic, ρ * st (see SM for simulation details). We find that the average coverage,c, and the endemic probability, P 1 , coincide for all values of λ and present a continuous phase transition at λ c (c). However, this point is notably larger than the critical point of the late-time prevalence, λ c (ρ st ). While ρ st presents a continuous phase transition, ρ * st exhibits a discontinuous phase transition at λ c (ρ * st ) = λ c (c). As expected, after the abrupt jump the two prevalence curves overlap.
This evidences the existence of an intermediate region where all single-seed outbreaks return to the absorbing state, whereas fully infected populations evolve towards an active steady state. The key ingredient to understand this phenomenon is the environment of frozen viral load. During the simulations that measure the late-time prevalence, the viral loads are well thermalized, enabling the outbreak to remain in an active state. Conversely, this environment is deficient in single-seed outbreaks, as the system has not yet reached its steady state. Hence, outbreaks are unable to produce sufficient new infections and rapidly become trapped in the absorbing state.
These results indicate that the system displays two attractors in this intermediate region. Then, for ζ = ∞ and α > 1 the system's phase diagram exhibits an additional bistable phase, that separates the usual healthy and endemic phases. The associated hysteresis loop, however, has a rather exotic nature: although its lower branch presents the expected discontinuity, the upper branch connects the two attractors in a continuous manner. This contrasts with previous findings of bistability, were the hysteresis loop is bounded by two discontinuities [24,29,40]. Moreover, the transition to full endemicity is hybrid [41]: the endemic probability grows con- tinuously at λ c (ρ * st ), but the late-time prevalence jumps discontinuously.
Finally, in Figs. 5(a,b) we show the patient zero analysis for the standard SIS (α = 1) and broad-tailed infection distributions (α < 1). Here, we additionally compute P 3 , the probability that a single-seed outbreak reaches the coverage threshold three times (see SM for simulation details). With α = 1, P 1 and P 3 are practically identical, indicating that an outbreak that surpasses the coverage threshold once remains active long enough to surpass the threshold two more times. Thus P 1 is an adequate proxy for the true endemic probability, P ∞ , which additionally coincides withc and ρ st .
For α < 1 the situation is quite different. Firstly, the average coverage starts growing continuously at λ c (c), when all other order parameters are still identically zero. Additionally, the transition point of P 1 is significantly lower than that of P 3 . Thus, there is a wide interval where all outbreaks that surpass the threshold once eventually terminate in the absorbing state, evidencing the inadequateness of P 1 as a measure of the true endemic probability. The inflection point of P 3 is much closer to the transition point of ρ st , which suggests that the critical point of the endemic probability (P ∞ , the probability to surpass the threshold an infinite amount of times) coincides with λ c (ρ st ). Beyond this point, P ∞ is expected to coincide withc, indicating that the endemic probability presents a discontinuous phase transition. Then, the transition to full endemicity is again hybrid. Nonethe-less, here the late-time prevalence grows continuously, while the endemic probability jumps discontinuously.
In this case, we find an intermediate region where outbreaks are unable to become endemic (P ∞ = 0) but affect a macroscopic fraction of the population (c > 0). In Figs. 5(c,d) we show the prevalence,ρ(t) = N I (t) /N , of realizations that surpass the coverage threshold once, for λ = 0.33 and λ = 0.4. We include the results for ζ = 0 for comparison (see SM for simulation details). Both values of λ are located in the active phase of the short-term memory mode (ζ = 0), and so the endemic realizations converge monotonically towards their active steady states. For the long-term memory mode (ζ = ∞), λ = 0.33 is located in the intermediate region. In this case, we observe that outbreaks grow up to a maximum, after which their prevalence gradually diminishes until they reach the absorbing state. This behavior is typically observed in SIR-like dynamics and is reminiscent of excitable media. In the endemic phase (λ = 0.4) the outbreaks continue presenting a peak, but afterwards relax towards an active steady state.
In conclusion, for ζ = ∞ and α < 1 the usual healthy and endemic phases are separated by an additional excitable phase. This excitable behavior is again a consequence of the environment of frozen viral load. Independently of ζ, an outbreak starts from a single infected node in a population cleared of viral load. Then it initially evolves as if the agents only had the short-term memory mode (clearly appreciable for t ∈ [0, 25] in Fig. 5(c)), rapidly achieving a large coverage. When the outbreak revisits a previously affected area, the long-term memory mode is activated and the frozen viral load impedes new infections. Thus, dormant nodes are effectively removed from the dynamic, impede the outbreak to grow, and eventually cause its extinction. To the extend of our knowledge, this excitable behavior has not been previously reported in comparable SIS-like models.

V. CONCLUSIONS
All these results show a crucial feature of agents that possess the long-term memory mode. Focusing only on the late-time prevalence of fully infected populations provides little insight about the system's constituents. Nevertheless, widely distinct and clearly distinctive behaviors appear with the analysis of patient zero scenarios. Moreover, a common effect of agents' memory is the breaking of the symmetry between the order parametersc, P , and ρ st . If agents are memoryless, all three order parameters are completely identical. This symmetry is broken when agents possess a long-term memory mode, and the critical points become dissociated. The system first transitions from the healthy phase to an either bistable or excitable intermediate regime, followed by a hybrid transition to the endemic phase. This differs from a double phase transition, where the same order parameter undergoes two consecutive phase transitions, a phenomenon usually associated to node and/or topological heterogeneities [24,40,42,43].
The analysis of our stylized, yet feature-rich model evidences a crucial role of non-Markovianity in the spread of epidemic outbreaks. In particular, the agentss memory range dramatically impacts the outbreak of newly introduced pathogens, currently an active field of epidemic modeling. Nonetheless, further research is necessary. A thorough study of finite, nonvanishing relaxation times of the viral load of dormant nodes can aid in further elucidating the interplay between memory modes, and its effects on the systems properties. Furthermore, the inclusion of nontrivial contact networks will supply renewed insight on the relevance of microscopic mechanisms and topological properties in dynamical processes on networks. There are two types of active processes which entail one or possibly more transitions (summarized in Fig. 6).
• Infection of susceptible agent j. Agent j transitions from susceptible to infected. Additionally, all of j's neighbors that were dormant transition to susceptible (and resume their accumulation of viral load).
• Recovery of infected agent j. If all of j's neighbors are healthy, j transitions from infected to dormant. If at least one of j's neighbors is infected, j transitions from infected to susceptible. Additionally, all of j's neighbors that were susceptible and had only one infected neighbor (i.e., agent j) transition to dormant (and their viral load starts to decay). Here, we summarize the derivation reported in [27]. Consider a set of M statistically independent, discrete, stochastic processes, each with an interevent time distribution ψ j (t) and corresponding survival time Ψ j (t) = ∞ t ψ j (t )dt . At a certain moment in time t 0 , process j has been active for t j units of time. Let φ(τ, i|{t k })dτ denote the joint probability that the next-occurring event takes place in the interval t ∈ [t 0 + τ, t 0 + τ + dτ ] and corresponds to process i, conditioned by the set of elapsed times {t k }. This probability density can be expressed as where is the survival probability of τ , i.e., the conditional probability that no event takes place before t 0 + τ . Then the probability that the next event takes place in the interval Once the interval τ is known, the probability that the next-occurring event corresponds to process i is given by with ω j (t) = ψ j (t)/Ψ j (t) the instantaneous hazard rate of process j. Eqs. (B3) and (B4) provide an algorithm that generates statistically correct sequences of events: i) draw the interval by solving Ξ(τ |{t k }) = u, with u ∈ U (0, 1), ii) increase the system time as t ← t + τ , iii) draw the process from the discrete distribution Π(i|τ, {t k }), iv) revise the list of active processes, and v) update the set of elapsed times as t j ← t j + τ (setting t j = 0 for newly activated processes).

Incorporating viral loads
Recoveries are straightforwardly incorporated into this framework, with the elapsed time t j measuring the period since agent j became infected (i.e., this occurred at t = t 0 − t j ). On the other hand, infection processes require the translation of infection probability densities into interevent time distributions. Consider susceptible agent j, characterized by its infection probability density ψ * j (κ) and corresponding survival probability Ψ * j (κ) = ∞ κ ψ * j (κ )dκ . Its interevent time distribution, ψ j (t), is given by the normalization condition Since the activity in j's neighborhood may vary over time, the rate at which it amasses viral load is generally nonconstant. At time t it has amassed κ j (t) units of viral load and has z j (t) infected neighbors. If the system remains unaltered in an interval dt, node j will amass an additional dκ =υ j dt, withυ j (t) = zj (t) i=1 υ i its instantaneous amassment rate. Substituting in Eq. (B5) we find For the survival probability we have which yields its instantaneous hazard rate .

(B8)
Note that we can always write t = t 0 + τ , with t 0 the time at which the system was last updated, and τ ≥ 0. Then, the instantaneous amassment rate remains constant in the interval [t 0 , t],υ j (t) =υ j (t 0 ), and κ j (t) = κ j (t 0 ) + τυ j (t 0 ). In our work, infections are governed by a Weibull distribution with shape parameter α and scale parameter µ, and recoveries by a Poisson process with rate η, Since all infectors have the same infectivity rate υ, the instantaneous amassment rate of a susceptible node j becomesυ j (t) = υz j (t), with z j (t) the number of its neighbors that are infected at time t. So the instantaneous hazard rates for infections and recoveries are, respectively, ω inf (t) = υαµ α z(t)[κ(t)] α−1 and ω rec (t) = η.
Appendix C: Simple and complex contagion

Simple contagion
Simple contagion describes purely dyadic interactions, thus we can identify each edge that connects a healthy node with an infected one as an isolated transmission channel. Consider at time t a susceptible node j and its infected neighbor i, which became infected at t i < t. The probability that node i infects node j within the interval (t, t + dt) is ω i→j (t|t i )dt, regardless of the rest of the system. If node j has z j (t) infectors at time t, the previous statement holds for each one of them. The total probability that node j becomes infected at time t depends on all of its incoming transmission channels. Since these are statistically independent, we can write where ω j (t) is the instantaneous hazard rate of node j's infection process (i.e., the probability per unit of time that node j becomes infected at time t). Using we can write Eq. (C1) as thus the total hazard rate is proportional to the number of current infectors. If ω i→j (t|t i ) = β are constants (and homogenous for all pairs of nodes), we recover the standard SIS model with the familiar expression ω i (t) = βz i (t). When the transmission rates ω i→j are time-dependent, the dynamics has memory effects; thus, simple contagion can be non-Markovian (as in [23], for example).

Complex contagion
When the dynamics are described by interactions that are not strictly dyadic, the contagion becomes complex. These processes usually incorporate an explicit social reinforcement or inhibition mechanism. Although the classification of complex contagion processes is yet to be formalized, they can be broadly categorized into two groups: • In edge-centric approaches, one still considers the transmission channel from infected node i to susceptible node j. Now, however, the transmission rate ω i→j is affected by the neighborhoods of i and/or j (for specific examples, see [28,29,44]). Considering only nearest-neighbors, the transmission rate from node i to node j at time t, ω i→j (t|z i (t), z j (t)), is a function of their current infected neighbors, z i (t), z j (t). Although we can still write the total hazard rate ω j (t) as in Eq. (C1), the instantaneous average defined in Eq. (C2) has an explicit dependence on z i (t), z j (t). Consequently, ω j (t) can be superlinear (reinforcement) or sublinear (inhibition) with the number of current infectors, z j (t).
• On the other hand, node-centric approaches forgo the notion of transmission channels and directly prescribe the instantaneous hazard rate, ω j (t). These usually incorporate thresholds, such as ω j (t) = δ(T j − z j (t)) [30] or ω j (t) = z j (t)Θ(T j − z j (t)) + βΘ(z j (t) − T j ) [45], which explicitly evidence the nonlinearity of ω j (t) with z j (t).

Memory-induced complex contagion
Consider an isolated pair of nodes i and j in the micc-SIS model. Both are healthy when node i becomes infected at time t i . The total amount of viral load that j has amassed at time t > t i is κ j (t) = υ(t − t i ). The instantaneous hazard rate of node j's infection is but since j has only one infected neighbor, it is given solely by the exposure to node i: ω j (t) = ω i→j (t|t i ), with Now consider an infected node j that at time t 0 recovers and becomes dormant (all its neighbors are healthy). At time t it has a set of {i} infected neighbors that became infected at times {t i }, with t 0 < t i < t, ∀i = 1, 2, ..., z j (t). The total amount of viral load that j has amassed at time t is where χ(t) stores the viral load accumulated from neighbors that became infected in the interval (t 0 , t) but are currently healthy. Substituing Eq. (C6) in Eq. (C4), we find which, using Eq. (C5), can be written for α = 1 as Notice that Eq. (C8) cannot be written as Eq. (C1). Therefore, while the exposures to infectious sources are initially described as isolated events, the agents' memory causes them to become entangled. For instance, for α = 2, Eq. (C8) can be written as which has an explicit quadratic dependence on z j (t) [46]. The simple contagion of the standard SIS model can be recovered only with α = 1, for which Eq (C4) equates with Eq (C3). In conclusion, the non-Markovianity of the miccSIS model induces an effective social reinforcement/inhibition even though it was not incorporated in the initial description of the model. At time t, node i is described by its state n i (t) and its viral load κ i (t). The former is a discrete variable that can take two values, n i (t) = 1 if its infected, and n i (t) = 0 if its healthy (susceptible or dormant). The viral load, on the other hand, is a continuous variable with κ i (t) ≥ 0, ∀t. A relevant nonindependent variable is the number of i's neighbors that are infected at time t, z i (t) = j a ij n j (t), with a ij the elements of the adjacency matrix. In particular, this variable aids in distinguishing healthy susceptible nodes, (1 − n i (t))(1 − δ 0 zi(t) ), from healthy dormant nodes, (1 − n i (t))δ 0 zi(t) , with δ m the Krnecker function (δ m = 1 if m = , δ m = 0 if m = ). The evolution of these variables are governed by a series of microscopic, dichotomous, stochastic processes: • Infected node i recovers, ξ i = 1, or remains infected, ξ i = 0, given by the instantaneous hazard rate of node i's recovery, η. At O(dt), the corresponding probabilities are (1) • Susceptible node i becomes infected, π i = 1, or remains susceptible, π i = 0, given by the instantaneous hazard rate of node i's infection, , the corresponding probabilities are • Susceptible node i becomes dormant, χ i = 1, or remains susceptible, χ i = 0. This transition occurs if all of node i's infected neighbors recover. At O(dt), this reduces to node i having a single infected neighbor that recovers, thus The state of node i only changes with the transitions infected to healthy and susceptible to infected. When node i transitions between susceptible and dormant, the state remains unaltered. Then the equation for the state of node i at time t + dt is where the first term corresponds the node i being infected and not recovering, while the second term corresponds to node i being susceptible and becoming infected. While node i is susceptible, its viral load increases proportionally to the number of infected neighbors. Moreover, when node i transitions from infected to healthy, its viral load is erased instantly. Additionally, with short-term memory (ζ = 0), node i's viral load is erased instantly when it transitions from susceptible to dormant. Thus the equation for the viral load of node i at time t + dt is The first term corresponds to the previously amassed viral load, the second term describes the event where infected node i recovers and erases its viral load, the third term corresponds to susceptible node i remaining susceptible (neither recovering nor becoming dormant) and accumulating additional viral load from its z i (t) infected neighbors (from each at rate υ), and the fourth term describes the event where susceptible node i becomes dormant and instantly erases its viral load.
In order to obtain the dynamic equations, we first compute the expectation value conditioned on time t, which only affects the stochastic variables Note that the delta term cancels since δ 0 zi(t) z i (t) = 0. Taking the ensemble average we find from where we compute Following the same procedure for the viral load, we obtain and, in addition, we compute the dynamic equation for κ γ i (t), for an arbitrary value of γ > 0, See section I D for a detailed derivation of Eqs. (11) and (12).

B. Mean-field approximation
Assuming that the state of the nodes are uncorrelated, we can write with Pr(n i = 0) = 1 − n i and Additionally, for uncorrelated networks we can write the topology term as In random degree-regular networks we have k i = k j = k = k, and after applying the mean-field approximation, For ρ ≈ 0 it is reasonable to assume that the coefficients A and B are constant. For simplicity we use the reduced coefficients a = µ α υη −1 A and b = µ α B (see section I E for details on recasting a and b in terms of the original parameters). The dynamics of the system is then encapsulated by the function and its first and second derivatives, The fixed points, ρ * , are given by f (ρ * ) = 0. Linear stability analysis reveals that these are stable when f (ρ * ) < 0 and unstable when f (ρ * ) > 0. Moreover, the transition is continuous if f (ρ * ) < 0 and discontinuous if f (ρ * ) > 0.

C. Phase diagram
From Eq. (25) we find that the healthy phase ρ * = 0 is always a fixed point. It is stable for b > a − 1/k and unstable for b < a − 1/k. The nature of the transition changes at b = a/k, and the intersection with b = a − 1/k yields a tricritical point located at a tc = 1/(k − 1), b tc = 1/k(k − 1). Then, the phase transition is continuous for b < b tc and discontinuous for b > b tc . The endemic phase is found by solving Near the critical point b = a c − 1/k, ρ ≈ 0 and we can expand and the prevalence scales as ρ ∝ (a − a c ) βc with β c = 1, the customary mean-field exponent [1].
and the prevalence scales as ρ ∝ (a − a c ) βtc with β tc = 1.
and substituting in Eq. (39) yields Then we take the ensemble average up to O(dt) and finally we find

E. Recasting into original parameters
In order to obtain the coefficient A = κ α i |n i = 1 / κ i |n i = 1 , we need to compute κ γ i |n i = 1 for γ = 1 and γ = α. These moments are conditioned on n i = 1, i.e., node i being infected. Since the viral load does not change while infected, measuring κ i of infected node i yields the same results as measuring κ i at the moment i became infected. Within the mean-field approximation, these quantities are the same for all nodes, hence we drop the i index. We denote the required density by φ(κ), i.e., the probability that a node had amassed κ viral load at the moment it became infected.
The difference between the infection probability, ψ * (κ), and φ(κ) is subtle but crucial in the case of short-term memory. Recall that with ζ = 0, a susceptible node instantly erases its viral load when it becomes dormant. Thus, reaching a viral load of κ is conditioned on being continuously exposed to the pathogen. Simply put, ψ * (κ) measures the probability of becoming infected when the accumulated viral load is κ, while φ(κ) measures the probability of reaching an accumulated viral load of κ and then becoming infected.
When ρ ≈ 0 we can assume that susceptible node i has only one infected neighbor j. Moreover, i transitioned from dormant to susceptible at the same time that j became infected. Thus, node i is exposed to a single source of pathogen, and the time since j became infected, t, and the viral load accumulated by i, κ, are proportional, κ = υt. Then φ(κ) is the probability of i becoming infected with κ, ψ * inf (κ), times the probability that j does not recover before t, Ψ rec (t). Expressed in terms of κ this is with N = ∞ 0 ψ inf (κ)Ψ rec (κ)dκ. Defining the integral the corresponding moments are We proceed in a similar manner for the computation of the moments κ γ i |X i = 1 . Now we are sampling the state of a node that is susceptible and has only one infected neighbor. For ρ ≈ 0 we assume the same scenario as before: since becoming susceptible, node i has been exposed continuously to a single infected neighbor j. Node i's state becomes X i = 0 either when j recovers or when i becomes infected itself. This event occurs when i has accumulated κ viral load with probability density Then the probability density to sample node i in state X i = 1 with a viral load of κ is [2] with Defining the integrals the corresponding moments are Figure 2 of the main text is obtained as follows. For a given λ and α, we compute the moments κ γ |n = 1 and κ γ |X = 1 with η = υ = 1. The integrals are compute numerically using the Python package SciPy. With these results we compute the coefficients A and B, and from there the coefficients a and b. With k = 4, the critical point λ c is computed finding the root of −1 + ak − bk = 0. The sign of bk 2 − ak marks the nature of the transition.

II. ANALYTICS: LONG-TERM MEMORY
With ζ = ∞, susceptible node i that becomes dormant freezes its viral load, i.e., it does not reset κ i to zero. Then the equation for κ i (t + dt) reduces to where the first term corresponds to the previously amassed viral load, the second term describes the event where infected node i recovers and erases its viral load, and the third term corresponds to susceptible node i remaining susceptible (neither recovering nor becoming dormant) and accumulating additional viral load from its z i (t) infected neighbors (from each at rate υ). Following the same procedure as in the previous section, we find the dynamic equation and taking the late-time limit yields As before, we expand κ i n i = κ i |n i = 1 n i . The conditioned average κ i |n i = 1 measures the viral load of an infected node, which again equates to the viral load at infection. With ζ = ∞, nothing hinders the accumulation of viral load, i.e., nodes may freely accumulate any value of κ. Thus, the probability of having κ at the moment of infection is simply the probability of becoming infected with κ, i.e., ψ * (κ). Using this probability density, κ i |n i = 1 = κ inf , the value used in the definition of the effective spreading ratio, λ = υ t rec / κ inf . Substituting in Eq. (71) gives Note that this equation is identical to the result obtained for the standard SIS [1], which demonstrates that the prevalence is independent of α, and equivalent to the Markovian model.

A. Additional verification
The value of ζ does not directly affect the state of node i, thus the dynamic equation for n i (t) is the same as before In order to obtain the dynamic equation for κ γ i (t) we start from and employ Eq. (39) to compute the expectation value conditioned on time t. All the terms are identical as before, except for then Finally, computing the ensemble average and rearranging the equation yields Setting γ = α in Eq. (77), taking the late-time limit in Eqs. (73) and (77), and dropping the dependence with t gives Using the expansion κ α i n i = κ α i |n i = 1 n i , and combining both equations yields κ α i |n i = 1 = µ −α , which recovers the result of and validates the computation of κ i |n i = 1 using ψ * (κ) .

III. SIMULATION
A. Core algorithm Here we provide a schematic outline of the core simulation algorithm, for fixed values of α, µ, υ, η, λ, and ζ. At a given time t, the nodes are separated in three lists: infected (I), dormant (D), and susceptible (S). For dormants we store their accumulated viral load κ. For susceptibles we store κ and also the number of infected neighbors z. The contact network is encoded in an adjacency matrix or list.
4. Compute the discrete distribution Π k = ω k /Ω, and sample the next-occurring event.
(a) Compute its number of infected neighors, z k .
-If z k = 0, move node k to the dormant list with κ k = 0.
-If z k > 0, move node k to the susceptible list with κ k = 0 and store z k . (b) For all of k's neighbors, decrease the number of infected neighbors by one, z ← z − 1.
• Susceptible node k becomes infected.
(a) Move node k to the infected list.
(b) For all of k's neighbors, increase the number of infected neighbors by one, z ← z + 1.
-If z = 1, move node from the dormant list to the susceptible list and store z .

B. Late-time prevalence curve
Given the stochastic nature of the dynamics we must average over independent realizations in order to obtain a representative ρ st (λ) curve. Additionally, we want to control the spacing in the ρ st axis and handle the increasing correlation time as we approach the critical point. We employ a two-step simulation scheme: in the first step we elaborate a list of λ values that will be used in the second step to extensively sample ρ st .
For a given network of size N , we start at λ = λ 0 and infect all nodes. We evolve the system during 25 × M 0 events (with M 0 = N ), record the final value of ρ = N I /N and store the system's final state. We repeat this for R independent runs, starting each time at λ 0 and storing each final state separately. We compute the average ρ 0 (of the R measures) and write λ 0 and M 0 to the output file. Next we decrease the control parameter, λ i = λ i−1 − λ i . For each run we load the corresponding initial state from storage and iterate 25 × M i events. Then we record the final value of ρ and store the system's final state. After repeating this for the R runs, we compute ρ i and ∆ρ i = ρ i−1 − ρ i . If ∆ρ i > ∆ρ max , we interpolate the results (see Fig. 1), setting ∆λ i+1 = λ i ∆ρ max /∆ρ i , and reassigning λ i ← λ i + ∆λ i − ∆λ i+1 and ρ i ← ρ i−1 − ∆ρ max . In addition, we increase the event interval, M i+1 = M i , with > 1. Finally we write λ i and M i to the output file (note that λ i = λ 0 and M 1 = M 0 ). A run is deactivated when it reaches the absorbing state (N I = 0): we stop simulating its dynamics and it is no longer included in the computation of ρ . We keep decreasing the control parameter λ until all runs are deactivated, the point at which the simulation is halted.
Following this preparatory step we proceed with an extensive sampling of the control parameter. We infect all nodes and load λ and M from the input file. We thermalize the system during 20 × M events and afterwards measure ρ and ρ 2 of X states, each separated by a window of M events. We repeat this thermalization-sampling procedure for the next entry on the input list, using the last visited state as initial condition. The simulation is halted when the list is fully iterated or whenever the system reaches the absorbing state. We repeat the whole procedure for Y independent runs, using the same input file. The results are temporally averaged, i.e., each measure j is weighted by its residency time τ j : ρ = w −1 Z j=1 τ j ρ j and ρ 2 = w −1 Z j=1 τ j ρ 2 j , with w = Z j=1 τ j and Z the total number of samples (note that, since the input file may not be fully iterated, Z ≤ X × Y ). We estimate the order parameter as ρ st = ρ and compute its standard error as s(ρ st ) = ( ρ 2 − ρ )/Z. We use λ 0 = 1.2, ∆λ 0 = 0.05, ∆ρ max = 0.025, = 1.01, X = 500, and Y = 20. We use R = 40 and R = 20 for networks of N = 10 3 and N = 10 4 nodes, respectively (figure 3 of main text).

C. Single-seed simulations
For a given network of size N and a fixed value of λ we simulate Z independent runs. Each outbreak starts with a single, randomly chosen infected node. All other nodes are healthy, with zero viral load, and the coverage K is set to zero. During the evolution of the outbreak we keep track of all the nodes' first infection label. Whenever a node is infected for the first time, we change its label and increase the coverage in one unit, K ← K + 1. An outbreak is terminated when it reaches the absorbing state (finite realization) or when the coverage reaches the threshold, K th = c th N (endemic realization). We record the final values K and K 2 of all outbreaks, and count the number of endemic realizations, z (1) end . Afterwards we compute the averages K = Z −1 Z j=1 K j and K 2 = Z −1 Z j=1 K 2 j . We estimate the avarege coverage fraction asc = K /N and the endemic probability as P 1 = z (1) end /Z, with standard errors s(c) = N −1 ( K 2 − K 2 )/Z and s(P 1 ) = P 1 (1 − P 1 )/Z. To compute P 3 we proceed from the same initial setting. Now, however, when an outbreak reaches the coverage threshold we reset the coverage to zero and erase all first infection labels (regardless of wether the node is healthy or infected at that moment). Then we continue evolving the same outbreak, keeping track again of each of the nodes' "first infection", changing the labels and increasing the coverage accordingly. When the coverage threshold is reached for a second time, this reset is performed once again. We count the number of realizations that are able to reach the coverage a third time, z end , and compute P 3 = z end /Z and s(P 3 ) = P 3 (1 − P 3 )/Z. We use c th = 0.75 and Z = 10 4 (figures 4 and 5 of main text).

Late-time prevalence
To measure the late-time prevalence of endemic outbreaks, ρ * st , we only consider realizations that, following the procedure explained above, are able to reach the coverage threshold a total of m times. At this point the outbreak has had sufficient time to evolve towards its active steady-state. Afterwards we measure ρ and ρ 2 of X states, each separated by a window of N events. For a given λ we record a maximum of W states, running a maximum of Y outbreaks. The results are temporally averaged, i.e., each measure j is weighted by its residency time τ j : ρ = w −1 Z j=1 τ j ρ j and ρ 2 = w −1 Z j=1 τ j ρ 2 j , with w = Z j=1 τ j and Z the total number of samples (note that Z ≤ W ). We estimate the late-time prevalence as ρ * st = ρ and compute its standard error as s(ρ * st ) = ( ρ 2 − ρ )/Z. We use m = 10, X = 10 2 , W = 10 4 , and Y = 10 4 ( figure 4 of main text).

Temporal profile
To represent the evolution of single-seed outbreaks that reach the coverage threshold,ρ(t), we start from the usual initial setting, and store the prevalence ρ(t) at given times, with time interval ∆t. If the system becomes trapped in the absorbing state before reaching the threshold, we discard the trajectory and start again. If the system is able to surpass the coverage threshold, we continue tracking its evolution while the outbreaks remains active, up to t max . For a given network of size N and a fixed value of λ we average Z trajectories, at fixed values of t: ρ(t) = Z −1 Z j=1 ρ j (t) and ρ 2 (t) = Z −1 Z j=1 ρ 2 j (t). We estimate the prevalence asρ(t) = ρ(t) and compute its standard error as s(ρ(t)) = ( ρ 2 (t) − ρ(t) 2 )/Z. We use ∆t = 0.1, t max = 200, and Z = 100 (figure 5 of main text).