Effects of memory on spreading processes in non-Markovian temporal networks

Many biological, social and man-made systems are better described in terms of temporal networks, i.e. networks whose links are only present at certain points in time, rather than by static ones. In particular, it has been found that non-Markovianity is a necessary ingredient to capture the non-trivial temporal patterns of real-world networks. However, our understanding of how memory can affect the properties of dynamical processes taking place over temporal networks is still very limited, being especially constrained to the case of short-term memory. Here, by introducing a model for temporal networks in which we can precisely control the link density and the strength and length of memory for each link, we unveil the role played by memory on the dynamics of epidemic spreading processes. Surprisingly, we find that the average spreading time in our temporal networks is often non-monotonically dependent on the length of the memory, and that the optimal value of the memory length which maximizes the spreading time depends on the strength of the memory and on the density of links in the network. Through analytical arguments we then explore the effect that changing the number and length of network paths connecting any two nodes has on the value of optimal memory.


I. INTRODUCTION
When a system is composed of many individual entities and pairwise interactions between them, then it is natural to describe its underlying structure as a complex network.We then say that it is on this backbone that all relevant dynamical processes take place [1,2].Often in real world systems this underlying structure is in its self dynamic, and so it is better described in terms of networks in which links among a fixed set of nodes change over time [3][4][5][6].Examples of such temporal networks include human contacts, which vary as individuals move over space [7,8], online social interactions that take place at certain points in time [9], or functional brain networks where correlations among the different areas of the human brain fluctuate over time [10,11].Recently many of these, and similar, systems have been empirically investigated, and the main dynamical properties required of temporal networks in order for them to better describe reality have started to be uncovered [12].In particular, it has been found that non-Markovianity is necessary to capture the non-trivial temporal patterns of realworld networks [13][14][15][16], and can play an important role on processes occurring on temporal networks.It has been shown that non-Markovianity can affect the dynamics of random walks [17], the speed of information [15], and the way diseases spread in systems with non-exponential inter-event times [18][19][20][21][22][23].Also non-Markovianity turns out to be useful in the definition of flow-based communities [24].
While the presence of memory has been found to be important, the influence that the strength (intensity) and length (range or order) of this memory can have on dynamical processes are still poorly understood.To shed light on the role of memory on spreading processes on temporal networks, we here propose a model for timevarying networks where the dynamics of the links is non-Markovian and is generated by a discrete autoregressive process of tuneable order.The key feature of our model, which we call the Discrete Autoregressive Network model of order p, or DARN(p) model, is that it allows us to precisely control not only the graph density, but also the strength and length of memory of each link.By considering a standard susceptible-infected (SI) epidemic over networks generated by the proposed model, we study how the range of the memory affects the rate at which infection is spread across the network.The main result of our work is to show that memory can play either of two opposing roles: it can slow-down or speed-up the spreading depending on the features of the network.In particular, it turns out that that the average spreading time is often non-monotonically dependent on the memory length, and there is a given value of the memory length for which we obtain the maximal average time until the entire network is infected.The DARN(p) model as presented is analytically tractable enough to allow for a more in depth study of the influence of memory than would be practical through numerics alone.We are in fact able to predict through analytical arguments the position of this maximum for a range of values of the variables associated with the model.We then explore the effect that changing the number and length of network paths connecting two nodes has on the value for the memory length which maximises the average passage time of the infection.

II. MODELLING TEMPORAL NETWORKS WITH MEMORY
Generating temporal networks is conceptually simple and a great deal of work has been done in this area [3,25].One takes a set of nodes and defines some way for them to interact over time (be it discrete or continuous).Without the direct use of empirical data, this can be done in a number of different ways, each with their own advantages and disadvantages.For instance, much attention has been devoted to temporal networks generated by the interactions of individuals in agent based models [12,[26][27][28][29].While such models might intuitively reflect reality on some level, they are often difficult to work with, without relying entirely on simulations.In order to keep precise control over key aspects of a temporal network, such as the strength and length of memory, while maintaining analytical tractability, we propose a model for generating temporal networks by assigning each link its own independent stochastic process.Given a set of nodes N , with |N | = N , we assign to each possible pair i, j ∈ N a discrete time stochastic process for the element of the adjacency matrix X ij t such that X ij t ∈ {0, 1} ∀t.For our purposes, we take links to be undirected, and for any two different links, (i, j) and (k, l), the two random variables X ij t and X kl t are independent and identically distributed (i.i.d).Thus we can talk more generally about the single process X t without worrying about which link we are referring to.The most important ingredient of our model is that the process X t is in general not only non-Markovian, but has a precisely controllable amount of memory.This in practice means that the presence of link (i, j) at time t 2 can depend on the presence of the link at time t 1 for any t 2 > t 1 .In particular, X t is chosen as a special case of the Discrete Auto-Regressive Process of fixed order p, from now on referred to as DAR(p) [6,30,31], which allows us to control both for the strength of the memory and for its length.The principle here can be explained as follows.To determine the state of the link at time t, i.e. its presence (sampling of X t gives 1) or is absence (X t gives 0), first we decide, with probability q, whether to copy one of the previous link states, or to determine the presence of the link through a Bernoulli trial with probability y.When we draw a state from the past, this state can be chosen in any way, but as we shall see in the following, here for simplicity we pick uniformly from the last p steps of the time series.In terms of random variables this can be written as: where, for each t, Q t ∼ Bernoulli(q), Y t ∼ Bernoulli(y) and Z t is some random variable which picks integers in the range (1, ..., p).As mentioned this Z t could take any form, indeed a natural choice may be Z t ∼ exp(t), but for the sake of simplicity we here take Z t ∼ U nif orm (1, p).When q = 0 the link has no memory, while for q = 0 the process in Eq. ( 1) is clearly non-Markovian, however, since the memory is finite, in that we only consider p previous values, we can view a DAR(p) process as a pth order Markov chain with an enlarged space of states [32].We can then define the so-called "p-state" of link (i, j) at time t, by combining the state of the link at time t along with its previous p − 1 states as the vector The set of p-states for each of the links is sufficient to completely describe the dynamics of the network.In a network with N nodes generated by our model, one can show that the expected degree k of a randomly chosen node at any point in time is given purely as a function of y as k = y(N − 1) (see Appendix A).In summary, our model for temporal networks, which we name Discrete Autoregressive Network model of order p, or DARN(p) model, depends on three parameters, namely: y, q and p.The first parameter y controls the density of the network.The second, q, tunes the strength of the memory term in the process with respect to the memoryless term.The final parameter, p, controls the length of the memory, which can be thought of as the number of time steps before the autocorrelation function decays exponentially (see Appendix B for a discussion of the autocorrelation function, and Appendix C for the initialisation of the network).

III. SPREADING PROCESSES ON TEMPORAL NETWORKS
We have considered the simplest possible mechanism for propagating a disease, or some message: the SI model, which is a special case of the SIS model with a recovery rate of zero [23,33,34].We have adapted this for temporal networks by only allowing infection to pass between nodes at times when a link is present (see Appendix D).Let us define I t to be the set of infected nodes at a time t.In our simulations we start with |I 0 | = 1 and we study the dynamics of the epidemics for different values of the three parameters controlling our network model, namely the link density y, the strength of the memory q, and its length p.In Fig. 1a we plot the fraction |I t |/N of infected nodes at time t obtained, for infectivity λ = 0.5, in a temporal network with N = 1000 nodes produced by a DARN(p) model with y = 0.002 and q = 0.9.This is compared to the case of a temporal network without memory, generated by setting q = 0 in our model.We can see that the infection spreading in the temporal network with no memory is faster than when any memory is taken into account (i.e. when q = 0).In the cases where memory is present, the infection spreading appears to depend heavily on both q and p.It is apparent that increasing the memory length p changes how long it takes for the infection to spread across the network, and that increasing q exacerbates this behaviour.We also observe that for large values of p the curves converge to that of the q = 0 case, and this is in agreement with the fact that the dynamics of our model in the limit of large p are the same as those of a random temporal network with no memory (see Appendix E).FIG. 1. Effects of the memory length of a temporal network on a spreading process.a, The average fraction of infected nodes over time for a disease spreading according to the SI model with λ = 0.5 on temporal networks with memory, as generated by the DARN(p) model in Eq. ( 1) with y = 0.002, q = 0.9 and various values of p.Each network has N = 1000 nodes.
Results are averaged over 10000 realizations of the process.b, The average time τ until all the nodes in a temporal network are infected, given an SI model with λ = 0.5, is shown as a function of the memory length p for y = 0.002 and y = 0.006, and q = 0.95 and q = 0.85.Each temporal network has N = 1000 nodes.Results are averaged over 100000 realization of the process.Resulting error bars are smaller than the markers.
In order to gain further insights into the spreading behaviour as a function of the memory variables we quantify the speed of the spreading by looking at the expected time taken until all of the nodes in the network are infected.As before, we consider a temporal network with N = 1000 nodes and we fix an infection spreading rate λ = 0.5.The SI model is run until the first time where all N nodes are infected.This value is then averaged over 100000 iterations of the process to return the average time τ until full infection of the network.Fig. 1b shows τ as a function of the memory length p, and for different values of q.For y = 0.006, we observe a monotonic decrease of τ as a function of p when q = 0.85.However, when we increase the strength of the memory to q = 0.95, the time taken until the entire network is infected shows a non-monotonic dependence of the memory length, with the presence of a maximum at p max = 2.When we further decrease the network density to y = 0.002, these maxima move to p max = 2 and p max = 3 respectively.

IV. THEORETICAL RESULTS
Our model is in essence a generalisation of the ER random graph model to the case of temporal networks with memory.As such, in our networks there are no correlations between pairs of different links.Since each link is independent, we can analyse the model by first analysing a single link.The dynamics of a link can be explained in terms of the transition matrix of the higher order Markov chain corresponding to the DAR(p) process.While the transition matrix for a first order Markov chain expresses the probabilities of moving between the possible states over a time step, namely here (0 → 0), (0 → 1), (1 → 0) and (1 → 1), a p-th order transition matrix expresses the probability of moving between p-states representing the possible histories of the system.If S t = (X t , X t−1 , . . .X t−p+1 ) is the p-state of our process X t at time t, then, for any α, β ∈ S, where set S is the set of all 2 p possible p-states, we can look at the probability Prob(S t+1 = β|S t = α).This defines the entries of the p-th order 2 p × 2 p transition matrix T αβ .To write down this transition matrix it is useful to introduce an ordering into the possible states of the system.Since there are 2 p possible states, we assign to each α ∈ S a unique label l(α) ∈ [0, 2 p − 1].To do this we note that by definition α = (α 1 , ..., α p ) with each α i ∈ {0, 1}.Hence, a convenient unique labelling is to take l(α) = p i=1 α i 2 p−i .For ease of notation, unless explicitly stated, α will refer to its label l(α).In this way we can write the p-th order transition matrix as: where h(x) is the Hamming weight of the number x (the number of 1s in its binary representation), δ(x, y) = 1 if x = y and 0 otherwise, and ⌊x⌋ is the largest integer value smaller than x.Note also that, for the sake of sim-plicity, this matrix is indexed from zero, not one.Taking two nodes, one of which is initially infected, we study the expected time taken for the second node to become infected.Since the infection process is modelled as a Bernoulli random variable Λ t ∼ Bernoulli(λ), the infection is passed at the first value of t such that Λ t X t = 1.We can cast this process as a Markov chain by considering a "dual-state" (S t , Λ t ), where S t and Λ t are the p-state of the link and its infectivity state at time t, respectively.Let us call the set of all possible dual-states S, then we note that S = 2 p+1 , and we set (α, ι) = α ∈ S, where α is defined as for the transition matrix in Eq. 2 and ι = 1 if an infection is passed and zero otherwise.The corresponding label function is then l(α) = l(α) + 2 p ι.We can then define the transition matrix P α β in block form as: where the sub-matrix T has elements given by Eq. ( 2).Consider the set A of dual-states where an infection is passed, i.e.A = α ∈ S : α1 = 1, αp+1 = 1 .We are now interested in how long it takes our system to reach a state in A. We can find the average hitting time τ α from each starting state α / ∈ A as the minimal solution of the following equations [35]: This is effectively saying that, if we start in state α, the average time taken to reach a state in A is the average time taken to reach a state in A from any β weighted by the probability of moving to β from α, plus the one time step it would take to make that move.This equation can be simplified (see Appendix F), then we can average over the initial states of the system to get: where the last term refers to the probability that the label of the initial p-state S 0 of the process X t corresponds to the label of the p-state α.
Eqs. ( 4) and ( 5) can be solved directly for small values of p (seeAppendix I for details).The plots in the two panels of Fig. 2a show not only that the theoretical predictions are in very good agreement with the simulations, but also that the non-monotonic behaviour observed in Fig. 1b for networks with N = 1000 nodes can emerge even in the case of a single link.In particular, we find that τ p has a maximum at p max = 8 when y = 0.03 and at p max = 6 when y = 0.07.Eq. ( 5) can be used to explore how p max depends on the values of the parameters y and q, and on the infectivity λ.In Fig. (2b) (upper panel) we see that, at fixed λ, the value of p max decreases with y, while it increases with the memory strength q.In this way, for small y and large q, the length of the memory p which produces the maximal value of τ p can be very large.For instance, when y = 0.01 and q = 0.95 we get p max = 13.Fig. (2b) (lower panel) shows that p max decreases with the infectivity of the SI process for each value of y.

V. PHASE DIAGRAM OF THE MODEL
We have found a solution to the average time taken for an infection to pass across a single link, and hence along any chain of links.Real world networks, however, often contain many paths between any two nodes, and these paths are often of varying lengths.Eq. ( 4) can be extended to deal with multiple paths of any length (see Appendix G).To do this, let us index each link in the network as ℓ, with ℓ = 1, 2, ..., N (N − 1)/2.We can then define the dual-state of the ℓth link as αℓ , as we did in the single link case.The dual-state of the entire network can then be written as α = (α 1 , α2 , ...).We can then generalise Eq. ( 4) in terms of P α β , which we refer to as the transition tensor, between any two network dualstates, as where A is the now the set of network dual-states where we stop our infection process.In principle, this equation can be used for networks with any number of nodes N , to study p max as a function of y, q, and λ, as was done in Fig. 2a, although computational constraints do not allow this for large N .The equation can be simplified to find the average passage time of an infection over multiple paths of the same length (see Appendix H).However, since we are interested in the most general case of multiple paths with different lengths, let us now focus on a network with N = 3 nodes, the smallest possible example of this type, having paths of length 1 and length 2 between any two nodes.Eqs. ( 6) can be used to determine when a maximum in the time taken to pass an infection between any two nodes in a temporal network with N = 3 should occur as a function of p.For each value of λ, by comparing the mean passage time for p = 1, 2 and ∞, we find sufficient conditions for the existence of a maximum of τ p at p other than 1 (see Appendix J).This defines a curve which breaks the (q, y) plane into two sections, the upper section being the one where τ p is non-monotonic.These curves are displayed for λ = 0.3, 0.5 and 0.7 in Fig. 3 for a single link (left) and for the N = 3 node network (right).The regions where a maximum must be present are clearly dependent on λ and on the number of nodes in the network.For N = 3 we observe that approximately half of the (q, y) plane must result in a maximum for λ = 0.3, however increasing λ reduces this fraction.For example, at a fixed value y = 0.5, when λ = 0.3 then nearly half of the possible FIG.2. Theoretical results for the spreading over a single link.a, The theoretical prediction of Eq. ( 5) is in good agreement with the simulated values of the average passage time τ of the infection over a single link as a function of the memory length p. Shown are cases with infectivity λ = 0.5, q = 0.95 and two values of y, namely y = 0.03 and y = 0.07.The average passage time τ as a function of p has a maximum at the value p = pmax, where pmax is dependent on y. b, The theoretical values for pmax are reported as a function of y for infectivity λ = 0.5 and for multiple values of q (upper panel), and as a function of λ for multiple values of y, when q is fixed to 0.95 (lower panel).
q values must produce a maximum, when λ = 0.5 this fraction decreases to approximately 0.2, and for λ = 0.7 then only 0.1 of the values of q must result in a maximum according to our criterion.The size of the regions where a maximum must be present in general decreases with the number of nodes in the network.Together with the shape of the curves in Fig. 3, this explains why in large and sparse networks, such as those considered in Fig. 1b, we observe a non-monotonic behaviour of τ vs p only for high memory strength q and low graph density y.

VI. CONCLUSIONS
Memory plays an important role in many processes in physics.In our networked world, interactions change in time.Such temporal changes must be taken into account when studying dynamical processes, be they the spreading of epidemics [23], the diffusion of ideas [36], the movement of people or patterns in broader social interactions [14].The model we have introduced is a simple way to include memory in a temporal network and can be further extended in many directions, for instance to include correlations among links, or to allow for different links to have different memory characteristics.The results we have obtained, and the methods developed in doing so, pave the way for a radical change in how we consider the influence of memory in networks, and highlights how unexpected the results can be when we do.For each value of the infectivity λ, the region above the curve denotes the values of the parameters q, y where there must be a value of p > 1 such that τ has a local maximum.The left panel refers to the case of a single link, while the right panel refers to the case of a DARN(p) model with N = 3 nodes.

CORRESPONDANCE
Correspondence and requests for materials should be addressed to O.W or V.L.This then allows us to solve Eq. (B1) for the autocorrelation function up to any finite time.Resulting autocorrelation functions are shown in Fig. 4 for various values of the parameters.
Consider Eq. 4, we know that for all j ∈ A we have j > 2 p , so this can be broken up into Looking at our matrix P in Eq. 3 we can re-write this as: As before we note that all the elements of A are memory states with leading value 1, so we can define the matrix T L ij = T ij if j < 2 p−1 and 0 else.This allows us to write Hence we only need to solve for the first 2 p values, allowing us to write the equation in matrix form as: Then, defining the matrix Φ as: We can write This greatly simplifies any calculation of average passage times for single links.

Appendix G: TEMPORAL NETWORKS DESCRIBED BY TENSORS
In this work we model each link in a network as a Markov chain, and so each link has a transition matrix associated with it.If each Markov chain has a state space S with |S| = σ then the temporal network with N nodes can be described as a Markov chain with state space S N (meaning the cartesian product of S with its self N times) with S N = σ N .Rather than attempt to directly impose an ordering on the states and generate a transition matrix we instead form a transition tensor of rank 2N .This tensor is formed by having one source index and one target index for each link in the network; each index represents a label for a state in S, and so varies from 1 to N (or 0 to N − 1).We then define the transition tensor T as With α and β the sets of source and target indices for each link.We use this approach to transform quantities derived from the transition matrix into a tensor form in the same way.When calculating the average hitting time of a Markov chain, we do so from some starting state, as labeled by a single index.For our new formulation, each state is labeled by a vector of indices, and so each possible starting state is labeled by a vector of indices.
In this way Eq.( 4) becomes as in Eq. ( 6).The two equations can be seen to become equivalent upon flattening Eq. (G2) so that T becomes an σ N × σ N matrix and τ becomes a vector of length σ N .

Appendix H: AVERAGE PASSAGE TIME FOR MULTIPLE LINKS IN PARALLEL
Solving for the average passage time over a single link allows us to extrapolate the average passage time along any number of links in series by using the linearity of expectation.Real systems however will often have multiple paths, eventually of different lengths.Here we examine the simplest case of m paths of unit length in parallel.Naively solving Eq. ( 6) for the case of m links in parallel requires us to handle a 2 m(p+1) × 2 m(p+1) matrix.However, in the same way as with the single link, we can greatly simplify this.We wish to find the average time taken for an infection to pass across any of m direct links from a source node to a target node.Let us start with the tensor equation Eq. ( 6).If we write αi = (α i , ι s i ) and βi = (β i , ι t i ) then T αβ Λ ι s ,ι t τ β .

(H3)
With χ being the indicator function.Due to the lack of dependence of Λ on its source state, we can write this as I1,I2 βi∈I1≤2 p−1 βi∈I2>2 p−1 T αβ τ β L(I 1 , I 2 ), (H4) for some function L(I 1 , I 2 ), by noticing that we only care about the value of ι t i if β i > 2 (p−1) .It is then straightforward to show that L(I 1 , I 2 ) = (1 − λ) |I2| .We can now take our definition of T L and define T R = T − T L and write Giving us an equation in terms of elements of a single 2 p × 2 p matrix.

FIG. 3 .
FIG.3.Phase diagram for the presence of a non monotonic dependence of spreading time on memory length.For each value of the infectivity λ, the region above the curve denotes the values of the parameters q, y where there must be a value of p > 1 such that τ has a local maximum.The left panel refers to the case of a single link, while the right panel refers to the case of a DARN(p) model with N = 3 nodes.

10 FIG. 4 .
FIG. 4. The autocorrelation function ρ k as a function of time k for q = 0.95 and various values of p.The function is flat for the first p time steps, and then decays exponentially.