Communicability in time-varying networks with memory

We develop a first-principles approach to define the communicability between two nodes in a time-varying network with memory. The formulation is based on the time-fractional Schrödinger equation, where the fractional derivative (of Caputo type) accounts for the memory of the system. Using a time-varying Hamiltonian in the tight-binding formalism, we propose the temporal communicability as the product of Mittag-Leffler functions of the adjacency matrices of the temporal snapshots. We then show that the resolvent- and exponential-communicabilities of a network are special cases of the proposed temporal communicability when perfect (resolvent) or imperfect (exponential) memory is considered for the system. Using theoretical and empirical evidence, we show that real-world systems operate away from perfect memory, with an interplay between memory-dependent temporal communication and imperfect-memory spatial transmission of information. We illustrate our results with the study of trophallaxis interactions in two ant colonies.


Introduction
A temporal network, also known as a time-varying network, is an ordered sequence of graphs $G_1, \ldots, G_h$, such that $G_i = (V, E_i)$, where $V$ is a set of vertices or nodes and $E_i$ is a set of edges or links between pairs of nodes $v, w \in V$ occurring at the time $t = i$. The aggregated graph of the temporal one is $G = \cup_{i=1}^{h} G_i$ with adjacency matrix $A = A_1 + \cdots + A_h$. Time-varying networks are relevant in the analysis of person-to-person communication, physical proximity, biological networks, distributed computing, infrastructural systems, and ecological systems, among others. The topic has been extensively reviewed in the literature [1][2][3][4].
An important question in the study of temporal networks is about the way in which information flows through the network. Obviously, to the complexities of the information flow through the nodes and edges of a static graph, we are adding now the temporal dimension. Let us make the following artificial division of the kinds of information flow that may exist in a temporal network. In a fixed time frame we can consider that the nodes are frozen in time and that they occupy a given position in the 'network space'. This network space could be a geographic space or any sort of mathematical space. Therefore, when two nodes communicate in the same time frame we will say that there is a spatial communication between the two nodes. If we now consider more than one time frame, the communication between a node in one time frame a with a node in another time frame b > a occurs by moving ahead in time (although it also may occur through two space positions). We will then call this communication through-time.
Therefore, information flows both through space (at static snapshots of the network) and through time (between the static networks). To capture this interrelation between temporal and spatial communication among nodes, the concept of temporal communicability was proposed some time ago [5, 6]. The first of these proposals was an ad hoc definition based on the resolvent of the adjacency matrix [5]:
$$R_{vw}(t) := \langle \psi_w | (I - \gamma A_1)^{-1} (I - \gamma A_2)^{-1} \cdots (I - \gamma A_h)^{-1} | \psi_v \rangle, \qquad (1.1)$$
where $\gamma < \min_i (\lambda_1(A_i))^{-1}$ is a parameter, $\lambda_1(A_i)$ is the spectral radius of the corresponding adjacency matrix, $I$ is the identity matrix, and $\psi_w$ is the corresponding column of the matrix $V^T$ in $A = V \Lambda V^T$, in which $V$ is the matrix of orthonormalized eigenvectors and $\Lambda$ the diagonal matrix of eigenvalues of $A$. The second definition was based on physical principles, assuming a quantum walk on the graph with a tight-binding Hamiltonian (see next section). The definition of this index is [6]
$$G_{vw}(t) := \langle \psi_w | e^{\beta A_1} e^{\beta A_2} \cdots e^{\beta A_h} | \psi_v \rangle, \qquad (1.2)$$
where $\beta$ is a parameter representing the inverse temperature of a thermal bath in which the network is submerged.
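To make the two definitions concrete, the following sketch computes both temporal communicabilities for a toy three-node temporal network. For clarity we project onto standard basis vectors rather than the eigenvectors $\psi$ of the aggregated adjacency matrix used in the paper, and the snapshot matrices and parameter values are purely illustrative:

```python
import numpy as np

def expm_sym(M):
    # matrix exponential of a symmetric matrix via eigendecomposition
    w, V = np.linalg.eigh(M)
    return (V * np.exp(w)) @ V.T

def resolvent_temporal(snapshots, gamma):
    # time-ordered product of resolvents (I - gamma*A_i)^(-1), as in eq. (1.1)
    n = snapshots[0].shape[0]
    M = np.eye(n)
    for A in snapshots:
        M = M @ np.linalg.inv(np.eye(n) - gamma * A)
    return M

def exponential_temporal(snapshots, beta):
    # time-ordered product of exponentials e^(beta*A_i), as in eq. (1.2)
    n = snapshots[0].shape[0]
    M = np.eye(n)
    for A in snapshots:
        M = M @ expm_sym(beta * A)
    return M

# toy temporal network on 3 nodes: edge (1,2) at t = 1, edge (2,3) at t = 2
A1 = np.zeros((3, 3)); A1[0, 1] = A1[1, 0] = 1.0
A2 = np.zeros((3, 3)); A2[1, 2] = A2[2, 1] = 1.0

R = resolvent_temporal([A1, A2], gamma=0.5)   # gamma < 1/spectral radius = 1
E = exponential_temporal([A1, A2], beta=1.0)

# the walk 1 -> 2 -> 3 respects the time ordering, so node 1 reaches node 3,
# but only if the snapshots come in the right order:
R_rev = resolvent_temporal([A2, A1], gamma=0.5)
```

Reversing the order of the snapshots kills the 1 to 3 communicability (`R_rev[0, 2]` is exactly zero), because the edge (1, 2) would then appear after the edge (2, 3); this respect for time ordering is what both definitions share.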
The resolvent and the exponential of the adjacency matrix are both matrix functions expressible as power series of the corresponding matrix, penalized by certain factors. However, when applied to a temporal sequence of graphs there are some differences in the way they treat temporal 'walks'. This has prompted the authors of [7] to claim that the resolvent-based communicability (also known as the Katz communicability) is the 'combinatorially correct expression' and that 'there is no way to correctly compute the combinatorics of exponential centrality on time-evolving graphs'. However, the exponential-communicability is based on the 'product formula' for the propagators in quantum mechanics [8], which is a well-established and foundational approach in physics. Additionally, the product of exponentials naturally emerges in the techniques of 'product integration' in functional analysis [9, 10], which use it for solving linear differential equations with time-varying Hamiltonians.
Here we show that both communicability functions are particular cases of a more general theoretical framework corresponding to a fractional-time process taking place on the temporal network. Following previous empirical and analytical results, we show here that such a fractional-time process corresponds to the consideration of memory effects on the temporal network. That is, we show that the resolvent- and exponential-communicabilities are solutions of the time-fractional Schrödinger equation (TFSE) when there is perfect memory (resolvent) or when the system is allowed to forget (exponential). We also provide theoretical and real-world evidence indicating that real-world systems do not operate in perfect-memory regimes. We find here that the interactions between nodes at relatively large temporal separation are the only ones affected by the memory capacity of the system. On the contrary, spatial interactions occurring in the same (or temporally close) time frames are favored by a low memory capacity of the system.

Time-fractional Schrödinger equation
The use of physical metaphors is frequent in network theory to establish links between topological and dynamical properties of networks and well-established physical concepts. One such physical metaphor is the use of tight-binding Hamiltonians (TBHs) to study network properties. Although this physical model comes from the study of electronic properties of molecules and solids [11], it is used in network theory without the necessity of considering that electrons are really moving through the nodes and edges of the network [12][13][14][15][16][17][18]. For instance, Sade et al [12] studied the spectral statistics of complex networks and related them, via a TBH, with features of the Anderson metal-insulator transition for a wide range of different networks. Similarly, Zhu et al [13] studied the structural characteristics of complex networks using the representative eigenvectors of the adjacency matrix and found, with the use of the TBH, that the networks have nontrivial localization properties due to their nontrivial topological structures. Berkovits et al [14] described networks by a TBH, which was used to determine the properties of the Anderson transition according to the statistical properties of its eigenvalues. They concluded that the use of this approach on new complex topologies of networks leads to novel physics, specifically that clustering may lead to localization. In another work, Xie et al [15] studied the eigenvalues of networks and related them to the full energy spectrum of a TBH model defined on the same structure. They used this association to evaluate further properties of the eigenvectors, such as the degree of quantum localization of the tight-binding eigenstates. Yang et al [16] used two ways of quantum mapping of complex networks, and analyzed the localization properties of information on the maximum connected graphs, also using a TBH.
Another interesting application of TBHs to networks was provided by Esfandiary et al [17], who were inspired by Anderson localization in quantum systems to relate the localization of neural activity to the network spectrum and to the existence of an anomalous 'Lifshitz dimension'. This connection, which is important for understanding brain functionality, illustrates the power of using physical metaphors to extract information from networks, with impact across the disciplines. Finally, we mention here the work of Çalişkan et al [18], who investigated numerically the transport properties of modified small-world networks using a TBH and the one-electron Green's function method.
Let us then use the TBH formalism analogy here and consider an 'item' moving between the nodes of a network $G_1 = (V, E_1)$ as described by the following Hamiltonian
$$H = \sum_{v} \varepsilon_v |v\rangle \langle v| + \sum_{v \neq w} \gamma_{vw} A_{1,vw} |v\rangle \langle w|,$$
where $\varepsilon_v$ is the attractiveness of the node $v$ to the item, $\gamma_{vw}$ is the inter-nodal attractiveness, $A_{1,vw}$ is one if the edge $(v, w)$ exists and zero otherwise, and $|v\rangle$ is a column vector having zero in every entry except in that corresponding to node $v$, where it has one ($\langle v|$ is the transpose of $|v\rangle$ as usual). When Naber [19] (see also [20]) introduced the fractional-time Schrödinger equation he proposed two possible forms of writing it:
$$(i\hbar)^{\alpha} D_t^{\alpha} \psi(t) = H \psi(t) \quad \text{and} \quad i\hbar\, D_t^{\alpha} \psi(t) = H \psi(t),$$
and decided for the first due to two main reasons: (i) when performing the Wick rotation the imaginary unit is raised to the same power as the time coordinate, and (ii) when solving for the time component of it the temporal behavior of the solution does not change, while for the second equation changing the order of the derivative moves the pole to almost any desired location in the complex plane. The problem that we see with the first equation is that when we perform the Wick rotation we will have $(-\beta)^{\alpha}$, where $\beta$ is the inverse temperature (see further). Therefore, as $0 < \alpha \leq 1$, we will have complex contributions from this temperature term. Also, when we consider the TBH in which typically $\varepsilon_v = 0$ for every node and $\gamma_{vw} = -\gamma$ for every pair of nodes, such that $H = -\gamma A$, we will no longer have the sign cancellation which occurs with the integer-time Schrödinger equation, i.e., $\frac{\partial}{\partial t} \psi(t) = -iH\psi(t) = i\gamma A \psi(t)$. Consequently, here we propose the following modification to the fractional-time Schrödinger equation,
$$\hbar^{\alpha} D_t^{\alpha} \psi(t) = -(i^{\alpha}) H \psi(t),$$
which by taking $\hbar = 1$ is given by $D_t^{\alpha} \psi(t) = -(i^{\alpha}) H \psi(t)$. Therefore, we will have the same solutions for the probability densities $|\psi(t)|^2$ as with the Naber equation [19], we will not have complex temperatures, and we will have the sign cancellation in the TBH as with the integer-order equation, as we will see in the next paragraph. Let us then take $H_i = -\gamma A_i$.
Then, the dynamics of the system is controlled by the TFSE, where we replace the Hamiltonian by its expression in terms of the adjacency operator:
$$D_t^{\alpha} \psi(t) = i^{\alpha} \gamma A_1 \psi(t),$$
where $(\kappa - 1) < \alpha \leq \kappa$ and $\kappa = \lceil \alpha \rceil$ [21]. In the following we will always consider the case $\kappa = 1$.
We can obtain the time-evolution operator $\hat{G}_{\alpha}(t)$ as
$$\psi(t) = \hat{G}_{\alpha}(t) \psi(0) = E_{\alpha}\!\left((it)^{\alpha} \gamma A_1\right) \psi(0), \quad \text{where} \quad E_{\alpha}(\zeta M) := \sum_{k=0}^{\infty} \frac{(\zeta M)^k}{\Gamma(\alpha k + 1)}$$
is the Mittag-Leffler matrix function of $\zeta M$ [22, 23]. Now, let the system evolve for all $t \in [t_1, t_2]$ according to the equation
$$D_t^{\alpha} \varphi(t) = i^{\alpha} \gamma A_2 \varphi(t), \qquad \varphi(0) = \psi(t_1).$$
Then, the solution of this system is $\varphi(t_2) = E_{\alpha}\!\left((i[t_2 - t_1])^{\alpha} \gamma A_2\right) \varphi(0)$, which by replacing the expression for $\varphi(0)$ is written as
$$\varphi(t_2) = E_{\alpha}\!\left((i[t_2 - t_1])^{\alpha} \gamma A_2\right) E_{\alpha}\!\left((it_1)^{\alpha} \gamma A_1\right) \psi(0). \qquad (2.9)$$
First, we consider that $t_2 - t_1$ and $t_1$ are both equal to $\tau$, such that
$$\psi(2\tau) = E_{\alpha}\!\left((i\tau)^{\alpha} \gamma A_2\right) E_{\alpha}\!\left((i\tau)^{\alpha} \gamma A_1\right) \psi(0).$$
The process can be extended to any number of time steps controlled by time-independent Hamiltonians, such that
$$\psi(h\tau) = E_{\alpha}\!\left((i\tau)^{\alpha} \gamma A_h\right) \cdots E_{\alpha}\!\left((i\tau)^{\alpha} \gamma A_1\right) \psi(0).$$
We then consider the replacement $i\tau \to \beta$ to transform the real-time propagator $K_j = E_{\alpha}\!\left((i\tau)^{\alpha} \gamma A_j\right)$ into the thermal propagator $T_j = E_{\alpha}\!\left(\beta^{\alpha} \gamma A_j\right)$, which is also known as the Boltzmann operator or the imaginary-time propagator. Here $\beta$ is the inverse temperature of a thermal bath in which the network is submerged.
Finally, using the properties of the transpose of a product of matrices, as well as the fact that $(E_{\alpha}(M))^T = E_{\alpha}(M^T)$ [23], we obtain the communicability between the nodes $v$ and $w$ in a temporal network as
$$\hat{G}_{vw}(\alpha, \beta) := \langle \psi_w | E_{\alpha}(\beta^{\alpha} \gamma A_1)\, E_{\alpha}(\beta^{\alpha} \gamma A_2) \cdots E_{\alpha}(\beta^{\alpha} \gamma A_h) | \psi_v \rangle. \qquad (2.14)$$
Notice that if the network is undirected, $A_i^T = A_i$. Therefore, we have the following results:
• When $\alpha = 1$ and $\beta \in \mathbb{R}^+ \setminus \{0\}$ (taking $\gamma = 1$), $E_1(\beta A_i) = e^{\beta A_i}$ and $\hat{G}_{vw}$ reduces to the exponential communicability (1.2).
• When $\alpha = 0$, $0 < \gamma < \min_i (\lambda_1(A_i))^{-1}$ and $\beta \in \mathbb{R}^+ \setminus \{0\}$, $E_0(\gamma A_i) = (I - \gamma A_i)^{-1}$ and $\hat{G}_{vw}$ reduces to the resolvent communicability (1.1).
Notice that when $\alpha = 0$ the effect of the inverse temperature completely disappears. Moreover, $\gamma$, which is the node-node interaction parameter, should not be confused with $\beta$, as incorrectly assumed in [7] (the reader is referred to [24] for the physical foundations of these communicability functions).
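The two limiting cases, exponential at $\alpha = 1$ and resolvent at $\alpha = 0$, can be verified numerically with a truncated series for the scalar Mittag-Leffler function and an eigendecomposition for its matrix version (a minimal sketch; the parameter values are illustrative):

```python
import math
import numpy as np

def ml_scalar(alpha, x, kmax=300):
    # E_alpha(x) = sum_k x^k / Gamma(alpha*k + 1), truncated when terms vanish
    s = 0.0
    for k in range(kmax):
        term = x**k / math.gamma(alpha * k + 1.0)
        s += term
        if k > 0 and abs(term) < 1e-16:
            break
    return s

def ml_matrix(alpha, M):
    # Mittag-Leffler function of a symmetric matrix via eigendecomposition
    w, V = np.linalg.eigh(M)
    return (V * np.array([ml_scalar(alpha, x) for x in w])) @ V.T

A = np.array([[0.0, 1.0], [1.0, 0.0]])   # a single edge

# alpha = 1: E_1(beta*A) = e^(beta*A), the exponential communicability
E1 = ml_matrix(1.0, 1.0 * A)

# alpha = 0: E_0(gamma*A) = (I - gamma*A)^(-1), valid for gamma < 1/lambda_1
gamma = 0.4
E0 = ml_matrix(0.0, gamma * A)
Res = np.linalg.inv(np.eye(2) - gamma * A)
```

For the single edge the exponential case gives the familiar cosh/sinh entries, and the $\alpha = 0$ case agrees with the resolvent entry by entry, which is exactly the statement of the two bullets above.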

'Memory' in the context of network fractional dynamics
The fractional derivative is a generalization of the standard integer-order one, which originated from a mathematical curiosity discussed between L'Hôpital and Leibniz in 1695 [25]. The question about the physical meaning of the fractional derivative has been standing since 1974, when it was directly posed as an open question [26] (see also [27]). Although several answers to this question have been provided, as usually occurs for a given mathematical object, we focus here on its interpretation as a measure of memory in the system modeled by the fractional derivative. Recently, Du et al [28] used empirical data of physical processes which are known to have memory and observed that a memory process usually consists of a short stage characterized by permanent retention, followed by a second one governed by a simple model of fractional derivative. They used numerical methods to show that the fractional model perfectly fits the test data of memory phenomena in different disciplines, ranging from mechanics to biology and psychology, where the parameter α accounts for the memory capacity of the system. In the particular case of the Caputo formulation of the fractional derivative, Di Giuseppe et al [29] used fractional-order derivatives to represent a memory formalism, which was then confirmed by a number of laboratory experiments performed by the same authors. More recently, Caputo and Cametti [30] reviewed the area of modeling the transport of drugs across intact human skin by a memory formalism based on the fractional-derivative approach. In this work the authors conclude that although the use of fractional derivatives to model memory effects is completely phenomenological, 'a number of authors have found that this approach can provide a better comparison to experimental data and that this technique may be alternative to integer-order derivative models'.

Figure 1. Sketch of the interpretation of the fractional derivative as the memory of a system. A time-dependent function ψ(t) is plotted as a function of time. The first derivative of that function at time t is the slope of ψ(t) at that point, which represents the 'present' state of the system. Therefore, all times before t correspond to 'past' times. We can consider an arbitrary point, here marked by zero, as the 'remote' past. This time is the starting point of the integration in the Caputo derivative. As the Caputo derivative takes the integral between 0 and t of weighted derivatives, we are calculating areas below ψ(t) with given weights, which increase from the remote past (paler color) to the present (more intense color).
Here we provide further mathematical insight into how the Caputo fractional derivative accounts for the memory of a system. To warm up, we start by writing the Caputo fractional derivative for $\alpha \in (0, 1)$ as
$$D_t^{\alpha} \psi(t) = \frac{1}{\Gamma(1 - \alpha)} \int_0^t \frac{\psi'(\tau)}{(t - \tau)^{\alpha}}\, d\tau,$$
such that it can be interpreted as a weighted mean of the first derivative $\psi'(\tau)$, where the weight is given by $(t - \tau)^{-\alpha} / \Gamma(1 - \alpha)$ [29, 30]. First, let us define what we understand here by memory; we will then use the trapezoidal method to approximate the Caputo fractional derivative. Let us consider a time-dependent function $\psi(t)$, plotted as illustrated in figure 1. The time at which we take the first derivative of the function (marked with a solid red circle) is the present time. Therefore, every time before it represents the 'past'. The Caputo derivative integrates the weighted derivatives from the remote past to the present, giving more weight to the present than to the past. Therefore, we understand here by memory the capacity of the Caputo derivative to account for past times with a weight different from zero. If such a past is not taken into account, as in the simple first derivative, we consider that no memory at all is present. We will make this mathematically precise in the next paragraphs.
We now state the following result due to Odibat [31]. Let $\psi \in C^2[0, t]$ and divide the interval $[0, t]$ into $k$ subintervals of equal length $h = t/k$ with nodes $t_j = jh$, $j = 0, 1, \ldots, k$. Then the trapezoidal rule approximates the Caputo fractional derivative as
$$D_t^{\alpha} \psi(t) \approx \frac{h^{1-\alpha}}{\Gamma(3 - \alpha)} \left[ \left( (k - 1)^{2-\alpha} - (k + \alpha - 2)\, k^{1-\alpha} \right) \psi'(0) + \sum_{j=1}^{k-1} \left( (k - j + 1)^{2-\alpha} - 2 (k - j)^{2-\alpha} + (k - j - 1)^{2-\alpha} \right) \psi'(t_j) + \psi'(t) \right]. \qquad (3.2)$$
Now, what is relevant for our current analysis is the following. The first term in the square brackets of equation (3.2) corresponds to the initial time of the integration region, while the last one corresponds to the final time. Let us say that $\tau = 0$ is the remote past, $\tau = t$ is the present, and all the intermediate values are a transition between the past and the present. When $\alpha = 1$, the term $\left( (k - 1)^{2-\alpha} - (k + \alpha - 2)\, k^{1-\alpha} \right) \psi'(0)$ vanishes, as do the terms in the summation. This means that the Caputo fractional derivative then considers none of the past times in the derivation. Let us now consider $0 < \alpha < 1$. In this case neither of the two terms representing the past history of the system is equal to zero, which means that the fractional Caputo derivative accounts for the recent and remote past of the system. The influence of the recent past decays monotonically with $\alpha$, as can be seen in figure 2(a), but it also depends on the proximity of the time frame to the present. For instance, when the time frame (here accounted for by $j$) is close to the remote past, e.g., when $j = 1$, the intermediate values of $\alpha$ give significantly higher weights to this contribution than to those given by time frames close to the present (here taken as 100), e.g., $j = 90$. In contrast, for the extreme values of $\alpha$, i.e., close to zero or close to one, the weights given by both the remote past and the present are similar to each other (see figure 2(a)). In closing, the memory effect accounted for by intermediate values of $\alpha$ is significantly more marked for time frames close to the remote past than for those close to the present. On the other hand, the influence of the remote past decays monotonically with the increase of $\alpha$, as illustrated in figure 2(b), indicating that the highest weight to the remote past is given by the smallest values of the memory parameter. When $\alpha = 0$, the term $\left( (k - 1)^{2-\alpha} - (k + \alpha - 2)\, k^{1-\alpha} \right) \psi'(0) = \psi'(0)$. This implies that the Caputo fractional derivative with $\alpha = 0$ gives exactly the same weight to the remote past as to the present.
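The trapezoidal quadrature with Odibat-type weights can be checked numerically. For $\psi(t) = t^2$ the derivative $\psi'(\tau) = 2\tau$ is linear, so the piecewise-linear interpolation underlying the rule is exact and the result matches the known closed form $D_t^{\alpha} t^2 = 2 t^{2-\alpha}/\Gamma(3-\alpha)$ to machine precision (a sketch; the function names are ours):

```python
import math

def caputo_trapezoid(dpsi, t, alpha, k):
    # product-trapezoidal approximation of the Caputo derivative D_t^alpha psi(t),
    # 0 < alpha <= 1, built from the first derivative dpsi on a grid of k steps
    h = t / k
    # weight of the 'remote past' psi'(0)
    s = ((k - 1)**(2 - alpha) - (k + alpha - 2) * k**(1 - alpha)) * dpsi(0.0)
    # weights of the intermediate past psi'(t_j)
    for j in range(1, k):
        s += ((k - j + 1)**(2 - alpha) - 2 * (k - j)**(2 - alpha)
              + (k - j - 1)**(2 - alpha)) * dpsi(j * h)
    # the 'present' psi'(t) enters with weight one
    s += dpsi(t)
    return h**(1 - alpha) / math.gamma(3 - alpha) * s

dpsi = lambda tau: 2.0 * tau            # psi(t) = t**2
approx = caputo_trapezoid(dpsi, 1.0, 0.5, 100)
exact = 2.0 / math.gamma(2.5)           # closed form at t = 1, alpha = 1/2
```

At $\alpha = 1$ all past weights vanish and the routine returns exactly $\psi'(t)$; at $\alpha = 0$ it reduces to the ordinary trapezoidal rule for $\int_0^t \psi'(\tau)\, d\tau = \psi(t) - \psi(0)$, so past and present enter with comparable weights, as discussed above.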
In the solution of the fractional dynamics considered here,
$$\left[ E_{\alpha}(A) \right]_{vw} = \sum_{k=0}^{\infty} \frac{\left( A^k \right)_{vw}}{\Gamma(\alpha k + 1)}, \qquad (3.4)$$
the numbers of walks $\left( A^k \right)_{vw}$ of length $k$ between the nodes $v$ and $w$ are penalized by the reciprocal of the Gamma function, which depends on $\alpha$ as well as on $k$. We recall that a walk is a sequence of (not necessarily different) consecutive vertices and edges in the graph.
Then, to understand what 'memory' means for these network dynamics, let us pick a single walk of length $k$ between two arbitrary nodes of $G$. The contribution of this walk to the ML function of $A$ is $c(\alpha, k) = 1/\Gamma(\alpha k + 1)$. That is, instead of counting one walk we have counted the fraction $c(\alpha, k)$ of walks. We can interpret this fraction $c(\alpha, k)$ as follows. Suppose that we train an 'item' to complete a walk of length $k$ between two specific nodes in $G$. How many repetitions of the training do we need for the item to 'remember' that walk? Obviously, it is easier for an item to remember a shorter than a longer walk, just as it is easier for us to remember a shorter than a longer sequence of characters. However, this number of trials will also depend on the inherent 'forgetting capacity' of the item. The smaller the forgetting capacity, the easier it is to retain the information on a given walk. We can identify this forgetting capacity with the fractional parameter $\alpha$, such that for small $\alpha$ it should be easier for the item to remember the corresponding walk.
Therefore, we can identify $c(\alpha, k)$ as the memory of the item, such that for long walks, i.e., $k \to \infty$, and large forgetting capacity, i.e., $\alpha \to 1$, the item has very little memory, i.e., $c(\alpha, k) \to 0$. The relation between the two parameters for the item memory is illustrated in figure 3. In dark blue there is a vast region of memoryless items, which will need a large number of trial repetitions to remember a given walk, particularly the longer ones. At the other extreme we have items with perfect memory, i.e., zero 'forgetting capacity', which will need only one trial to learn a given walk independently of its length. When $c(\alpha, k) \to 0$ the item has very little memory, and when $c(\alpha, k) \to 1$ it has a large memory capacity. Notice that $c(\alpha, k)$ can be bigger than one, which occurs when $\Gamma(\alpha k + 1) < 1$. Because the Euler gamma function takes values below one only for $1 < \alpha k + 1 < 2$, we have that $c(\alpha, k) > 1$ only when $\alpha < k^{-1}$; this is the case, for instance, for $c(0.2, 2) \approx 1.127$. This is equivalent to considering that the item needs less than one trial to learn a walk of length 2. In other words, the item 'infers' the walk before completing a first trial. For large values of $k$ the system needs to have an almost perfect memory to be in a situation similar to the one described before, e.g., for $k = 100$ the value of $\alpha$ should be below 0.01. In terms of the temporal communicability measures these results mean that:
• $R_{vw}(t) := \langle \psi_w | (I - \gamma A_1)^{-1} (I - \gamma A_2)^{-1} \cdots (I - \gamma A_h)^{-1} | \psi_v \rangle$ represents a perfect-memory temporal walk, where nothing is forgotten from the past history of the system;
• $G_{vw}(t) := \langle \psi_w | e^{\beta A_1} e^{\beta A_2} \cdots e^{\beta A_h} | \psi_v \rangle$ represents a memoryless temporal walk, where little from the past is remembered.
Obviously, both measures of temporal communicability are extreme cases. The only way to know which values of $\alpha$ are more appropriate for a given problem is by empirically fitting the results of the simulations to data.
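The penalization $c(\alpha, k) = 1/\Gamma(\alpha k + 1)$ is trivial to compute and reproduces the numbers quoted above (a minimal check):

```python
import math

def walk_memory(alpha, k):
    # fraction c(alpha, k) of a length-k walk retained by the
    # Mittag-Leffler penalization: 1 / Gamma(alpha*k + 1)
    return 1.0 / math.gamma(alpha * k + 1.0)

c = walk_memory(0.2, 2)          # ~1.127: the item 'infers' the walk (c > 1)
perfect = walk_memory(0.0, 100)  # alpha = 0: every walk fully remembered
tiny = walk_memory(1.0, 100)     # alpha = 1, long walk: essentially forgotten
```

Note that `c > 1` exactly when $\alpha k < 1$, since the gamma function dips below one only on that interval; for $\alpha = 0$ the memory is one for every walk length, while for $\alpha = 1$ long walks are penalized by $1/k!$.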
As mentioned before, this approach has been studied recently for several types of real-world systems [28]. Here we extend this search to gain some insight into the values of α that appear most frequently in real-life situations, based on an analysis of the literature using different kinds of fitting processes (see references in table 1 for details). In table 1 we illustrate some of the results reported in the literature for several kinds of problems.
As expected, dynamics on complex systems ranging from porous media to tumors and populations (see table 1) do not behave as having 'perfect memory'. With the single exception of the study [33], the values of the best-fitted fractional parameter α for different real-life problems lie between 0.5 and 1. These results clearly point to the lack of physical support for the use of the temporal communicability with α = 0, and to the importance of using the new definition given here for 0 < α ≤ 1.

Temporal evolution with 'memory'
Let $G_1, G_2, \ldots, G_{n-1}$ be a temporal network such that $G_i = (V, E_i)$, where $V = \{1, 2, \ldots, n\}$ and $E_i = \{(i, i+1)\}$, as illustrated in figure 4(a). This is the simplest temporal network, in which a single edge is 'moving' from one time frame to the next. Therefore, it will allow us to evaluate mainly the temporal effects of memory, as there is little transmission of information between nodes in the same time frame, i.e., no spatial transmission as understood in this work.
Let $\gamma = 1$ and let us define the following functions:
$$a(\alpha, \beta) := \sum_{k=0}^{\infty} \frac{\beta^{2k\alpha}}{\Gamma(2k\alpha + 1)}, \qquad b(\alpha, \beta) := \sum_{k=0}^{\infty} \frac{\beta^{(2k+1)\alpha}}{\Gamma((2k+1)\alpha + 1)},$$
which are, respectively, the diagonal and off-diagonal entries of $E_{\alpha}(\beta^{\alpha} A_i)$ on the two nodes of the single edge of the snapshot $G_i$. We can now express analytically the communicability matrix of this temporal network as [6] $G = E_{\alpha}(\beta^{\alpha} A_1)\, E_{\alpha}(\beta^{\alpha} A_2) \cdots E_{\alpha}(\beta^{\alpha} A_{n-1})$. Let us focus on the communicability between the node labeled by 1 and the rest of the nodes in the network, for which $G_{1,j} = a\, b^{j-1}$ for $j \leq n - 1$ and $G_{1,n} = b^{n-1}$. We then normalize by the sum of the corresponding row of $G$, which is $a\, \frac{1 - b^{n-1}}{1 - b} + b^{n-1}$, so that the normalized communicabilities $\tilde{G}_{1,j}(\alpha, \beta)$ are
$$\tilde{G}_{1,j} = \frac{a\, b^{j-1}}{a\, \frac{1 - b^{n-1}}{1 - b} + b^{n-1}} \quad (j \leq n - 1), \qquad \tilde{G}_{1,n} = \frac{b^{n-1}}{a\, \frac{1 - b^{n-1}}{1 - b} + b^{n-1}}.$$
This result allows us to understand what the memory effect means for the transmission of information in a temporal network. Let us first fix $\beta = 1$; later we will analyze the effect of 'heating' and 'cooling' the system. When $\beta = 1$ we have that, for $n \to \infty$, $\tilde{G}_{1,j} \propto a b^{j-n} - a b^{j-n-1}$. Therefore, when $b$ is relatively large, $\tilde{G}_{1,j} \to 0$ for all $j \neq n$, while $\tilde{G}_{1,n} \to 1$. This is particularly the case when $\alpha \to 0$. This means that when $\alpha \to 0$ the system 'does not forget' anything and all the information starting at node 1 is completely delivered to node $n$ at the end of the chain. However, when $\alpha = 1$, the values $a \approx 1.54$ and $b \approx 1.17$ make $\tilde{G}_{1,j}$ not vanish for $j < n$, which means that the system 'forgets' some part of the information, leaving it at the nodes along the chain. Consequently, the system delivers to node $n$ an amount of information significantly smaller than the one initiated at node 1. In other words, it has forgotten a large part of the information along the temporal path. In the case of the aggregated graph, i.e., where the adjacency matrix is $A = \sum_{i=1}^{h} A_i$, we have the following result.
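Assuming $a$ and $b$ are the even- and odd-power parts of the scalar Mittag-Leffler series for a single edge (an assumption consistent with the values quoted in the text), at $\alpha = 1$ they reduce to $\cosh \beta$ and $\sinh \beta$, which is where $a \approx 1.54$ and $b \approx 1.17$ at $\beta = 1$ come from. A sketch with truncation thresholds of our own choosing:

```python
import math

def a_fun(alpha, beta, kmax=400):
    # diagonal entry of E_alpha(beta^alpha * A) for a single edge:
    # even powers of the adjacency matrix survive on the diagonal
    s = 0.0
    for k in range(kmax):
        e = 2 * k * alpha
        term = beta**e / math.gamma(e + 1.0)
        s += term
        if k > 0 and term < 1e-16:
            break
    return s

def b_fun(alpha, beta, kmax=400):
    # off-diagonal entry: odd powers of the adjacency matrix
    s = 0.0
    for k in range(kmax):
        e = (2 * k + 1) * alpha
        term = beta**e / math.gamma(e + 1.0)
        s += term
        if k > 0 and term < 1e-16:
            break
    return s

a1, b1 = a_fun(1.0, 1.0), b_fun(1.0, 1.0)        # cosh(1), sinh(1)
a025, b025 = a_fun(0.25, 1.0), b_fun(0.25, 1.0)  # stronger memory: much larger
```

The strong growth of $b(\alpha, 1)$ as $\alpha \to 0$ is what concentrates all the normalized communicability on the last node of the chain.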
Let
$$E_{\nu,\alpha}(z) := \frac{1}{\pi} \int_0^{\pi} E_{\alpha}(z \cos \theta) \cos(\nu \theta)\, d\theta$$
be a fractional generalization of the modified Bessel function of the first kind. Let $P_n$ be a linear chain (path graph) of $n$ nodes labeled in consecutive ordering $1, 2, \ldots, n$. Then, the fractional communicability between a pair of nodes $p$ and $q$ is
$$\hat{G}_{pq} \approx E_{p-q,\alpha}(2) - E_{p+q,\alpha}(2).$$
Proof. By considering the eigenvalues $2 \cos \theta_j$, with $\theta_j = j\pi/(n+1)$, and the sine eigenvectors of the adjacency matrix of $P_n$, we have for $\beta = 1$
$$\hat{G}_{pq} = \frac{2}{n+1} \sum_{j=1}^{n} E_{\alpha}(2 \cos \theta_j) \sin(p \theta_j) \sin(q \theta_j).$$
Then, for sufficiently large graphs we can approximate the summation by an integral between $0$ and $\pi$, replacing $\theta_j = j\pi/(n+1)$ by the continuous variable $\theta$ with $d\theta = \pi/(n+1)$. Using $2 \sin(p\theta) \sin(q\theta) = \cos((p-q)\theta) - \cos((p+q)\theta)$, we have
$$\hat{G}_{pq} \approx \frac{2}{\pi} \int_0^{\pi} E_{\alpha}(2 \cos \theta) \sin(p \theta) \sin(q \theta)\, d\theta = E_{p-q,\alpha}(2) - E_{p+q,\alpha}(2),$$
which proves the result.
Obviously, $E_{\nu,\alpha=1}(z) = I_{\nu}(z)$, which is the modified Bessel function of the first kind. In figure 5(a) we plot the function $\hat{G}_{1,j}$ for different values of α. As can be seen, all communicability functions deliver most of the information to the node $n - 1$ in the linear chain. The main difference is that when the memory capacity is high, i.e., for small values of α, the probability that the information is delivered to that node is significantly higher than when α = 1. In the latter case, significant portions of the information are retained at the nodes along the chain. In figure 5(b) we also plot the results for the aggregated network. In this case the results are significantly different from a qualitative point of view. Here most of the information is retained at the initial node, with the exception of when α ≪ 1, in which case most of the information is delivered to nodes in the middle of the chain.
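At $\alpha = 1$ this reduction to Bessel functions can be checked against the classical identity for a long path, $(e^A)_{pq} \approx I_{p-q}(2) - I_{p+q}(2)$: the exact eigenvalue sum for a long chain matches the power series of $I_{\nu}$ to high accuracy (a numerical sketch; $n = 200$ and the node pair are illustrative):

```python
import math
import numpy as np

def bessel_i(nu, z, kmax=60):
    # modified Bessel function of the first kind, from its power series
    return sum((z / 2.0)**(2 * k + nu)
               / (math.factorial(k) * math.gamma(k + nu + 1.0))
               for k in range(kmax))

def path_comm(n, p, q, f=np.exp):
    # f(A)_pq on a path of n nodes (1-indexed), using the explicit
    # eigenvalues 2*cos(theta_j) and sine eigenvectors of the path graph
    theta = np.arange(1, n + 1) * np.pi / (n + 1)
    return (2.0 / (n + 1)) * np.sum(f(2.0 * np.cos(theta))
                                    * np.sin(p * theta) * np.sin(q * theta))

g = path_comm(200, 3, 5)                      # alpha = 1, beta = gamma = 1
bessel = bessel_i(2, 2.0) - bessel_i(8, 2.0)  # I_{p-q}(2) - I_{p+q}(2)
```

Passing a Mittag-Leffler evaluator as `f` instead of `np.exp` gives the fractional case of the proposition; the boundary corrections for finite $n$ decay extremely fast for interior nodes, which is why the agreement is so tight.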

Relation between spatial and temporal communication
Let us consider first the temporal path we have studied before. In this case there is mainly temporal transmission of information, because at a given snapshot of the temporal network there is only one edge between two nodes. We then study the values of $\tilde{G}_{v,w}(\alpha, \beta)$ for all $v \in V$ for two values of $\alpha$: one representing high memory capacity, e.g., $\alpha_h = 0.25$, and the other representing low memory capacity, e.g., $\alpha_l = 1.0$. We then obtain the values of the following difference:
$$\Delta G_{vw} := \tilde{G}_{v,w}(\alpha_h, \beta) - \tilde{G}_{v,w}(\alpha_l, \beta). \qquad (5.1)$$

Figure 6. Contour plot of $\Delta G$ matrices for the temporal linear chain (a) and for the temporal evolution of triangles (b).
A positive value of $\Delta G_{vw}$ indicates that the corresponding pair of nodes $v$ and $w$ communicates better when the system has high memory capacity than when it has low memory capacity. In other words, the communication between these two nodes is degraded when memory drops. This is exactly what is observed for the communication between almost every node in the temporal path and the nodes at the end of the chain. That is, as observed in figure 6(a), the values of $\Delta G_{v,19}$ and $\Delta G_{v,20}$ for $v < 19$ are positive, indicating that such communication, which is mainly through-time, is dramatically affected by memory. In other words, memory is very much needed for purely temporal transmission of information. On the other hand, the rest of the values of $\Delta G_{vw}$ are negative, which indicates that the communication between these pairs of nodes is enhanced when the memory of the system is low. The lowest values of $\Delta G_{vw}$ are observed for the pairs (17, 18) and (18, 19), although the absolute minimum is for the pair (18, 18). In general, it is observed that the differences in self-communicability, i.e., $\Delta G_{vv}$, of the nodes close to the temporal end of the chain are the ones most affected by memory. Therefore, the values of $\Delta G_{18,18}$, $\Delta G_{17,17}$, $\Delta G_{16,16}$ and $\Delta G_{15,15}$ are among the most negative ones. The communication between pairs of nodes close in time is dominated by spatial factors more than by temporal ones. That is, the pair (1, 2) is in the same time frame, and node 1 is at only one time step from node 3. Then, the communication between nodes 1 and 2 is affected to a large extent by their spatial proximity, as is that between nodes 1 and 3, which makes $\Delta G_{1,2}$ and $\Delta G_{1,3}$ negative. However, node 1 is very much affected by memory in its communication with nodes 19 and 20, which makes the values of $\Delta G_{1,2}$ and $\Delta G_{1,3}$, although negative, close to zero.
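The sign pattern just described can be reproduced directly by building the moving-edge chain, computing the row-normalized temporal communicability for $\alpha_h = 0.25$ and $\alpha_l = 1.0$ (with $\beta = \gamma = 1$), and taking the difference of eq. (5.1). A sketch, where the Mittag-Leffler routine is a plain truncated series:

```python
import math
import numpy as np

def ml_scalar(alpha, x, kmax=300):
    # truncated series for the scalar Mittag-Leffler function E_alpha(x)
    s = 0.0
    for k in range(kmax):
        term = x**k / math.gamma(alpha * k + 1.0)
        s += term
        if k > 0 and abs(term) < 1e-16:
            break
    return s

def ml_matrix(alpha, M):
    # Mittag-Leffler function of a symmetric matrix via eigendecomposition
    w, V = np.linalg.eigh(M)
    return (V * np.array([ml_scalar(alpha, x) for x in w])) @ V.T

def temporal_G(alpha, n, beta=1.0):
    # time-ordered product of ML propagators for the moving-edge chain
    G = np.eye(n)
    for i in range(n - 1):
        A = np.zeros((n, n))
        A[i, i + 1] = A[i + 1, i] = 1.0
        G = G @ ml_matrix(alpha, beta**alpha * A)
    return G

def row_normalized(G):
    return G / G.sum(axis=1, keepdims=True)

n = 20
dG = row_normalized(temporal_G(0.25, n)) - row_normalized(temporal_G(1.0, n))
```

With this setup `dG[0, 19]` and `dG[0, 18]` are positive (through-time delivery to the end of the chain needs memory), while `dG[0, 0]` and `dG[0, 1]` are negative, reproducing the pattern of figure 6(a).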
On the contrary, nodes 17 and 18 are close to the end of the temporal chain, so they do not require high memory to communicate in time. Therefore, the communication among these pairs is mainly dominated by the spatial effect, with little influence of the temporal factor.
To further investigate the effects of spatial communication in a temporal sequence, we design the following temporal network. Instead of an edge moving across time, we consider now a triangle, as illustrated in figure 4(b). In this case, the information can hop between temporal snapshots as before, but it can also propagate through the nodes of a triangle at a fixed time. Therefore, we have both temporal and spatial transmission of information as defined in this work. The plot of $\Delta G$ for this graph is illustrated in figure 6(b). The main difference observed here is quantitative. The values of $\Delta G_{v,w}$, where $w$ is at the end of the temporal chain, are also positive, with the highest values for the pairs $\Delta G_{1,18} = \Delta G_{1,19} = \Delta G_{2,18} = \Delta G_{2,19}$. However, here the values of $\Delta G_{v,w}$ are significantly smaller than in the case of the temporal chain. Similarly, the most negative values are among the pairs of nodes close to the end of the temporal sequence, but their absolute values are significantly smaller than in the case of the temporal linear chain. In closing, it seems that the inclusion of spatial communication inside the same temporal snapshot of a temporal network significantly reduces the influence of memory on both the short- and long-range temporal communication between pairs of nodes. This may be a plausible explanation for the observed fact (see table 1) that most of the values of α observed in the real world are not so small, i.e., there is relatively little influence of memory.

Temperature effect
Now, let us consider the effect of the inverse temperature β. In an ideal situation of no noise in the system (see [43] for the effects of noise on memory) we can consider that T → 0 (β → ∞). In this case, $\tilde{G}_{1,j} \to 0$ for $j < n - 1$, while for $j = n - 1$ and $j = n$ we have, respectively, $\tilde{G}_{1,n-1} \to 1/2$ and $\tilde{G}_{1,n} \to 1/2$. That is, in the ideal situation of no external noise the system behaves with perfect memory, delivering 100% of the information to the last two nodes of the temporal chain. What happens is that the information remains trapped between nodes $n - 1$ and $n$, since the spatial mobility of information is still allowed in the last time frame of the temporal network. We have an equidistribution between the two ending nodes (1/2 and 1/2 in the limit β → ∞), as if what matters is the last edge more than the last node $n$.
However, if the level of noise is too high, i.e., T → ∞ (β → 0), then $\tilde{G}_{1,n} \sim (\beta^{\alpha})^{n-1} \to 0$ and $\tilde{G}_{1,1} \to 1$, so the rest of the nodes receive no information at all. That is, the system completely forgets to deliver the information, which is retained at the starting point.
The rate at which the system forgets with the increase of the temperature depends on the memory capacity it has, i.e., the value of α. Here we study $\tilde{G}_{1,n-1}(\alpha, \beta)$ as a function of both parameters α and β. As can be seen in figure 7, for high temperature, β → 0, the information delivered to the node $n - 1$ of the temporal chain is practically null. This confirms the intuition that high levels of noise degrade significantly the transmission of information across a temporal chain. The plot also confirms that the degradation of information is smaller when the memory capacity of the system is large, i.e., when α → 0.
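Using the row expressions for the moving-edge chain (under the same even/odd Mittag-Leffler reconstruction of $a$ and $b$ assumed earlier), one can scan $\tilde{G}_{1,n-1}(\alpha, \beta)$ over β and see both limits: vanishing transmission as β → 0 and saturation near 1/2 as β grows. A sketch with an illustrative α = 0.5:

```python
import math

def ml_even(alpha, beta, kmax=600):
    # a(alpha, beta): even part of the Mittag-Leffler series for a single edge
    s = 0.0
    for k in range(kmax):
        e = 2 * k * alpha
        term = beta**e / math.gamma(e + 1.0)
        s += term
        if k > 0 and term < 1e-16 * max(1.0, s):
            break
    return s

def ml_odd(alpha, beta, kmax=600):
    # b(alpha, beta): odd part of the Mittag-Leffler series
    s = 0.0
    for k in range(kmax):
        e = (2 * k + 1) * alpha
        term = beta**e / math.gamma(e + 1.0)
        s += term
        if k > 0 and term < 1e-16 * max(1.0, s):
            break
    return s

def g_second_last(alpha, beta, n=20):
    # normalized communicability from node 1 to node n-1 of the temporal chain
    a, b = ml_even(alpha, beta), ml_odd(alpha, beta)
    row = [a * b**(j - 1) for j in range(1, n)] + [b**(n - 1)]
    return row[n - 2] / sum(row)

cold = g_second_last(0.5, 4.0)    # low noise: close to the 1/2 plateau
warm = g_second_last(0.5, 1.0)
hot = g_second_last(0.5, 0.1)     # high noise: almost nothing delivered
```

The monotone growth of the delivered fraction with β, and its saturation just below 1/2, matches the trapping of information on the last edge described above.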

Real-world example
Here we consider two temporal networks corresponding to the trophallaxis (oral exchange of food) interactions between black carpenter ants, Camponotus pennsylvanicus, in two different colonies (see [44] for details on data collection). We selected these datasets for two main reasons. The first is that social insects are paradigmatic examples of self-organized complex systems. The second is that trophallaxis represents real social interactions in the colony, rather than mere proximity. These social interactions, which have revealed evidence of 'organizational immunity' in colony food flow, may be influenced by memory effects. The term organizational immunity was coined by Naug and Smith [45] to describe how social organization might interact with epidemiological variables, such as the infectious period, to create different risk categories within a social group. As they have also explained, this concept is related to that of herd immunity, where the immunological status of the majority reduces transmission of the pathogen to the few remaining susceptible individuals. That is, due to the different roles played by ants in the colonies, e.g., (active and inactive) foragers, nest workers and the queen, we hypothesize that their interactions follow specific patterns in time, which should be 'remembered' by the ants.
The first temporal network consists of 41 ants studied during eight consecutive nights, and the second one consists of 39 ants studied in a similar way. First, we obtain the values of G v,w as defined before, using the values of α h = 1 and α l = 0.4 (values of α l below 0.4 produced numbers too large to be handled numerically), for every pair of nodes in the two temporal networks. We then aggregate G v,w over all nodes v into a score G w for every node w, and observe that there are 16 nodes in colony 1 and 12 nodes in colony 2 for which G w > 0. We recall that such nodes are the ones that tend to be located at the end of temporal chains (see previous section). In other words, these nodes play the role of receivers of the 'information' transmitted through time in the network. Consequently, the rest of the nodes, for which G w ≤ 0, are mainly transmitters of this information or remain neutral (neither transmitters nor receivers) in the information-passing process.
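The splitting into receivers and spreaders can be sketched as follows, assuming G is the precomputed matrix of communicability differences for a colony and that the node score G_w is the column sum of G (the data here are random stand-ins, purely for illustration, not the ant data):

```python
import numpy as np

# Hypothetical classification: node w is a temporal 'receiver' when the
# aggregated score G_w = sum_v G[v, w] is positive, and a 'spreader'
# (or neutral) otherwise.
rng = np.random.default_rng(1)
G = rng.normal(size=(41, 41))       # 41 ants, as in colony 1

G_w = G.sum(axis=0)                 # aggregate over all source nodes v
receivers = np.flatnonzero(G_w > 0)
spreaders = np.flatnonzero(G_w <= 0)
```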
To further investigate the interactions among these two classes of nodes we split them according to the sign of G w for every time frame of each of the two colonies. In figures 9 and 10 we illustrate the networks at each temporal snapshot, representing the nodes with G w > 0 in red. The interactions between temporal spreaders (nodes with G w < 0) and temporal receivers (nodes with G w > 0) are represented by broken lines; those of the spreader-spreader type are in blue and the receiver-receiver ones in red. The fact that temporal receivers must receive most of the information from spreaders is confirmed by the observation that the percentage of receiver-receiver interactions is very low in every time step of both colonies. In colony 1, only 15.6% of all interactions are of the receiver-receiver type on average over the eight time frames (figure 9). In contrast, 38.5% of all interactions are between spreaders and receivers, and 45.9% are spreader-spreader ones (figure 10). The results for colony 2 are very similar to those for colony 1. For instance, in colony 2 the percentage of receiver-receiver links is only 16.9% on average over the eight time frames, while spreader-spreader links account for 43.3% and spreader-receiver ones for 39.7%. The exception is the last temporal snapshot, where receiver-receiver interactions account for 41.7% of the total, versus 37.5% spreader-spreader and only 20.8% spreader-receiver. This is expected, since no further temporal transmission can occur after the last snapshot. While the spreader-receiver interaction is a truly temporal transmission, and is thus strongly affected by a drop in the system's memory, the spreader-spreader and receiver-receiver ones are mainly spatial transmissions occurring within the same time frame.
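The per-snapshot percentages reported above reduce to a simple edge classification. A minimal sketch with made-up data, assuming the role of each node (receiver or spreader) has already been determined from the sign of its score:

```python
def edge_type_fractions(edges, is_receiver):
    """Fraction of receiver-receiver (rr), spreader-spreader (ss) and
    spreader-receiver (sr) links in one temporal snapshot."""
    counts = {"rr": 0, "ss": 0, "sr": 0}
    for v, w in edges:
        if is_receiver[v] and is_receiver[w]:
            counts["rr"] += 1
        elif not is_receiver[v] and not is_receiver[w]:
            counts["ss"] += 1
        else:
            counts["sr"] += 1
    total = sum(counts.values())
    return {k: c / total for k, c in counts.items()}

# Toy snapshot: nodes 0 and 1 are receivers, node 2 is a spreader.
fractions = edge_type_fractions(
    [(0, 1), (1, 2), (0, 2)], [True, True, False])
```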

Conclusions
According to our results, based on first-principles physical grounds, there is a relation between the temporal and spatial effects of the system's memory on the communication between nodes in temporal networks. While temporal interactions between nodes with relatively large separation in time are favored by a large system memory, the spatial interactions between nodes in the same temporal frame, or in frames close to each other in time, are favored by the capacity of the system to forget (low memory). We have used this interrelation to identify groups of nodes according to their main communication role in the temporal network. By using the difference between communicabilities at high and low memory capacities for every pair of nodes, we identify those nodes which mainly act as receivers of information at the end of temporal chains in the system. These nodes have very little communication among themselves; they mainly communicate with those acting as spreaders of information. Spreaders communicate intensively among themselves using spatial communication, and with receivers using temporal communication.
Our work also demonstrates that there is no such thing as a 'combinatorially correct expression' for calculating the communicability of a network. Both the exponential- and the resolvent-communicability are special cases of the generalized communicability of systems with memory. The first corresponds to a system with a high capacity for forgetting, and the second to a system with perfect memory. Such systems with perfect memory seem to be far from realistic real-world scenarios, as we have shown here from both analytical and empirical evidence. This work reveals that sometimes what is needed in exploring network concepts is more insight rather than more numbers, to paraphrase Coulson's saying 'give me insights, not numbers'.