Network Dynamics on Graphops

In this brief note, we report a formal mathematical observation: we are about to breach a major century-old barrier in the analysis of interacting particle systems. More precisely, it is well-known that in well-mixed/homogeneous/all-to-all-coupled systems, one may derive mean-field limit equations such as Vlasov-Fokker-Planck equations (VFPEs). A mesoscopic VFPE describes the probability of finding a single vertex/particle in a certain state, forming a bridge between microscopic statistical physics and macroscopic fluid-type approximations. One major obstacle in this framework is to incorporate complex network structures into limiting equations. In many cases, only heuristic approximations exist, or the limits rely on particular classes of integral operators. In this paper, we notice that there is a much more elegant, and profoundly more general, way available due to recent progress in the theory of graph limits. In particular, we show how one may easily enter complex network dynamics via graphops (graph operators) into VFPEs.


Introduction
Interacting particle systems, or more generally, dynamical systems on graphs/networks, form one of the major building blocks of modern science [36,2]. Within the last three decades, they have permeated virtually all disciplines, ranging from molecular scales [13] to neuroscience [33], systems biology [28], machine learning [17], social science [31], epidemiology [3], and transportation networks [29] up to climate science scales [12]. For all-to-all coupled systems, lattice systems, and various special classes of network dynamics with similar dynamics at each vertex, there is a well-developed theory to pass to a mean-field approximation in many theoretical frameworks [15,19,22,23,26,32]. One considers the limit of an infinite graph [35] n → ∞ so the particle/vertex number tends to infinity. If one is interested in the probability density ρ = ρ(u, t) to find a vertex in a certain dynamical state u ∈ R k at time t, one may often derive a differential equation for ρ. One common form for k = 1 is ∂ t = −∂ u (ρV (ρ)), (1) where the map V can be derived from the dynamics of each vertex [15,14]. If each vertex also is influenced by Gaussian noise, a second-order term ∂ uu (·) commonly appears as well in (1). The equation (1) is sometimes referred to as Liouville equation, continuity equation [34], and/or Vlasov equation, while the second-order equation is more known as Fokker-Planck and/or Kolmogorov equation. Here we shall refer to this class of equations as Vlasov-Fokker-Planck equation (VFPEs).
Now it is evident that one would like to derive a far more general form of (1), which also takes into account cases, where the adjacency matrix A = A (n) ∈ R n×n of the interaction between finitely many vertices is not just the the matrix of a full graph nor a highly symmetric structure such as a lattice. There have been many recent steps forward to achieve this goal. Several mathematical approaches have been successful providing rigorous proofs for VFPEs, where nonlocal integral terms appear to take into account the heterogeneous coupling structure [7,23]. However, a general theory, which would allow us to upgrade easily from particular cases or standard all-to-all mean-field limit VFPEs, to modern complex network structures is still lacking.
In this note, we propose that a path to achieve an elegant extension of all classical approaches is to incorporate an analytical approach to graph limit theory. Graph limit theory [24], i.e., taking n → ∞, has made significant progress in recent years. New limit structures for dense graphs via graphons [25], as well as for sparse graphs [4,5,16], have recently appeared. The theory is relatively technical and convergence notions are often hard to understand as they have relied on combinatorial structures [24]. Recently a new approach to unify and extend graph limit theory was proposed by Backhausz and Szegedy [1]. Their idea relies on viewing the adjacency matrix A (n) from an operator-theoretic perspective. In the simplest setting, one takes a vector v in R n and consider the 2 × n matrix formed from v and v ⊤ A (n) . Sampling the columns of this matrix uniformly at random generates a measure µ v . Hence, one may view graphs also via their associated measures. Different graphs, even of different sizes, can then be more easily compared as we only view them through their action; see Section 2 for more details and a brief review of the class of graphops (graph operators), which we shall rely on. Backhausz and Szegedy show that in a suitably chosen metric, large classes of graphs do have non-trivial subsequential limits where A (∞) can also be viewed as an operator, but now acting as an abstract operator on a function space. We are going to show (formally) that exactly this viewpoint is the missing ingredient to start a general theory of VFPEs, and related classes of evolution equations, as limits of finite-dimensional network dynamics.
We are going to illustrate our reasoning primarily on a benchmark class of interacting particle systems, the famous Kuramoto model of coupled oscillators [22,34]. This model is widely studied and has served as a benchmark case to understand self-organizing and adaptive nonlinear synchronization. It can be embedded into a more general class of kinetic-type models. We shall see, how these kinetic models also fit elegantly together with graphops. This approach really seems to break a barrier that has held back the application of tools from dynamical systems, functional analysis, and evolution equations to broad classes of large-scale network dynamics problems.
The paper is structured as follows: In Section 2, we review the relevant parts of graphop theory. In Section 3, we show how this theory also provides an elegant framework to deal with network dynamics on a finite-dimensional level, i.e., for finite graphs. For example, we show that the graphop viewpoint provides a completely natural way to intertwine in one set of equations the dynamics on the network and the correlation structure between network elements. Then we proceed to formally develop, why graphops do generalize previous approaches to VFPEs in Section 4. We conclude in Section 5 with an outlook towards future opportunities and challenges.

Graphs as Operators
We build upon the theory of graphops from [1]. Let (Ω, F , µ) be a probability space and let L p (Ω, R) = L p (Ω) denote the usual Lebesgue space for p ∈ [1, ∞]. A linear operator A : Key examples of P-operators are matrices, which we want to view as adjacency matrices of graphs. Indeed, consider Ω = {1, 2, . . . , n} =: [n] for n ∈ N with the uniform measure µ [n] and sigma-algebra given by the power set 2 [n] . Then a vector v ∈ R n defines a function v : ) any matrix A ∈ R n×n is a P-operator acting on (row) vectors via the usual rule Obviously, we can identify L ∞ ([n]) and L 1 ([n]) with R n and finite-dimensional linear operators given by matrices are bounded in the associated operator norm.
A key advantage of general P-operators are their convergence properties, when one considers sequences of these operators via profiles. As an example, we consider again A ∈ R n×n and any (row) vector v ∈ R n , then we have vA ∈ R 1×n so that we may form a matrix M ∈ R 2×n with rows v and vA. Now we sample columns of M uniformly, which yields a probability measure µ v on R 2 . The (1-)profile of A is given by the collection of measures {µ v : v ∈ R 2 }. One may generalize the notion of a 1-profile to a k-profile S k (A) for a general P-operator where one uses k vectors v 1 , v 2 , . . . , v k as the first k rows and v 1 A, v 2 A, . . . , v k A as k further rows. Hence, the matrix M becomes a matrix of size R 2k×n ; wlog one may restrict v j ∈ [−1, 1] n as the results do not change upon scaling vectors. A random column in M yields a probability measure on R 2k and S k (A) is the collection of all such measures.
For a general P-operator A, be the joint distribution of the 2k-tuple, which can be viewed as the pushforward (T k,A ) * µ of the measure µ under the map where x ∈ Ω. The k-profile S k (A) is the set of all probability measures of the form (2), where we go through all possible k-tuples of functions in L ∞ (Ω, [−1, 1]). In summary, the two crucial ideas are to view more analytically graphs as operators and to use profiles to compare matrices/graphs/P-operators regardless of their size. Let A, B be two P-operators. One defines a metric where d H is the Hausdorff distance for X, Y ∈ P(R k ) (i.e., X, Y are subsets of probability measures on R k ) and d LP is the Lévy-Prokhorov metric. This metric is actually quite natural and given for where B k is the Borel sigma-algebra on R k and U ε is the set of points having distance less than ε from U. It is well-known that convergence in the Lévy-Prokhorov metric essentially is equivalent to weak convergence of measures so to actually ensure the existence of non-trivial limits, it is well-balanced choice between retaining some complexity in the limit, yet still having some form of mean-field object.
The definition of P-operators generalizes in a natural way to A p→q . One may metrize the space of P-operators via and call the associated convergence notion action convergence. Then one can prove [1] that any sequence {A ( j)} ∞ j=1 with uniformly bounded norm · p→q for p ∈ [1, ∞) and q ∈ [1, ∞] has a limit [1, Thm. 2.14]. This convergence is seen to generalize the theory of graphons and even graphings (classical objects used for very sparse graphs) as well as many intermediate cases. For example, if we have a graphon [24] W : Ω × Ω → R with finite norm then we can define an associated P-operator by It is easy to see that the norm A W p→q is finite in this case and A W is just an operator-theoretic viewpoint on the L p -graphon W . Again, compactness results hold for the case of · p→q so that we get the existence of a limiting graphon/P-operator.
There are also special classes of P-operators, which are particularly relevant. One important class are graphops, which are positivity preserving and self-adjoint P-operators, which behave effectively like undirected graphs. This makes sense since in the finite-dimensional case, adjacency matrices of undirected graphs are graphops.
Once we have graphops, it is natural to ask, how these operators are going to appear in dynamical systems on networks. It is the main theme of this paper to understand their appearance in dynamics on a formal level (rigorous proofs seem to be out of reach in full generality at this point but it is expected that eventually such proofs will be available).

Microscopic Network Dynamics
We always denote the dynamical state of a vertex by u k (t) for k ∈ [n] and denote the current time by t ∈ [0, T ) with some fixed T > 0.

The Kuramoto Model
The classical Kuramoto model considers oscillators on a circle so u k (t) ∈ S 1 := R/(0 ∼ 2π). The dynamics is given by where ω k are given internal frequencies of the individual oscillators and p 0 ≥ 0 is a parameter controlling the coupling. The Kuramoto model on complex networks is usually written as follows [30] where p ≥ 0 is a parameter controlling the coupling, which may contain a certain scaling in n depending upon the type of graph defined by the adjacency matrix A = (A kj ) k,j∈ [n] . Of course, on a finite-dimensional level the interpretation of adjacency matrices as graphops on ([n], 2 [n] , µ [n] ), as discussed in Section 2, applies. So if we consider a matrix of phase differences this yields upon inserting it into the Kuramoto model where diag(M) is the vector obtained from the elements on the diagonal of the matrix M. Note that (V, AV ) contains the same information as where we used A = A ⊤ as our graphs are undirected. Hence, (V, AV ) can be viewed as an n-profile if we uniformly sample from it. Since the matrix of phase differences V does evolve in time, we see that the input to the dynamics of each oscillator is (in addition to its intrinsic frequency) driven by moving through a subset of the n-profiles. Even beyond identifying the driving, we can go further by setting ω = (ω 1 , . . . , ω n ), u = (u 1 , . . . , u n ), d = ((AV ) 11 , . . . , (AV ) nn ), to re-write our differential equations as u ′ = ω+pd, so right-multiplication and time-differentiation of the 1-profile (u, uA) gives (u, (uA)) = (u ′ , (uA) ′ ) = (ω + pd, ωA, +pdA).
Therefore, this defines an evolution equation for the pair (u, uA), which yields an evolution equation on the space of measures (T 1,A ) * µ also written by which are associated to a 1-profile. In summary, the dynamics of the Kuramoto model on finite graphs induces an evolution equation on measures, which is somewhat different from using the classical empirical measure 1 n n j=1 δ u j , which keeps only track of the positions. Indeed, the 1-profile also contains the action of A on u given by uA and thereby the relevant correlation structure embodied by the joint empirical distribution (aka the"JEDi") given by (12). The importance of this novel viewpoint of the dynamics is that once we are on a graph, then we have to intertwine structure and dynamics [20], thereby capturing how vertices respond to connectivity. We will see that this theme re-appears on other modelling scales and for VFPEs below.

The Cucker-Smale Model
The Cucker-Smale model [8] is one well-known model for swarming. One variant of it reads where u k = u k (t) is the position of agent k at time t and ψ is a given function regulating the type of communication between different agents. More generally, one may pose the model on a network via We setã kj := a kj ψ(|u k − u j |), and obtain by re-writing the model in vectorized form as a firstorder system and by right-multiplication with A, the following set of equations where Diag(M) denotes the matrix obtained from M by only keeping the entries on the diagonal and setting all other entries to zero. Of course, the equations for (uA, vA) still depend directly on values of u via the definition ofÃ. It is interesting to see that we effectively obtain now an evolution of measures via the 2-profile (u, v, uA, vA). The structure of the equations entails that the second-order nature of the model transfers to profiles, i.e., there is a constraint on the 2-profile in the same sense as for usual second-order ODEs.

Kinetic Models
Instead of particular models, one can also look at broader classes of interacting systems frequently employed in kinetic theory [6,15,32]. A form commonly found in the literature is: Evidently, one can extend this to a model with a complex network structure by writing In this general abstract form, if we set f (u j , u k ) =: f jk = f jk (u) we get with a matrix F (u) = (f jk (u)) that u ′ = diag(AF (u)) and (uA) ′ = diag(AF (u))A, as the evolution equation via the 1-profile (u, uA). Of course, this formulation is far too general to see any special structure regarding the evolution equation induced on the measures contained in profiles. If we linearize the evolution equations around a steady state u * we obtain So locally near steady states the evolution equation for the 1-profile are identical for both components, regardless of the precise underlying kinetic/particle model. This already hints at the fact, that profiles and the operator-theoretic viewpoint via graphops is well-suited for nonlinear dynamics and functional analysis techniques. This viewpoint will re-appear again for VFPEs below.

Mesoscopic Network Dynamics
Having observed that graphops and profiles can provide a very elegant re-interpretation as well as augmentation of finite-dimensional network dynamics models, we now proceed to formally take limits n → ∞ to obtain "mean-field" models. We consider the mesoscale, in the sense that we are no longer interested in the state of individual vertices on a graph but only in the probability distribution of the state of a vertex.

The Kuramoto Model
We have already seen in Section 3.1 that the classical Kuramoto model provides a good starting point. It is well-studied, and its homogeneous mean-field limiting equation is well-known. We recall its classical formal derivation [22,34] here because this derivation will be important for us to reflect back upon in the context of graphops below.

All-to-All Coupling
Starting from (9), suppose we have very large number of oscillators and their intrinsic frequencies are distributed according to a density g = g(ω) with g(ω) = g(−ω). One introduces the complex order parameter as Multiplying the order parameter by e iu k , and taking imaginary parts, one easily checks that now the Kuramoto model (9) can be re-written as In particular, the k-th oscillator feels all other oscillators via a single mean-field parameter. Next, let ρ(u, t, ω) du denote the fraction of oscillators with frequency ω between u and u + du at time t. Then by construction ρ is a probability density 2π 0 ρ(u, t, ω) du = 1, ρ ≥ 0.
In the limit n → ∞, we can formally re-write the order parameter (16) as Indeed, the last formula can be derived as a limit of the sum in (16) using the law of large numbers [34]. Yet, it is very crucial to note that this law of large numbers argument only gives us a mean-field as it produces only the mean of the order parameter completely discarding correlations. Furthermore, the approach implicitly assumed that each random variable has finite variance. Yet, if these assumptions hold then we know that the resulting Liouville/continuity equation for the probability density ρ for the Kuramoto model should be given by This equation can be re-written using the order parameter (18), multiplying by a suitable exponential and taking the imaginary part as The previous argumentation can be made fully rigorous to derive the mean-field Vlasov-type equation (19). Yet, the argument is highly non-trivial and the currently most elegant way of proof proceeds via a so-called Dobrushin-type bound [10,11], which is effectively a Gronwall estimate on a metric space of measures, usually carried out in a Wasserstein metric [15,21,27]. We remark that if one would consider the Kuramoto model with stochastic perturbations, then one is going to obtain a (nonlinear) Fokker-Planck equation [14], which appears in many classical stochastic coupled oscillator models, where the coupling appears through a mean field, such as the Desai-Zwanzig model [9]. Hence, one sometimes refers to (19) as a Vlasov-Fokker-Planck (VFPE) equation.

Complex Network Heuristics
Having dealt with the classical all-to-all coupled VFPE case, it is natural to think about extensions to treat the Kuramoto model (10) on complex networks. Recall that we have assumed that the underlying graph is undirected and unweighted so that the adjacency matrix A = (A ij ) i,j∈[n] is symmetric and binary A ij ∈ {0, 1} for all i, j ∈ [n]. There exists a known formal approach [30] to try to "save" the classical mean-field argument from Section 4.1.1. One defines a local order parameter as If the graph is sufficiently well-connected and effectively still behaves like an all-to-all coupled system, one is then tempted to make the ansatz of a single global field where κ k is the degree of vertex k, i.e., the global field is locally proportional to the local field weighted by the degree. With this assumption one obtains that (17) can now be written as u ′ k = ω k + p 0 rκ k sin(ψ − u k ). In particular, these steps motivate that one might want to use the local degree of a vertex as the next approximation criterion to derive a mean field [18]. Hence, one option is to consider a probability density ρ(u, t, ω, κ) of oscillators having phase u, frequency ω, and degree κ at time t with the usual conditions 2π 0 ρ(u, t, ω, κ) du = 1, ρ ≥ 0.
Evidently one cannot really do much with standard tools as just defining ρ still does not lead to a mean-field VFPE. If one assumes that the network is uncorrelated and has a degree distribution d(κ), then the probability that an edge has its end a vertex of phase u, degree κ and frequency ω at time t is κd(κ) κ g(ω)ρ(u, t, ω, κ) where κ is the average degree of a vertex in the graph. This suggests that it might be helpful to define the order parameter now as Now the same trick as previously, multiplying by an exponential and taking imaginary parts, yields VFPE equation ∂ t ρ = −∂ u (ρ(ω + pκr sin(ψ − u))) .
The right-hand side of the VFPE could again be expressed now as some integral. Indeed, one may formally write the equation for the evolution of the average phase in the limit n → ∞ as which arises as a formal replacement of the sum in the Kuramoto model (10). This suggests a closed VFPE in the form The last considerations already show an emerging theme that we shall exploit later on: there is an effective integral operator acting on the density ρ, which depends upon the graph structure, in the mean-field VFPE equation on complex networks. In comparison to the classical all-to-all coupled VPDE (19), we have replaced in (21) ρ(ũ, t,ω) by ∞ 0 d(κ)κ κ ρ(ũ, t,ω,κ) dκ.

Kuramoto on Graphons
Regarding the previous considerations in Section 4.1.2 on integral operators, it is now no surprise, how the VFPE equation on graphons should look [7]. Let W : Ω × Ω → R be a graphon as defined in (7) and recall that it defines an associated P-operator A W via (8). For graphons we know that we may identify Ω = [0, 1] with the unit interval. So for x ∈ Ω one now obtains the VFPE for a density ρ = ρ(u, t, ω, x) of oscillators having phase u, frequency ω, and graphon position x at time t. The position x ∈ [0, 1], where we evaluate a graphon has the natural interpretation in the graph limit as the position/index of a vertex, while W (x, y) provides a limiting version of the adjacency matrix, i.e., how x is connected to y. In comparison to the classical all-to-all coupled VPDE (19), we have replaced in (22) ρ(ũ, t,ω) by Now the generalization to far more general classes beyond graphons is becoming physically evident.

Kuramoto on Graphops
Let A : L p (Ω) → L q (Ω) be a graphop as defined in Section 2. Suppose it arises as a limit from a sequence of adjacency matrices where convergence is understood with respect to the metric d M . From the Kuramoto model on complex networks (10) one now formally obtains the VFPE for a density ρ = ρ(u, t, ω, x) of oscillators having phase u, frequency ω, and graphop coordinate x at time t. The position x ∈ Ω is now more abstract, yet it still has a meaning in the sense that it should present the position/index of a vertex. Instead of an integral operator representation, for a P-operator, we have to think of the action of a graph, which is actually the primary insight that recently arose in the graph theory literature as discussed in Section 2. In (23), we see, how this insight can be carried over to easily writing abstract VFPE mean-field type equations, even if the graph is neither all-to-all, nor uncorrelated, nor homogeneous, nor dense, nor a special sparse graph. In comparison to the classical all-to-all coupled VPDE (19), we have abstractly replaced in (23) ρ(ũ, t,ω) by (A (∞) ρ)(ũ, t,ω,x).
Note that this formulation is much cleaner, more general, and much easier to comprehend in comparison to other approaches.

Kinetic Models on Graphops
The approach in the previous section for the Kuramoto model is now relatively easy to formally generalize to more abstract kinetic models such as (15). One simply takes the normal VFPE for (14), which would read ∂ t ρ = −∂ u (ρV (ρ)), (24) where V is completely computable from f [15]. Then one lifts the action of the finite-dimensional adjacency matrices A (n) onto a limiting graphop A (∞) , and replaces the density in the vector field driven part V of the VFPE All the operator-theoretic properties of the finite-dimensional graphs A (n) , which persist in the limit A (∞) , actually can now be used to analyze the VFPE. We know from [1] that graphops do carry a lot of information via the graph limit for large classes of graphs, so we may strongly expect that an analysis of (25) can now proceed along classical lines of dynamical systems theory. For example, steady states ρ * will satisfy Linearization is possible via the normal Fredholm derivative in many cases. Suppose that we have an evolution equation in a suitable Banach space X where the P-operator part is such that it defines a well-defined evolution in X . Then the linearization is just formally given by the chain rule and product rule ∂ t Ξ = −∂ u (ΞV (A (∞) ρ * ) + ρ * D ρ V (A (∞) Ξ)) =: L ∞ Ξ for Ξ = Ξ(t) ∈ X . The linear operator L ∞ can now be analyzed using classical tools from functional analysis such as spectral theory. As in Section 3.3 on the finite-dimensional level, the operator L ∞ carries in an intertwined way the information about • the shape of the steady state via ρ ∞ , • the linearization of the coupling function f via V ′ , • the graph spectral information via A (∞) .
Hence, the graphop viewpoint provides in a completely natural way the opportunity to lift spectral information from the finite-dimensional setting to the graph limit in the context of stability analysis.

Outlook
We have shown that the viewpoint of a more abstract operator-theoretic approach is extremely elegant to provide a general theory for network dynamics, when we do not have a simple coupling structure. This applies to the finite-dimensional microscopic case as well as the limiting case for VFPEs.
Although we have physically derived, by using analogies to previous approaches in the meanfield context for homogeneous/all-to-all coupling, a suitable VFPE formulation on a limiting graphop, several questions remain. From a kinetic theory perspective, we would like to prove that the limiting VFPE indeed approximates over a finite time scale, the underlying finite-dimensional network dynamics for large n. We conjecture that similar proofs 1 via a Dobrushin-type estimate can also work in the graphop context, when the structure of the graph limit theory is taken into account. From the dynamics perspective, it would be very interesting to study spectral theory for operators involving graphops as one element of their definition in more detail as we have indicated in the last section. Furthermore, from the physical perspective it is important to gain a better understanding of graph limit procedures in the sense of particle interactions. We believe that the finite-dimensional observations we provided hint at a correlation structure viewpoint being particularly important to provide the correct physical interpretation. From the viewpoint of direct applications in various sciences, it seems reasonable to assume that testing examples and analyzing finite-size effects are going to be of particular practical importance.