Directed network Laplacians and random graph models

We consider spectral methods that uncover hidden structures in directed networks. We establish and exploit connections between node reordering via (a) minimizing an objective function and (b) maximizing the likelihood of a random graph model. We focus on two existing spectral approaches that build and analyse Laplacian-style matrices via the minimization of frustration and trophic incoherence. These algorithms aim to reveal directed periodic and linear hierarchies, respectively. We show that reordering nodes using the two algorithms, or mapping them onto a specified lattice, is associated with new classes of directed random graph models. Using this random graph setting, we are able to compare the two algorithms on a given network and quantify which structure is more likely to be present. We illustrate the approach on synthetic and real networks, and discuss practical implementation issues.


Motivation
Uncovering structure by clustering or reordering nodes is an important and widely studied topic in network science [1,2].The issue is especially challenging if we move from undirected to directed networks, because there is a greater variety of possible structures.For example, even a simple motif of three connected nodes has 13 distinct forms [3, Figure 1(a)].Moreover, when spectral methods are employed, directed edges lead to asymmetric eigenproblems [4,5,6,7].Our objective in this work is to study spectral (Laplacian-based) methods for directed networks that aim to reveal clustered, directed, hierarchical structure; that is, groups of nodes that are related because, when visualized appropriately, one group is seen to have links that are directed towards the next  group.This hierarchy may be periodic or linear, depending on whether there are well-defined start and end groups.Figures 1a and 1b illustrate the two cases.Mapping a network to a linear structure may help us understand the upstreamness and downstreamness of nodes, which is useful, for example, in the study of cascading effects such as social or financial contagion [8].Similarly, periodic hierarchies have been associated with sustainability and risk management issues in commerce [9], and also with the existence of echo chambers in online social media [10].
Of course, on real data these structures may not be so pronounced; hence in addition to visualizing the reordered network, we are interested in quantifying the relative strength of each type of signal.Laplacian-based methods are often motivated from the viewpoint of optimizing an objective function.This work focuses on two such methods.Minimizing frustration leads to the Magnetic Laplacian which may be used to reveal periodic hierarchy [11,5].Minimizing trophic incoherence leads to what we call the Trophic Laplacian, which may be used to reveal linear hierarchy [6].We will exploit the idea of associating a spectral method with a generative random graph model.This in turn allows us to compare the outputs from spectral methods based on the likelihood of the associated random graph.This connection was proposed in [12] to show that the standard spectral method for undirected networks is equivalent to maximum likelihood optimization assuming a class of range-dependent random graphs (RDRGs) introduced in [13].The idea was further pursued in [14], where a likelihood ratio test was developed to determine whether a network with RDRG structure is more linear or periodic.
The main contributions of this work are as follows.
• We propose two new directed random graphs models.One model has the unusual property that the probability of an i → j connection is not independent of the probability of the reciprocated j → i connection.
• We establish connections between these random graph models and algorithms from [11] and [6] that use the Magnetic Laplacian and Trophic Laplacian, respectively, by showing that reordering nodes or mapping them onto a specific lattice structure using these algorithms is equivalent to maximizing the likelihood that the network is generated by the models proposed.
• We show that by calibrating a given network to both models, it is possible to quantify the relative presence of periodic and linear hierarchical structures using a likelihood ratio.
• We illustrate the approach on synthetic and real networks.
The rest of the manuscript is organised as follows.In the next section, we introduce the Magnetic and Trophic Laplacian algorithms.Section 3 defines the new classes of random directed graphs and establishes their connection to these spectral methods.Illustrative numerical results on synthetic networks are given in Section 4, and in Section 5 we show results on real networks from a range of applications areas.We finish with a brief discussion in Section 6.

Notation
We consider an unweighted directed graph G = (V, E) with node set V and edge set E, with no self-loops.The adjacency matrix A is n × n with A ij = 1 if the edge i → j is in E, and A ij = 0 otherwise.It is convenient to define the symmetrized adjacency matrix W (s) = (A + A T )/2.The symmetrized degree matrix D is diagonal with D ii = d i , where ij is the average of the in-degree and out-degree of node i.Later, we will consider weighted networks for which each edge i → j has associated with it a non-negative weight w ij .In this case, we let A ij = w ij .We use i to denote √ −1, and we write x H to denote the conjugate transpose of a vector x ∈ C n .We use P to denote the set of all permutation vectors, that is, all vectors in R n with distinct components given by the integers 1, 2, . . ., n.

Spectral Methods for Directed Networks
Spectral methods explore properties of graphs through the eigenvalues and eigenvectors of associated matrices [15,16,1,2].In the undirected case, the standard graph Laplacian L = D − A is widely-used for clustering and reordering, along with normalized variants.The directed case has received less attention; however, several extensions of the standard Laplacian have been proposed [7].We focus on two spectral methods for directed networks, which are discussed in the next two subsections: the Magnetic Laplacian algorithm, which reveals periodic flow structures [11,5], and the Trophic Laplacian algorithm, which reveals linear hierarchical structures [6].We choose to study these two algorithms because they have an optimization formulation and, as we show in section 3, may be interpreted in terms of random graph models.Here we briefly mention two other related techniques that do not fit naturally into this framework.The Hermitian matrix method groups nodes into clusters with a strong imbalance of flow between clusters [4].This approach constructs a skew-symmetric matrix that emphasizes net flow between pairs of nodes but ignores reciprocal edges.A spectral clustering algorithm motivated by random walks was derived in [17] leading to a graph Laplacian for directed networks that was proposed earlier in [18].

The Magnetic Laplacian
Given a network and a vector of angles θ = (θ 1 , θ 2 , ..., θ n ) T in [0, 2π), we may define the corresponding frustration where δ ij = −2πgα ij with g ∈ [0, 1  2 ].Here α ij = 0 if the edge between i and j is reciprocated, that is For convenience we also set α ij = 0 if i and j are not connected.To understand the definition (1), suppose that for a given graph we wish to choose angles that produce low frustration.Each term W (s) ij |e iθi − e iδij e iθj | 2 in (1) can make a positive contribution to the frustration if W (s) ij = 0; that is, if i and j are involved in at least one edge.In this case, if there is an edge from i to j that is not reciprocated, then we can force this term to be zero by choosing θ j = θ i + 2πg.If the edge is reciprocated, then we can force the term to be zero by choosing θ j = θ i .Hence, intuitively, choosing angles to minimize the frustration can be viewed as mapping the nodes into directed clusters on the unit circle in such a way that (a) nodes in the same cluster tend to have reciprocated connections, and (b) unreciprocated edges tend to point from source nodes in one cluster to target nodes in the next cluster, periodically.Setting the parameter g = 1/k for some positive integer k indicates that we are looking for k directed clusters.
On a real network it is unlikely that the frustration (1) can be reduced to zero, but it is of interest to find a set of angles that give a minimum value.This minimization problem is closely related to the angular synchronization problem [19,20], which estimates angles from noisy measurements of their phase differences θ i − θ j mod 2π.Moreover, we note that for visualization purposes it makes sense to reorder the rows and columns of the adjacency matrix based on the set of angles that minimizes the frustration.We also note that in [11] the expression in (1) for the frustration is normalized through a division by 2 i d i .This is immaterial for our purposes, since that denominator is independent of the choice of θ.
The frustration (1) is connected to the Magnetic Laplacian, which is defined as follows, where A• B denotes the elementwise, or Hadamard, product between matrices of the same dimension; that is, (A Definition 2.1.Given g ∈ [0, 1  2 ], the Magnetic Laplacian L (g) [11,5] is defined as where Here, the transporter matrix T (g) assigns a rotation to each edge according to its direction.
It is straightforward to show that L (g) is a Hermitian matrix.When g = 0 and the graph is undirected, the Magnetic Laplacian reduces to the standard graph Laplacian.
The following result, which is implicit in [11,5], shows that the frustration (1) may be written as a quadratic form involving the Magnetic Laplacian.
Theorem 2.1.Let ψ ∈ C n be such that ψ j = e iθj , then Appealing to the Rayleigh-Ritz theorem [21] the quadratic form on the left hand side of (2) is minimized over all ψ ∈ C n with ψ 2 = 1 by taking ψ to be an eigenvector corresponding to the smallest eigenvalue of the Magnetic Laplacian.Now, such an eigenvector will not generally be proportional to a vector with components of the form {e iθj } n j=1 .However, a useful heuristic is to force this relationship in a componentwise sense; that is, to assign to each θ j the phase angle of ψ j , effectively solving a relaxed version of the desired minimization problem.This leads to Algorithm 1 below, as used in [11].

The Trophic Laplacian
The idea of discovering a linear directed hierarchy arises in many contexts where edges represent dominance or approval, including the ranking of sports teams [22] and web pages [23].A particularly well-defined case is the quantification of trophic levels in food webs, where each directed edge represents a consumerresource relationship [24,25,26].We focus here on the approach in [6], where the aim is to assign a trophic level h i to each node i such that along any directed edge the trophic level increases by one.This motivates the minimization of the trophic incoherence Denoting the total weight of node i as ω i = j∈V (A ji + A ij ) and the imbalance as χ i = j∈V (A ji − A ij ), the trophic level vector h ∈ R n that minimizes the trophic incoherence solves the linear system of equations where Λ = diag(ω) − A − A T , and the solution to (4) is unique up to a constant shift [6].Since it employs a Laplacian-style matrix, Λ, we refer to it as the Trophic Laplacian algorithm; see Algorithm 2.
Algorithm 2: Trophic Laplacian algorithm Result: The trophic levels h Input adjacency matrix A; Calculate the node weights ω i = j A ji + j A ij ; Calculate the node imbalances χ i = j A ji − j A ij ; Calculate the Trophic Laplacian Λ = diag(ω) − A − A T ; Solve the linear system (4); Reorder or visualize nodes using h

Random Graph Interpretation
In this section, we associate two new random graph models with the Magnetic and Trophic Laplacian algorithms, using a similar approach to the work in [12].After establishing these connections, we proceed as in [14] and propose a maximum likelihood test to compare the two models on a given network.

The Directed pRDRG Model
Given a set of phase angles {θ i } n i=1 , we will define a model for unweighted, directed random graphs.The model generates connections between each pair of distinct nodes i and j with four possible outcomes-a pair of reciprocated edges, an unreciprocated edge from i to j, an unreciprocated edge from j to i, or no edges-as follows where f , q and l are functions that define the model, and, of course, they must be chosen such that all probabilities lie between zero and one.We emphasize that this model has a feature that distinguishes it from typical random graph models, including directed Erdős-Rényi and small-world style versions [27]: the probability of the edge i → j is not independent of the probability the edge j → i, in general.We are interested here in the inverse problem where we are given a graph and a model ( 5)-( 8), and we wish to infer the phase angles.This task arises naturally when the nodes are supplied in some arbitrary order.We will assume that the phase angles are to be assigned values from a discrete set {ν i } n i=1 ; that is, we must set θ i = ν pi , where p is a permutation vector.This setting includes the cases of (directed) clustering and reordering.For example, with n = 12, we could specify and ν 10 = ν 11 = ν 12 = 3π/2, in order to assign the nodes to four directed clusters of equal size.Alternatively, ν i = (i − 1)2π/12 would assign the nodes to equally-spaced phase angles, as shown in Figure 2a, as a means to reorder the graph.The following theorem shows that solving this type of inverse problem for suitable f , q and l is equivalent to minimizing the frustration.Theorem 3.1.Suppose θ ∈ R n is constrained to take values such that θ i = ν pi , where p is a permutation vector.Then minimizing the frustration η(θ) in ( 1) over all such θ is equivalent to maximizing the likelihood that the graph came from a model of the form ( 5)-( 8) in the case where with β ij = θ i − θ j and normalization constant for any positive constant γ.
Proof.We first note that, since ji for i = j, and W ii = 0, we may express η(θ) (1) in terms of a sum over ordered pairs: Then, distinguishing between the three different ways in which each i and j may be connected, we have The likelihood L of the graph G from a model of the form ( 5)-( 8) is given by which we may rewrite as The final factor on the right hand side, which is the probability of the null graph, takes the same value for any θ ∈ R n such that θ i = ν pi , since each ordered pair of arguments appears exactly once.We may therefore ignore this factor when maximizing the likelihood.Then, taking the logarithm and negating, we see that maximizing the likelihood is equivalent to minimizing the expression Comparing terms in ( 12)-( 14) and ( 10)- (11) we see that the two minimization problems are equivalent if where we may choose any positive constant γ since the minimization problems are scale invariant.Solving for f , q and l as functions of θ i and θ j we arrive at the model in the statement of the theorem.
For the model in Theorem 3.1, the probability of an edge from node i to node j depends on the phase difference β ij = θ i − θ j , the decay rate γ, and the parameter g.We see that γ determines how rapidly the edge probability varies with the phase difference.In the extreme case when γ = 0, we obtain f (θ i , θ j ) = q(θ i , θ j ) = l(θ i , θ j ) = 1/4, and thus the model reduces to a conditional Erdős-Rényi form.In addition, as γ increases the graph generally becomes more sparse.This is because the likelihood of disconnection, exp[2γ(1 − cos(θ i − θ j ))]/Z ij , is greater than or equal to that of the other cases.
We note that having applied the Magnetic Laplacian algorithm to estimate θ, there are two straightforward approaches to estimating γ.One way is to maximize the graph likelihood over γ > 0. Another is to choose γ so that the expected edge density from the random graph model matches the edge density of the given network.We illustrate these approaches in Section 4.
Remark 3.1.Since the edge probabilities are functions of the phase differences and have a periodicity of 2π, this model resembles the periodic Range-Dependent Random Graph (pRDRG) model in [14], which generates an undirected edge between i and j with probability f (min{|j − i|, n − |j − i|}) for a given decay function f .We will therefore use the term directed periodic Range-Dependent Random Graph model (directed pRDRG) to describe the model in Theorem 3.1.

The Trophic Range-dependent Model
Now, given a set of trophic levels {h i } n i=1 , we define an unweighted, directed random graph model where for some function f .Here, the probability of an edge i → j is independent of the probability of the edge j → i.
Following our treatment of the directed pRDRG case, we are now interested in the inverse problem where we are given a graph and the model ( 15)-( 16), and we wish to infer the trophic levels.We will assume that the trophic levels are to be assigned values from a discrete set {ν i } n i=1 ; that is, we must set h i = ν pi , where p is a permutation vector.This setting includes the cases of assignment of nodes to trophic levels of specified size; for example, with n = 12, we could set and ν 10 = ν 11 = ν 12 = 4, in order to assign the nodes to four equal levels.Alternatively, ν i = i would assign each node to its own level, which is equivalent to reordering the nodes.The following theorem shows that solving this type of inverse problem for suitable f is equivalent to minimizing the trophic incoherence.Theorem 3.2.Suppose h ∈ R n is constrained to take values such that h i = ν pi , where p is a permutation vector.Then minimizing the trophic incoherence F (h) in ( 3) over all such h is equivalent to maximizing the likelihood that the graph came from a model of the form (15)-( 16) in the case where for any positive γ.
Proof.Noting that the denominator in (3) is independent of the choice of h, this result is a special case of Theorem 3.4 below, with For the model in Theorem 3.2, the probability of an edge i → j is a function of the shifted, directed, squared difference in levels, (h j − h i − 1) 2 .The larger this value, the lower the probability.Within the same level, where h i = h j , the probability is 1/(1 + e γ ).The edge probability takes its maximum value of 1/2 when h j − h i = 1, that is, when the edge starts at one level and finishes at the next highest level.We also see that the overall expected edge density is always smaller than 1/2.Across different levels, where h i = h j , the edge i → j and the edge j → i are not generated with the same probability.If |h j − h i − 1| < |h i − h j − 1|, the edge i → j is more likely than j → i.The two edge probabilities are equal if and only if h i = h j .Therefore, this model could be interpreted as a combination of an Erdős-Rényi model within the same level and a periodic range-dependent model across different levels.
The parameter γ controls the decay rate of the likelihood as the shifted, directed, squared difference in levels increases.When h j − h i = 1, γ plays no role.If γ = 0, the model reduces to Erdős-Rényi with an edge probability of 1/2.As γ → ∞, the edge probability tends to zero if h j − h i = 1.In this case, the model will generate a multipartite graph where edges are only possible in one direction between adjacent levels, and this happens with probability 1/2.As mentioned previously in subsection 33.1 and illustrated in Section 4, γ can be fitted from a maximum likelihood estimate or by matching the edge density.
We note that the definition of trophic incoherence in (3) and the resulting Trophic Laplacian algorithm make sense for a non-negatively weighted graph, in which case we have the following result.Here, to be concrete we assume that weights lie strictly between zero and one.Similar results can be obtained for weights from a discrete distribution.
Theorem 3.3.Suppose h ∈ R n is constrained to take values such that h i = ν pi , where p is a permutation vector.Then minimizing the trophic incoherence F (h) in ( 3) over all such h for a weighted graph with weights in (0, 1) is equivalent to maximizing the likelihood that the graph came from a model where each edge weight A ij is independent with density function for any positive γ, where is a normalization factor.
Proof.This is a special case of Theorem 3.5 below, where

Generalised Random Graph Model
The results in subsections 33.1 and 3.2 exploit the form of the objective function: the sum over all edges of a kernel function can be viewed as the sum of log-likelihoods.This shows that the minimization problem is equivalent to maximizing the likelihood of an associated random graph model, in the setting where we assign nodes to a discrete set of scalar values.The restriction to discrete values is used in the proofs to make the probability of the null graph constant.However, we emphasize that in practice the relaxed version of the optimization problems, which are solved by the two algorithms, do not have this restriction.
The Magnetic Laplacian algorithm produces real-valued phase angles and the Trophic Laplacian algorithm produces real-valued trophic levels.
We may extend the connection in Theorem 3.2 to the case of higher dimensional node attributes, that is, where we wish to associate each node with a discrete vector from a set {ν [k] } n k=1 , where each ν [k] ∈ R d for some d ≥ 1.This setting arises, for example, if we wish to visualize the network in higher dimension; a natural extension of the ring structure would be to place nodes at regularly spaced points on the surface of the unit sphere, see Figure 2b, which we produced with the algorithm in [28].The next result generalizes Theorem 3.2 to this case.Theorem 3.4.Suppose we have an unweighted directed graph with adjacency matrix A and a kernel function I : R d × R d → R + , and suppose that we are free to assign elements {h [k] } n k=1 to values from the set {ν [k] } n k=1 ; that is, we allow where p is a permutation vector.Then minimizing i,j over all such {h [k] } n k=1 is equivalent to maximizing the likelihood that the graph came from a model where the (independent) probability of the edge i → j is for any positive γ.
Proof.Given {h [k] } n k=1 , the probability of generating a graph G from the model stated in the theorem is The second factor on the right hand side, the probability of the null graph, does not depend on the choice of {h [k] } n k=1 .So we may ignore this factor, and after taking logs and negating we arrive at the equivalent problem of minimizing Comparing ( 20) and ( 18), we see that two minimization problems have the same solution when for any positive γ, and the result follows.
For the model in Theorem 3.4, given {h [k] } n k=1 the edge i → j appears according to a Bernoulli distribution with probability f (h [i] , h [j] ), and hence with variance When I(h [i] , h [j] ) = 0 the probability is 1/2 and the variance takes its largest value, 1/4.The edge probability is symmetric about i and j if and only if the function I is symmetric about its arguments.In the case of squared Euclidean distance, I(h , and an undirected graph, the relaxed version of the minimization problem is solved by taking d eigenvectors corresponding to the smallest eigenvalues of the standard graph Laplacian.
For completeness, we now state and prove a weighted analogue of Theorem 3.4 assuming that weights lie strictly between zero and one.Discrete-valued weights may be dealt with similarly.Theorem 3.5.Suppose {h [k] } n k=1 may take values from the given set {ν where p is a permutation vector.Then, given a weighted graph with weights in (0, 1), minimizing the expression ( 18) over all such {h [k] } n k=1 is equivalent to maximizing the likelihood that the graph came from a model where A ij has (independent) density , for x ∈ (0, 1), and f (x) = 0 otherwise, for any positive γ, where is a normalization factor.
Proof.It is straightforward to check that the normalization factor Z ij ensures Now the product over all pairs i,j Z ij is independent of the choice of permutation vector p.Hence, under the model defined in the theorem, maximizing the likelihood of the graph G is equivalent to maximizing i,j f ij (A ij ).After taking logarithms and negating, we see that the choice (21) allows us to match (18).
Remark 3.2.It is natural to ask whether the frustration (1) fits into the form (18), and hence has an associated random graph model of the form (19).We see from ( 9) that the frustration may be written However, the factor |e iθi − e iδij e iθj | 2 depends (through δ ij ) on A ij , and hence we do not have expression of the form (18).This explains why a new type of model, with conditional dependence between the i → j and j → i connections, was needed for Theorem 3.1.

Model Comparison
The random graph models appearing in Section 3 capture the characteristics of linear and periodic directed hierarchies.Hence it may be of interest (a) to analyse properties of these models and (b) to use these models to evaluate the performance of computational algorithms.However, in the remainder of this work we focus on a follow-on topic of more direct practical significance.The Magnetic Laplacian and Trophic Laplacian algorithms allow us to compute node attributes θ and h in R n for a given graph, leading to unsupervised node ordering.The main computation required in this step is finding dominant eigenvector-eigenvalue pairs.Assuming that the network is sparse (each node has an O(1) degree) and that the power method gives the required accuracy in a finite number of iterations, this is an O(n) computation.Motivated by Theorems 3.1 and 3.2, we may then compute the likelihood of the graph for this choice of attributes, which has a complexity of O(n 2 ).By comparing likelihoods we may quantify which underlying structure is best supported by the data.An extra consideration is that both random graph models involve a free parameter, γ > 0, which is needed to evaluate the likelihood.As discussed earlier, one option is to fit γ to the data, for example by matching the expected edge density from the model with the edge density of the given graph.However, based on our computational tests, we found that a more reliable approach was to choose the γ that maximizes the likelihood, once the node attributes were available; see Sections

Results on Synthetic Networks
In this section, we demonstrate the model comparison workflow on synthetic networks.These networks are generated using the directed pRDRG model and the trophic RDRG model.Hence, we have a "ground truth" concerning whether a network is more linear or periodic.Note that the Magnetic Laplacian algorithm and associated random graph model have a parameter g that controls the spacing between clusters.Therefore, when using the Magnetic Laplacian algorithm our first step is to select the parameter g based on the maximum likelihood of the graph.

Directed pRDRG Model
We generate a synthetic network using the directed pRDRG model with K clusters of size m, and hence n = mK nodes.An array of angles θ ∈ R n is created, forming evenly spaced clusters C 1 , C 2 , ..., C K .This is achieved by letting θ i = 2π(l−1) K + σ if i ∈ C l , where σ ∼ unif(−a, a) is added noise.We then construct the adjacency matrix according to the probabilities in Theorem 3.1 with g = 1/K.We choose m = 100, K = 5, γ = 5 and a = 0.2 and the corresponding adjacency matrix is shown in Figure 3a.
The Magnetic Laplacian algorithm is then applied to the adjacency matrix to estimate phase angles and reorder the nodes.The reordered adjacency matrix (Figure 3b) recovers the original structure.The Trophic Laplacian algorithm is also applied to estimate the trophic level of each node.Figure 3c shows the adjacency matrix reordered by the estimated trophic levels, which hides the original pattern.Intuitively, the Trophic Laplacian algorithm is unable to distinguish between these nodes since there is no clear "lowest" or "highest" level among the directed clusters.
Figure 3d illustrates how the optimal parameter g is selected.The plots show the likelihood that the network is generated by a directed pRDRG model for g = 1 2 , 1 3 , 1 4 , 1 5 , 1 6 , assuming we are interested in structures with at most 6 directed clusters.We see that g = 1  5 has the highest maximum likelihood, as expected.Consequently, we choose g = 1/5 for the Magnetic Laplacian algorithm.In addition for this value of g we plot in Figure 3e the phase angles estimated with the Magnetic Laplacian algorithm against the true phase angles.The linear relationship confirms that the algorithm recovers the 5 clusters in the presence of noise.
We finally in Figure 3f compare the likelihood of a directed pRDRG against the likelihood of a trophic RDRG.Both likelihoods are calculated using several test points for γ.The highest points are highlighted with circles and they correspond to the maximum likelihood estimators (MLE) for γ.Not surprisingly, in this case the Magnetic Laplacian algorithm achieves a higher maximum.Asterisks highlight the point estimates arising when the expected number of edges is matched to the actual number of edges.We see here, and also observed in similar experiments, that the maximum likelihood estimate for γ produces a more accurate result.We also found (numerical experiments not presented here) that

The Trophic RDRG model
Following on from the previous subsection, we now generate synthetic data by simulating the trophic RDRG model with levels C 1 , C 2 , . . ., C K , where each level has m nodes.In particular, we generate an array of trophic indices h ∈ R n , where the total number of nodes is n = mK.We let h i = l + σ if i ∈ C l for 1 ≤ l ≤ K, where σ ∼ unif(−a, a) is added noise.The edges are then generated according to the probabilities in Theorem 3.2.In the following example we use K = 5, m = 100, a = 0.2 and γ = 5.This generates a network with 5 clusters forming a linear directed flow, as shown in Figure 4a.
We see in Figure 4c that the Trophic Laplacian algorithm recovers the underlying pattern.Figure 4b shows that the Magnetic Laplacian algorithm also gives adjacent locations to nodes in the same cluster, and places the clusters in order, modulo a "wrap-around" effect that arises due to its periodic nature.Figure 4d suggests that the optimal Magnetic Laplacian parameter is g = 1/6.For this case, it is reasonable that g = 1/K is not identified, since the disconnection between the first and the last cluster contradicts the structure of the directed pRDRG model.
The trophic levels estimated using the Trophic Laplacian are consistent with   4e.As expected, the Trophic Laplacian produces a higher maximum likelihood for this network (Figure 4f) and a more accurate MLE and point estimate for γ.We observe (in similar experiments not presented here) that when using the Trophic Laplacian, the accuracy of both estimates increase using the Trophic Laplacian.

Results on Real Networks
We now discuss practical use cases for the model comparison tool on a range of real networks.We emphasize that the tool is not designed to discover whether a given directed network has linear or directed hierarchical structure; rather it aims to quantify which of the two structures is best supported by the data in a relative sense.Since both models under investigation assume no self-loops, we discard these if they are present in the data.Following common practice, we also preprocess by retaining the largest strongly connected component to emphasize directed cycles.This ensures that any pair of nodes can be connected through a sequence of directed edges.However, when the strongly connected component contains too few nodes, we analyze the largest weakly connected component instead.
We give details on four networks, covering examples of the two cases where linear and periodic structure dominates.For the first two networks, we show network visualizations to illustrate the results further.In subsection 5.5 we present summary results over 15 networks.

Food Web
In the Florida Bay food web1 [29], nodes are components of the system, and unweighted directed edges represent carbon transfer from the source nodes to the target nodes [30], which usually means that the latter feed on the former.Besides organisms, the nodes also contain non-living components, such as carbon dissolved in the water column.Since we are more interested in the relationship between organisms, we remove those non-living components from the network.We analyze the largest strongly connected component of the network, which comprises 12 nodes and 28 edges.
We estimate the phase angles of each node using the Magnetic Laplacian algorithm based on the optimal choice g = 1/3 (Figure 5a). Figure 5b compares the likelihood of the food web being generated by the directed pRDRG model with the likelihood of it being generated by the trophic RDRG model, as γ varies.The directed pRDRG model achieves a higher maximum likelihood, suggesting that the structure is more periodic than linear.In Figure 5c, the heights of the nodes correspond to their estimated trophic levels on a vertical axis.We see that 22 edges point upwards, these are shown in blue.There are 6 downward edges, highlighted in red, which violate the trophic structure.The Magnetic Laplacian mapping in Figure 5d arranges 26 edges in a counterclockwise direction, shown in blue, with 2 edges, shown in red, violating the structure and pointing in the reverse orientation.
With g = 1/3, the Magnetic Laplacian mapping is encouraging cycles in the food chain, and these are visible in Figure 5d, notably between members of three categories: (i) flatfish and other demersal fishes; (ii) lizardfish and eels; and (iii) toadfish and brotalus.Another noticeable distinction is that the Magnetic Laplacian mapping positions eels close to lizardfish, and flatfish near other demersal fishes by accounting for the reciprocal edges, while the Trophic Laplacian mapping places them further apart.In Figures 5e and 5f we show the reordered adjacency matrix arising from the two algorithms.

Influence Matrix
The influence matrix we study quantifies the influence of selected system factors in the Motueka Catchment of New Zealand [31].The original influence matrix consists of integer scores between 0 and 5, measuring to what extent the row factors influence the column factors, where a bigger value represents a stronger impact.The system factors and influence scores were developed by pooling the views of local residents.To convert to an unweighted network, we binarise the weights by keeping only the edges between each factor and the factor(s) it influences most strongly.We then select the largest strongly connected component, which comprises 14 nodes and 35 edges.The optimal parameter for the Magnetic Laplacian is g = 1/4 (Figure 6a).The mapping from the Magnetic Laplacian has a higher maximum likelihood than the Trophic Laplacian mapping, indicating a more periodic structure (Figure 6b).The Trophic Laplacian mapping in Figure 6c aims to reveal a hierarchical influence structure.Here, scientific research and economic inputs are assigned lower trophic levels, suggesting that they are the fundamental influencers.The labour market is placed at the top, indicating that it tends to be influenced by other factors.However, there are 8 edges, highlighted in red, that point downwards, violating the directed linear structure.
On the other hand, the Magnetic Laplacian mapping in Figure 6d aims to reveal four directed clusters with phase angles of approximately 0, π/2, π, 3π/2.We highlight the nodes corresponding to ecological factors in red and socialeconomic factors in blue.The cluster near π/2 with 6 nodes contains a combination of ecological and social-economic factors, and includes 6 reciprocal edges between ecological factors and social-economic factors.Adjacency matrix reorderings are shown in Figures 6e and 6f.Overall, the pattern agrees with the conceptual schematic model proposed in [31, Figure 5], which we have reproduced in Figure 7.This model posits that ecological factors exert influence on social-economic factors, which in turn influence on ecological factors, while the ecological system also influences itself.

Yeast Transcriptional Regulation Network
We now analyze a gene transcriptional regulation network2 [29] for a type of yeast called S. cerevisiae [32], where a node represents an operon made up of a group of genes in mRNA.An edge from operon i to j indicates that the transcriptional factor encoded by j regulates i.The original network is directed and signed, with signs indicating activation and deactivation.Here we ignore the signs and only consider the connectivity pattern.Since the largest strongly connected component has very few nodes, we take the largest weakly connected component, which comprises 664 nodes and 1078 edges.This is a very sparse network and consequently the log-likelihood of the directed pRDRG (Figure 8a) keeps increasing as a function of the decay rate parameter γ in the range we tested.We select g = 1/3 as the optimal parameter for the Magnetic Laplacian, and compare the log-likelihood of two models in Figure 8b.This time the trophic version achieves a higher maximum likelihood, favouring a linear structure.

C. elegans Frontal Neural Network
C. elegans is the only organism whose neural network has been fully mapped.The neural network of C. elegans3 [29] is unweighted and directed, representing connections between neurons and synapses [33].We investigate its largest strongly connected component with 109 nodes and 637 edges.The optimal value    for the parameter g among the test points is g = 1/5 (Figure 9a).The Trophic Laplacian algorithm achieves a higher maximum likelihood than the Magnetic Laplacian algorithm using (Figure 9b).This preference for a linear directed structure is consistent with the tube-like shape of the organism [34].

Other Real Networks
A summary of further real-world network comparisons is given in Table 1.In the Data set column, we use (s) and (w) to indicate whether the largest strongly or weakly connected component is analysed, respectively.The fourth column specifies the optimal parameter g for the Magnetic Laplacian determined through grid search among the test points g = 1/2, 1/3, 1/4, 1/5, 1/6.The decay parameter γ used for the grid search ranges from 0 to 20 with a step size of 0.5.The last column shows the logarithm of the ratio between the maximum likelihoods of the directed pRDRG and trophic models.Hence, periodic/linear structure is seen to be favoured for the networks in the first 8 rows/last 7 rows.

Discussion
Spectral methods can be used to extract structures from directed networks, allowing us to detect clusters, rank nodes, and visualize patterns.This work exploited a natural connection between spectral methods for directed networks and generative random graph models.We showed that the Magnetic Laplacian and Tropic Laplacian can each be associated with a range-dependent random graph.In the Magnetic Laplacian case, the new random graph model has the interesting property that the probabilities of i → j and j → i connections are not independent.Our theoretical analysis provided a workflow for quantifying the relative strength of periodic versus linear directed hierarchy, using a likelihood ratio, adding value to the standard approach of visualizing a new graph layout or reordering the adjacency matrix.We demonstrated the model comparison workflow on synthetic networks, and also showed examples where real networks were categorized as more linear or periodic.The results illustrate the potential for the approach to reveal interesting patterns in networks from ecology, biology, social sciences and other related fields.
There are several promising directions for related future work.It would be of interest to use the likelihood ratios to compare this network feature across a well-defined category in order to address questions such as "are results between top chess players more or less periodic than results between top tennis players?" and "does an organism that is more advanced in an evolutionary sense have more periodic connectivity in the brain?"An extension of the comparison tool to weighted networks should also be possible; here there are notable, and perhaps application-specific, issues about how to generalize and interpret the Magnetic Laplacian.Also, the comparison could be extended to include other types of structure, including stochastic block and core-periphery versions [39].This introduces further challenges of (a) accounting for different numbers of model parameters, and (b) dealing with nonlinear spectral methods.Further, by introducing an appropriate null model it may be possible to quantify the presence of linear or periodic hierarchies in absolute, rather than relative, terms.

Figure 1 :
Figure 1: Directed networks with (a) periodic hierarchy (edges point from nodes in one cluster to nodes in the next cluster, counterclockwise) and (b) linear hierarchy (edges point from nodes in one level to nodes in the next highest level).Node colours indicate the three clusters.

Figure 2 :
Figure 2: (a) Points uniformly distributed on the unit circle and (b) a sphere.

Figure 3 :
Figure 3: Magnetic Laplacian and Trophic Laplacian algorithms applied to a synthetic directed pRDRG

Figure 4 :
Figure 4: Magnetic Laplacian and Trophic Laplacian algorithms applied to a synthetic trophic RDRG

Figure 5 :
Figure 5: Results for the Florida Bay food web

Figure 6 :Figure 7 :
Figure 6: Results for the Motueka Catchment influence matrix

Figure 8 :
Figure 8: Results for a yeast transcriptional regulation network

Figure 9 :
Figure 9: C. elegans frontal neural network 4 and 5 for examples.Our overall proposed workflow for model comparison is summarized in Algorithm 3.

Table 1 :
Comparison summary statistics.Periodic (linear) directed structure is found to be preferred for networks in the first 8 (last 7) rows.