Capturing the interplay of dynamics and networks through parameterizations of Laplacian operators

ABSTRACT
We study the interplay between a dynamical process and the structure of the network on which it unfolds using the parameterized Laplacian framework. This framework allows for defining and characterizing an ensemble of dynamical processes on a network beyond what the traditional Laplacian is capable of modeling. This, in turn, allows for studying the impact of the interaction between dynamics and network topology on the quality-measure of network clusters and centrality, in order to effectively identify important vertices and communities in the network. Specifically, for each dynamical process in this framework, we define a centrality measure that captures a vertex's participation in the dynamical process on a given network and also define a function that measures the quality of every subset of vertices as a potential cluster (or community) with respect to this process. We show that the subset-quality function generalizes the traditional conductance measure for graph partitioning. We partially justify our choice of the quality function by showing that the classic Cheeger's inequality, which relates the conductance of the best cluster in a network with a spectral quantity of its Laplacian matrix, can be extended to the parameterized Laplacian. The parameterized Laplacian framework brings under the same umbrella a surprising variety of dynamical processes and allows us to systematically compare the different perspectives they create on network structure.


INTRODUCTION
As flexible representations of complex systems, networks model entities and relations between them as vertices and edges. In a social network, for example, vertices are people, and the edges between them represent friendships. As another example, the World Wide Web is a collection of web pages with hyperlinks between them. An unprecedented amount of such relational data is now available. While discovery and fortune await, the challenge is to extract useful information from these large and complex data.

Centrality and community detection are two of the fundamental tasks of network analysis. The goal of centrality identification is to find important vertices that control the dynamical processes taking place on the network. PageRank (Page et al., 1999) is one such measure developed by Google to rank web pages. Other centrality measures, such as degree centrality, Katz score and eigenvector centrality (Katz, 1953; Bonacich, 1972; Bonacich and Lloyd, 2001; Ghosh and Lerman, 2012), are used in communication networks for studying how each vertex contributes to the routing of information. Identifying central

defines a community
Empirical evaluation on real-world networks (Section 6): We apply our framework to study the structure of several real-world networks. They come from different domains that embody a variety of dynamical processes and interactions. We contrast the central vertices and communities identified by different dynamical processes and provide an intuitive explanation for their differences. Keep in mind that we do not claim any specific centrality or community structure measure to be the "best". We think every outcome is potentially interesting among many possible perspectives.

We acknowledge that the theorems and algorithms presented in this paper are evolutions of existing work. The emphasis of this paper, however, is on a theoretical framework that brings together important concepts in network science. While the parameterized Laplacian framework described in this paper cannot model every dynamical process of interest, it is still flexible enough to include a variety of dynamical processes which are seemingly unrelated. It allows us to systematically study and compare these processes under a unified framework. We hope this study will lead to better approaches for defining and understanding the general interaction between dynamics and topologies.

2 BACKGROUND AND RELATED WORK
Before introducing our framework, we briefly review some closely related models. We will later show that these existing models are special cases under the parameterized Laplacian framework. The intuition about these well-known systems is helpful for understanding the motivation behind the framework.

We represent a network as a weighted, undirected graph $G = (V, E, \mathbf{A})$ with n vertices, where for i, j ∈ V, $a_{ij}$ assigns a non-negative weight (affinity) to each edge (i, j) ∈ E. We follow the tradition that $a_{ij} = 0$ if and only if (i, j) ∉ E; i.e., $\mathbf{A}$ is the weighted symmetric adjacency matrix. We assume $a_{ii} = 0$ for all i ∈ V. In the discussion below, the (weighted) degree of vertex i ∈ V is defined as the total weight of edges incident on it, that is, $d_i = \sum_j a_{ij}$. A dynamical process describes a state variable $\theta_i(t)$ associated with each vertex i. This variable changes its value based on interactions with the vertex's neighbors according to the rules of the dynamical process. (The state vector represents a probability vector in random walks and a belief vector in consensus processes.)

In this paper, since we view dynamics as operators on the vector composed of vertex state variables, we adopt the linear algebra convention, i.e., we use column state vectors $\theta(t)$ and left-multiply them by matrix operators. (This contrasts with the engineering convention, where row vectors and right-multiplication are standard.) Table 1 summarizes the terms and notation.

One of the most widely studied dynamical processes on networks is the random walk. The simplest is the discrete time unbiased random walk (URW), where a walker at vertex i follows one of the edges with a probability proportional to the weight of the edge (Ross, 2014; Aldous and Fill, 2002).

Manuscript to be reviewed
Computer Science

In this case, the state vector $\theta$ forms a distribution whose expected value follows the update equation:
$$\theta_i(t+1) = \sum_j P_{ij}\,\theta_j(t)$$
Here $\mathbf{P}$ is a stochastic matrix whose entry $P_{ij}$ is the transition probability for a walker to go from vertex j to i, $P_{ij} = a_{ij}/d_j$.

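The update equation of an unbiased random walk leads to the difference equation
$$\theta(t+1) - \theta(t) = -(\mathbf{I} - \mathbf{A}\mathbf{D}_A^{-1})\,\theta(t) = -\mathbf{L}^{RW}\theta(t)$$
where $\mathbf{L}^{RW}$ is the normalized random walk Laplacian matrix with $\mathbf{L}^{RW} = \mathbf{I} - \mathbf{A}\mathbf{D}_A^{-1}$.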
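The update rule can be illustrated with a minimal pure-Python sketch. The four-vertex graph, the variable names and the iteration count are illustrative assumptions and do not come from the paper; the sketch only relies on the column-stochastic convention $P_{ij} = a_{ij}/d_j$ stated in the text.

```python
# Hypothetical example graph: a triangle 0-1-2 with a pendant vertex 3 on vertex 2.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
n = len(A)
d = [sum(row) for row in A]  # weighted degrees d_i

# Column-stochastic transition matrix: P[i][j] is the probability of moving from j to i.
P = [[A[i][j] / d[j] for j in range(n)] for i in range(n)]


def step(theta):
    """One synchronous update theta(t+1) = P theta(t)."""
    return [sum(P[i][j] * theta[j] for j in range(n)) for i in range(n)]


theta = [1.0, 0.0, 0.0, 0.0]  # the walker starts at vertex 0
for _ in range(200):
    theta = step(theta)

# Degree-proportional distribution d_i / sum_j d_j, the expected limit on a
# connected, non-bipartite graph.
stationary = [di / sum(d) for di in d]
```

On this graph the iterates should approach the degree-proportional distribution while the total probability stays equal to one at every step.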

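To go from a discrete time synchronous random walk to continuous time dynamics, we introduce a waiting time function for the asynchronous jumps performed by the walk (Ross, 2014). Assuming a simple Poisson process where the waiting times between jumps are exponentially distributed with the PDF $f(t, \tau_i) = \frac{1}{\tau_i} e^{-t/\tau_i}$, we can rewrite the above difference equation as a differential equation,
$$\frac{d\theta}{dt} = -(\mathbf{I} - \mathbf{A}\mathbf{D}_A^{-1})\,\mathbf{T}^{-1}\theta = -\mathbf{L}^{RW}\mathbf{T}^{-1}\theta$$
where $\mathbf{T}$ is the diagonal matrix of mean waiting times $\tau_i$.

Another closely related class of discrete time dynamical processes is the so-called "consensus process" (DeGroot, 1974; Lambiotte et al., 2011; Olfati-Saber et al., 2007; Krause, 2008). A consensus process models coordination across a network where each vertex updates its "belief" based on the average "beliefs" of its neighbors. Unlike random walks, which conserve the total state value throughout the network (since the state vector is always a distribution), the consensus process follows the update equation
$$\theta_i(t+1) = \frac{1}{d_i}\sum_j a_{ij}\,\theta_j(t)$$
This leads to the difference equation
$$\theta(t+1) - \theta(t) = -(\mathbf{I} - \mathbf{D}_A^{-1}\mathbf{A})\,\theta(t) = -\mathbf{L}^{CON}\theta(t)$$
where $\mathbf{L}^{CON}$ is the consensus Laplacian matrix with $\mathbf{L}^{CON} = \mathbf{I} - \mathbf{D}_A^{-1}\mathbf{A}$.

Consensus can also be turned into asynchronous continuous time dynamics. Again, assuming a Poisson process where the update interval at each vertex is exponentially distributed as $f(t, \tau_i) = \frac{1}{\tau_i} e^{-t/\tau_i}$, we can rewrite the above difference equations as differential equations,
$$\frac{d\theta}{dt} = -\mathbf{T}^{-1}(\mathbf{I} - \mathbf{D}_A^{-1}\mathbf{A})\,\theta = -\mathbf{T}^{-1}\mathbf{L}^{CON}\theta$$
The consensus process always converges to a uniform "belief" state with the value
$$\theta_\infty = \frac{\sum_i d_i \tau_i\, \theta_i(0)}{\sum_j d_j \tau_j}$$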
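The contrast with the random walk can be sketched in a few lines of pure Python. This is a minimal illustration of the discrete time consensus update (each vertex takes the weighted average of its neighbors' beliefs), assuming unit delays; the graph and the initial beliefs are hypothetical values chosen for the example.

```python
# Hypothetical example graph: a triangle 0-1-2 with a pendant vertex 3 on vertex 2.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
n = len(A)
d = [sum(row) for row in A]  # weighted degrees d_i


def consensus_step(theta):
    """theta_i(t+1) = (1/d_i) * sum_j a_ij * theta_j(t)."""
    return [sum(A[i][j] * theta[j] for j in range(n)) / d[i] for i in range(n)]


beliefs = [1.0, 0.0, 0.5, 0.2]  # illustrative initial beliefs

# The degree-weighted sum of beliefs is invariant under the update, so the
# limit (when it exists) is the degree-weighted average of the initial beliefs.
weighted_avg = sum(d[i] * beliefs[i] for i in range(n)) / sum(d)

theta = beliefs[:]
for _ in range(200):
    theta = consensus_step(theta)
```

Unlike the random walk, the plain sum of the state vector is not conserved here; only the degree-weighted sum is, which is why high-degree vertices pull the final value toward their initial belief.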

In network clustering and community detection, previous work has focused on identifying subsets of vertices S ⊆ V that interact more frequently with vertices in the same community than with vertices in other subsets (Fortunato, 2010; Porter et al., 2009). A standard approach to clustering defines an objective function that measures the quality of a cluster. For a subset S ⊆ V, let $\bar{S} = V \setminus S$ denote the complement of S, which consists of vertices that are not in S. Let $\mathrm{cut}(S, \bar{S}) = \sum_{i \in S, j \in \bar{S}} a_{ij}$ denote the total interaction strength of all edges used by S to connect with the outside world. Let $\mathrm{vol}(S) = \sum_{i \in S} d_i = \sum_{i \in S, j \in V} a_{ij}$ denote the volume of weighted "importance" for all vertices in S.

One popular measure of the quality of a subset S as a potential good cluster (or a community) (Kannan et al., 2004; Spielman and Teng, 2004; Chung, 1997) is to use the ratio of these two quantities:
$$\phi(S) = \frac{\mathrm{cut}(S, \bar{S})}{\min(\mathrm{vol}(S), \mathrm{vol}(\bar{S}))}$$
For example, a subset that (approximately) minimizes this quantity, the conductance of S, is a desirable cluster, as it maximizes the fraction of affinities within the subset. If interactions among vertices are proportional to their affinity weights, then a set with small conductance also means that its members interact significantly more with each other than with outside members. The smallest achievable ratio over all possible subsets is also known as the isoperimetric number. As an important measure for mixing time in classic Markov chains, conductance has proven mathematical bounds in terms of the second eigenvalue of its Laplacian (Cheeger, 1970; Jerrum and Sinclair, 1988; Lawler and Sokal, 1988). Other well-known quality functions are normalized cut (Shi and Malik, 2000) and ratio-cut, given respectively by
$$\frac{\mathrm{cut}(S, \bar{S})}{\mathrm{vol}(S)} + \frac{\mathrm{cut}(S, \bar{S})}{\mathrm{vol}(\bar{S})} \quad \text{and} \quad \frac{\mathrm{cut}(S, \bar{S})}{\min(|S|, |\bar{S}|)}.$$
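The quantities above are straightforward to compute directly. The following sketch assumes the common form of conductance, cut(S, S̄) divided by the smaller of the two volumes; the function names and the six-vertex example graph (two triangles joined by one edge) are hypothetical.

```python
# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
A = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]


def cut(A, S):
    """Total weight of edges leaving the vertex set S."""
    inside = set(S)
    return sum(A[i][j] for i in inside for j in range(len(A)) if j not in inside)


def vol(A, S):
    """Sum of weighted degrees of the vertices in S."""
    return sum(sum(A[i]) for i in S)


def conductance(A, S):
    """cut(S, S-bar) / min(vol(S), vol(S-bar)), assuming S is a proper, non-empty subset."""
    inside = set(S)
    outside = set(range(len(A))) - inside
    return cut(A, inside) / min(vol(A, inside), vol(A, outside))


phi_triangle = conductance(A, {0, 1, 2})  # one bridging edge, volume 7 on each side
phi_single = conductance(A, {0})          # both edges of vertex 0 are cut
```

Splitting the graph along the bridging edge gives a much smaller conductance than isolating a single vertex, which matches the intuition that a good cluster keeps most of its affinity inside.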

While most community detection algorithms do not explicitly model the dynamical process that defines the interactions between vertices, the connection between conductance and unbiased random walks is quite well studied (Kannan et al., 2004; Spielman and Teng, 2004; Chung, 1997). In particular, Chung's work on heat kernel page rank and Cheeger inequality, where a dynamical system is built using the normalized Laplacian, provides a theoretical framework for provably good approximations to the isoperimetric number (Chung, 2007). Intuitively, the relationship between clustering and dynamics can be captured as: a community is a cluster of vertices that "trap" a random walk for a long period of time before it jumps to other communities (Lovász, 1993

Consider a linear dynamical process of the following form:
$$\frac{d\theta}{dt} = -\mathcal{L}\theta$$
where $\theta$ is a column vector of size n containing the values of the dynamical variable for all vertices, and $\mathcal{L}$ is a positive semi-definite matrix, the spreading operator, which defines the dynamical process.

As discussed in the introduction, we focus on dynamical processes that generalize the traditional normalized Laplacian for diffusion and random walks. Recall that the symmetric normalized Laplacian matrix of a weighted graph $G = (V, E, \mathbf{A})$ is defined as
$$\mathbf{L}^{SYM} = \mathbf{D}_A^{-1/2}(\mathbf{D}_A - \mathbf{A})\mathbf{D}_A^{-1/2}$$
We study the properties of a dynamical process that can be further parameterized as:
$$\mathcal{L} = (\mathbf{D}_W\mathbf{T})^{-1/2-\rho}\,(\mathbf{D}_W - \mathbf{W})\,(\mathbf{D}_W\mathbf{T})^{-1/2+\rho}$$
We call this operator, with parameters ρ, $\mathbf{T}$, $\mathbf{W}$, the parameterized Laplacian and denote it by $\mathcal{L}$ in the rest of the paper. Here $\mathbf{T}$ is the n × n diagonal matrix of vertex delay factors. Its i-th diagonal element $\tau_i$ represents the average delay of vertex i. We assume that the operator is properly scaled: specifically, $\tau_i \ge 1$ for all i ∈ V. Another generalization from the traditional Laplacian is the use of the interaction matrix $\mathbf{W}$ instead of the adjacency matrix $\mathbf{A}$. In theory, $\mathbf{W}$ can be any n × n symmetric positive matrix.

Note that the degree matrix $\mathbf{D}_W$ is now also defined in terms of the interaction matrix, that is, $d^W_i = \sum_j w_{ij}$.
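Because all scaling matrices are diagonal, the operator can be assembled entry by entry. The sketch below assumes the form $\mathcal{L} = (\mathbf{D}_W\mathbf{T})^{-1/2-\rho}(\mathbf{D}_W - \mathbf{W})(\mathbf{D}_W\mathbf{T})^{-1/2+\rho}$, chosen to be consistent with the "consensus" (ρ = 1/2), "symmetric" (ρ = 0) and "random walk" (ρ = −1/2) interpretations named in the text; the function name, the graph and the delay values are hypothetical.

```python
def parameterized_laplacian(W, tau, rho):
    """Entry-wise construction: L_ij = s_i^(-1/2-rho) * (d_i*[i==j] - w_ij) * s_j^(-1/2+rho),
    where s_i = d_i^W * tau_i is the i-th diagonal entry of D_W T."""
    n = len(W)
    d = [sum(row) for row in W]             # interaction degrees d_i^W
    s = [d[i] * tau[i] for i in range(n)]   # diagonal of D_W T
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            lap = (d[i] if i == j else 0.0) - W[i][j]
            L[i][j] = s[i] ** (-0.5 - rho) * lap * s[j] ** (-0.5 + rho)
    return L


# Hypothetical interaction graph and delay factors (all tau_i >= 1).
W = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
tau = [1.0, 2.0, 1.5, 3.0]

L_con = parameterized_laplacian(W, tau, 0.5)    # "consensus"
L_sym = parameterized_laplacian(W, tau, 0.0)    # "symmetric"
L_rw = parameterized_laplacian(W, tau, -0.5)    # "random walk"

n = len(W)
row_sums_con = [sum(L_con[i]) for i in range(n)]
col_sums_rw = [sum(L_rw[i][j] for i in range(n)) for j in range(n)]
```

Two quick sanity checks follow from the construction: the rows of the consensus operator sum to zero (a uniform belief vector is stationary), and the columns of the random walk operator sum to zero (total probability is conserved).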

While the ρ parameter can technically be any real number, in this work we limit ourselves to three special cases: ρ = 1/2, 0, −1/2. These cases correspond to three equivalent linear operators with "consensus", "symmetric" and "random walk" interpretations respectively.

seemingly unrelated dynamics, such as "consensus" and "random walk". To see this, we refer to the idea of matrix similarity.

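In linear algebra, similarity is an equivalence relation for square matrices. Two n × n matrices $\mathbf{X}$ and $\mathbf{Y}$ are similar if
$$\mathbf{Y} = \mathbf{S}^{-1}\mathbf{X}\mathbf{S}$$
for some invertible n × n matrix $\mathbf{S}$. Recall that under our framework, the symmetric version of the parameterized Laplacian matrix is
$$\mathcal{L}^{SYM} = \mathbf{T}^{-1/2}\mathbf{D}_W^{-1/2}(\mathbf{D}_W - \mathbf{W})\mathbf{D}_W^{-1/2}\mathbf{T}^{-1/2}$$
We can rewrite the operator describing random walk dynamics as:
$$\mathcal{L}^{RW} = (\mathbf{D}_W - \mathbf{W})(\mathbf{D}_W\mathbf{T})^{-1} = (\mathbf{D}_W\mathbf{T})^{1/2}\,\mathcal{L}^{SYM}\,(\mathbf{D}_W\mathbf{T})^{-1/2}$$
Thus, the continuous time random walk with delay factors $\mathbf{T}$ is similar to the symmetric normalized Laplacian. Similarly, we can rewrite the continuous time consensus dynamics under our framework as
$$\mathcal{L}^{CON} = (\mathbf{D}_W\mathbf{T})^{-1}(\mathbf{D}_W - \mathbf{W}) = (\mathbf{D}_W\mathbf{T})^{-1/2}\,\mathcal{L}^{SYM}\,(\mathbf{D}_W\mathbf{T})^{1/2}$$
The fact that the "consensus," "symmetric" and "random walk" operators are similar means that they model the same dynamics on a network, provided that we observe them in a consistent basis.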
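The similarity relations can be checked numerically on a small example. The sketch below builds the three operators directly from their definitions, assuming the change of basis is the diagonal matrix $(\mathbf{D}_W\mathbf{T})^{1/2}$; the graph, the delay values and the helper name `max_abs_diff` are hypothetical.

```python
import math

# Hypothetical interaction graph and delay factors (all tau_i >= 1).
W = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
tau = [1.0, 2.0, 1.5, 3.0]
n = len(W)
d = [sum(row) for row in W]
s = [d[i] * tau[i] for i in range(n)]  # diagonal of D_W T
lap = [[(d[i] if i == j else 0.0) - W[i][j] for j in range(n)] for i in range(n)]

# The three formulations, written entry-wise.
L_sym = [[lap[i][j] / math.sqrt(s[i] * s[j]) for j in range(n)] for i in range(n)]
L_rw = [[lap[i][j] / s[j] for j in range(n)] for i in range(n)]   # (D_W - W)(D_W T)^-1
L_con = [[lap[i][j] / s[i] for j in range(n)] for i in range(n)]  # (D_W T)^-1 (D_W - W)

# Conjugating the symmetric operator by (D_W T)^{1/2} and its inverse.
rw_from_sym = [[math.sqrt(s[i]) * L_sym[i][j] / math.sqrt(s[j]) for j in range(n)]
               for i in range(n)]
con_from_sym = [[L_sym[i][j] * math.sqrt(s[j]) / math.sqrt(s[i]) for j in range(n)]
                for i in range(n)]


def max_abs_diff(X, Y):
    """Largest entry-wise absolute difference between two matrices."""
    return max(abs(X[i][j] - Y[i][j]) for i in range(len(X)) for j in range(len(X)))


traces = [sum(M[i][i] for i in range(n)) for M in (L_sym, L_rw, L_con)]
```

Since similar matrices share their spectrum, the three traces agree as well; only the coordinates in which the state vector is expressed change.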

The random walk Laplacian matrix provides a physical intuition for our framework. An unbiased random walk on the interaction graph $\mathbf{W}$ is equivalent to a biased random walk on the original adjacency

Scaling transformations
Uniform scaling One of the simplest transformations is uniform scaling, which is given by the diagonal matrix $\mathbf{T}$ with identical entries: where the scalar matrix $\mathbf{Q}$ can be rewritten as $\gamma\mathbf{I}$, where γ is a scalar. Uniform scaling preserves almost all matrix properties, including the eigenvalue and eigenvector pairs associated with the operator.

Intuitively, uniform scaling can be understood as rescaling time by 1/γ. In other words, a bigger

Non-uniform scaling Non-uniform scaling enables us to use the $\mathbf{T}$ parameter to control the time delay at each vertex. Non-uniform scaling is written as where the diagonal matrix $\mathbf{Q}$ can have different entries. Unlike uniform scaling, this scaling does not

The last parameterization we explore is one that transforms the adjacency matrix of a graph, $\mathbf{A}$, to the interaction matrix $\mathbf{W}$. Given an adjacency matrix $\mathbf{A}$, the choice of $\mathbf{W}$ is a rather flexible design option. In fact, we can arbitrarily manipulate the adjacency matrix as long as the result is still a positive and symmetric matrix, for any perceived dynamics.

In this paper, we limit our attention to bias transformations of the original adjacency matrix $\mathbf{A}$. We call them the reweighing transformations. Whereas the scaling transformation changes the delay time at each vertex, the reweighing transformation changes the trajectory of the dynamic process. Note that this transformation also changes the degree matrix $\mathbf{D}_W$.

As described in Section 2, a biased random walk with transition probability $P_{ij} \propto b_i a_{ij}$ is equivalent to an unbiased random walk on an "interaction graph," represented by the reweighed adjacency matrix:
$$w_{ij} = b_i\, a_{ij}\, b_j$$
where we constrain $b_i > 0$. This transformation allows the parameterized Laplacian to model many

This operator is often used to describe heat diffusion processes (Chung, 2007), where $\mathcal{L}$ replaces the continuous Laplacian operator $\nabla^2$.
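The equivalence between a biased random walk on the adjacency matrix and an unbiased walk on a reweighed graph can be verified numerically. The sketch below assumes the reweighing $w_{ij} = b_i a_{ij} b_j$ and, purely as an illustration, uses the vertex degrees as bias values; the function names and the graph are hypothetical.

```python
# Hypothetical example graph: a triangle 0-1-2 with a pendant vertex 3 on vertex 2.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
b = [float(sum(row)) for row in A]  # illustrative bias: b_i = d_i (must be > 0)


def biased_transition(A, b):
    """P_ij proportional to b_i * a_ij, normalized over destinations i for each source j."""
    n = len(A)
    P = [[0.0] * n for _ in range(n)]
    for j in range(n):
        z = sum(b[k] * A[k][j] for k in range(n))
        for i in range(n):
            P[i][j] = b[i] * A[i][j] / z
    return P


def reweigh(A, b):
    """Symmetric interaction matrix w_ij = b_i * a_ij * b_j."""
    n = len(A)
    return [[b[i] * A[i][j] * b[j] for j in range(n)] for i in range(n)]


def unbiased_transition(W):
    """P_ij = w_ij / d_j^W, the unbiased walk on the interaction graph."""
    n = len(W)
    col = [sum(W[k][j] for k in range(n)) for j in range(n)]
    return [[W[i][j] / col[j] for j in range(n)] for i in range(n)]


P_biased = biased_transition(A, b)
P_unbiased = unbiased_transition(reweigh(A, b))
max_diff = max(abs(P_biased[i][j] - P_unbiased[i][j])
               for i in range(len(A)) for j in range(len(A)))
```

The factor $b_j$ attached to the source vertex cancels in the normalization, which is why the two transition matrices coincide while the interaction matrix itself stays symmetric.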

Notice that by setting the "random walk" and "consensus" formulations are exactly the same as the symmetric formulation.

We can then construct a diagonal matrix $\mathbf{V}_A$ whose elements are the components of the eigenvector $\vec{v}_A$. Let us scale the adjacency matrix according to $\mathbf{W} = \mathbf{V}_A\mathbf{A}\mathbf{V}_A$ and use it as the interaction matrix. Setting the vertex delay factor to identity, the spreading operator is: where the entries in $\mathbf{D}_W$ simplify as
$$d^W_i = v_i \sum_j a_{ij} v_j = \lambda_{max}\, v_i^2$$
with $\lambda_{max}$ the largest eigenvalue of $\mathbf{A}$ and $v_i$ the i-th component of $\vec{v}_A$. This means that both dynamics have exactly the same state vector $\theta$ at each time step. In particular, the

The consensus formulation of the replicator gives a maximum entropy agreement dynamics:

Unbiased Laplacian Reweighing each edge by the inverse of the square root of the endpoint degrees gives what is known as the normalized adjacency matrix $\mathbf{W} = \mathbf{D}_A^{-1/2}\mathbf{A}\mathbf{D}_A^{-1/2}$ (Chung, 1996). Then, the degree of vertex i of the reweighted graph is $d^W_i = \sum_j a_{ij}/\sqrt{d_i d_j}$. we define the unbiased Laplacian matrix: The unbiased Laplacian is an example of the degree based biased random walk with $P_{ij} \propto d_i^{-1/2} a_{ij}$ (Section 2). A URW on the reweighed adjacency matrix $\mathbf{W}$ is equivalent to a BRW on the original adjacency matrix of the following dynamics The stationary distribution for this class of BRWs in general is

Equivalent to the (scaled) graph Laplacian of the normalized adjacency matrix, the diagonal matrix $\mathbf{T}\mathbf{D}_W$ of the unbiased Laplacian is also effectively a scalar. As a result, the "random walk" and "consensus" formulations are exactly the same as the symmetric formulation.
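The two reweighings discussed above can be reproduced on a toy graph. The sketch below is an illustration under stated assumptions: the leading eigenvector is approximated by plain power iteration (a technique swapped in here for self-containment, not taken from the paper), and the graph, iteration count and function name are hypothetical.

```python
import math

# Hypothetical example graph: a triangle 0-1-2 with a pendant vertex 3 on vertex 2.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
n = len(A)


def leading_eigenpair(A, iters=500):
    """Power iteration; converges for a connected, non-bipartite graph."""
    v = [1.0] * len(A)
    lam = 0.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(len(A))) for i in range(len(A))]
        lam = math.sqrt(sum(x * x for x in w))  # norm of A v approaches lambda_max
        v = [x / lam for x in w]
    return lam, v


lam_max, v = leading_eigenpair(A)

# Replicator reweighing: W = V_A A V_A, so d_i^W = v_i * sum_j a_ij v_j.
W_rep = [[v[i] * A[i][j] * v[j] for j in range(n)] for i in range(n)]
d_rep = [sum(row) for row in W_rep]

# Normalized adjacency matrix: w_ij = a_ij / sqrt(d_i d_j).
d = [sum(row) for row in A]
W_unb = [[A[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)] for i in range(n)]
d_unb = [sum(row) for row in W_unb]
```

For the replicator the interaction degrees collapse to $\lambda_{max} v_i^2$, so the resulting centrality is governed by the eigenvector of the adjacency matrix rather than by the raw degrees.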

process. It is sometimes desirable to define a centrality measure as a function of time (Taylor et al., 2015). In this paper, however, we stick to the more conventional notion of time-invariant centralities.

The various centrality measures introduced in the past have led to very different conclusions about the relative importance of vertices (Katz, 1953;Bonacich, 1972

A vertex has high centrality with respect to a random walk if it is visited frequently by it. This is specified by the distribution of the dynamic process at time t:
$$\theta(t) = e^{-\mathcal{L}^{RW} t}\,\theta(0)$$
where $\theta(0)$ is the state vector describing the initial distribution of the random walk. The stationary distribution of the random walk:
$$\pi_i = \frac{d^W_i \tau_i}{\sum_j d^W_j \tau_j}$$
because $\mathcal{L}^{RW}\pi = 0$, with $\pi$ being the vector with $\pi_i$ entries and $\Pi$ being the diagonal matrix with the same elements. By convention, $\pi$ is the standard centrality measure in conservative processes, including random walks (Ghosh and Lerman, 2012).

If we define centrality as the stationary distribution of a random walk, the importance of a vertex can be thought of as the total time a random walk spends at the vertex in the steady state. This is proportional to both vertex degree and delay factor, which we will later relate to the volume measure. If $\mathcal{L}^{RW}$ is a normalized Laplacian, this centrality measure is exactly the heat kernel page rank (Chung, 2009), which is identical to degree centralities since $\mathbf{W} = \mathbf{A}$ and $\mathbf{T} = \mathbf{I}$.

In consensus processes, the state vector always converges to a uniform state, where each vertex has the same value of the dynamic variable. As a result, the stationary distribution is not an appropriate measure of vertex centrality, since it deems all vertices to be equally important. However, the final consensus value associated with each vertex is
$$\theta_i(\infty) = \sum_j \pi_j\, \theta_j(0)$$
where the weight of vertex i in this average is
$$\pi_i = \frac{d^W_i \tau_i}{\sum_j d^W_j \tau_j}$$
Intuitively, as a measure of importance, it makes sense to define the centrality of a vertex in the consensus process as its contribution to the final value. This consistency between "consensus" and "random walk" leads us to define the parameterized centrality.

The similarity transformations between "consensus" and "random walk" dynamics

Table 2. Stationary and initial state vectors of different formulations of the parameterized Laplacian.

where in the last step we used matrices to simplify the notation, with $\Lambda$ being the diagonal matrix of eigenvalues, $\mathbf{V}$ having the eigenvectors $\{\vec{v}_1, .., \vec{v}_n\}$ as columns and $\mathbf{U}^T = \mathbf{V}^{-1}$. One interesting observation is that by left multiplying both sides with $\mathbf{U}^T$, we have Recall that $\mathbf{U}^T\theta$ is a vector in the eigenbasis $\mathbf{V}$. Applying the operator $\mathcal{L}$ to any input vector simply re-scales it according to eigenvalues. Since the smallest eigenvalue of the parameterized Laplacian is always 0, we have , which states that the state vector is conserved along the direction of the dominant eigenvector $\vec{v}_1$.

The state vector reaches a stationary distribution $\pi$ Since all terms vanish as t → ∞, the stationary state vector $\pi$ only depends on $\vec{v}_1$. $z_1\vec{v}_1$ qualifies as a time invariant, initialization-independent vertex centrality measure.

Table 2 summarizes the properties of the stationary distributions and centralities associated with

As the table shows, similarity transformations of the same operator give the same state vector $\theta$, as long as the input and output vectors are properly transformed into the correct basis. They represent the same dynamics in different coordinate systems. Since centrality is determined by the dynamic process on a given network, it should be unified across these similarity transformations. In theory, any coordinate system can be set as the standard. Here, following the intuitions described earlier, we define the unnormalized stationary state vector of the random walk as the parameterized centrality:
$$c_i = d^W_i\, \tau_i$$
Another motivation behind this definition is to establish a direct connection between centrality and community measures, as we will later demonstrate with the notion of parameterized volume (Equation 23).
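A small sketch makes the definition concrete, assuming the parameterized centrality of vertex i is the product of its interaction degree and its delay factor. Two settings are compared: the normalized Laplacian (W = A, T = I) and a scaled graph Laplacian for which the delay factors are assumed to be $\tau_i = d_{max}/d_i$ (an assumption chosen so that $\mathbf{D}_W\mathbf{T}$ becomes a scalar matrix); the function name and the graph are hypothetical.

```python
def parameterized_centrality(W, tau):
    """c_i = d_i^W * tau_i: interaction degree times delay factor."""
    return [sum(W[i]) * tau[i] for i in range(len(W))]


# Hypothetical example graph: a triangle 0-1-2 with a pendant vertex 3 on vertex 2.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
d = [sum(row) for row in A]
d_max = max(d)

# Normalized Laplacian: W = A and unit delays, so centrality is the degree.
c_normalized = parameterized_centrality(A, [1.0] * len(A))

# Scaled graph Laplacian (assumed delays tau_i = d_max / d_i, all >= 1):
# every vertex ends up with the same centrality d_max.
tau_graph = [d_max / di for di in d]
c_graph = parameterized_centrality(A, tau_graph)
```

The comparison shows how the delay factors can cancel the degree heterogeneity: low-degree vertices hold the process longer, so every vertex receives the same share in the steady state.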

Parameterized centrality includes many well known centrality measures as special cases. Below, we summarize the induced special cases discussed in the previous subsection.

transform has no effect on parameterized centrality by definition.

The above conditions ensure that the quality function is solely determined by the choice of communities, network structure and the interactions between vertices. We assume that the underlying network structure remains static as the dynamics unfolds. Similar to parameterized centralities, we focus on time-invariant communities. There is a catch, however: by simply placing each vertex in its own community, we would have an optimal but trivial community division. Therefore, we need an additional constraint on the size of the communities.

A closely related problem in geometry is the isoperimetric problem, which relates the circumference of a region to its area. Isoperimetric inequalities lie at the heart of the study of expander graphs in graph theory. In graphs, area translates into the size of the vertex subset, and the circumference translates into the size of their boundary (Chung, 1997). In particular, we will focus on the graph bisection (cut) problem,

one corresponds to a unified community measure for a class of similar operators including seemingly different formulations of "consensus," "symmetric" and "random walk".

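Recall that conductance is a community quality measure associated with unbiased random walks.
$$\phi(S) = \frac{\mathrm{cut}_A(S, \bar{S})}{\min(\mathrm{vol}(S), \mathrm{vol}(\bar{S}))}$$
where $\mathrm{vol}(S) = \sum_{i \in S} d_i$ and $\mathrm{cut}_A = \sum_{i \in S, j \in \bar{S}} a_{ij}$.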

We generalize this notion with a claim that every dynamic process has an associated function that measures the quality of the cluster with respect to that process. Optimizing the quality function leads to cohesive communities, i.e., groups of vertices that "trap" the specific dynamic process for a long period of time.

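Consider a dynamic process defined by the spreading operator $\mathcal{L} = \mathbf{T}^{-1/2}\mathbf{D}_W^{-1/2}(\mathbf{D}_W - \mathbf{W})\mathbf{D}_W^{-1/2}\mathbf{T}^{-1/2}$. We define the parameterized conductance of a set S with respect to $\mathcal{L}$ as:
$$h_{\mathcal{L}}(S) = \frac{\mathrm{cut}_W(S, \bar{S})}{\min(\mathrm{vol}_{\mathcal{L}}(S), \mathrm{vol}_{\mathcal{L}}(\bar{S}))} \qquad (21)$$
The minimum over all possible S is the parameterized conductance of the graph,
$$\phi_{\mathcal{L}}(G) = \min_{S \subset V} h_{\mathcal{L}}(S) \qquad (22)$$
Notice that we have also defined the parameterized volume of a set S ⊆ V as
$$\mathrm{vol}_{\mathcal{L}}(S) = \sum_{i \in S} d^W_i\, \tau_i \qquad (23)$$
which is the sum of parameterized centralities of member vertices. Using the random walk perspective, the numerator measures the random jumps across communities, while the denominator ensures a balanced bisection. As previously pointed out, the presence of a good cut implies that it will take a random walk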
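For very small graphs the parameterized conductance can be evaluated exhaustively, which is useful for checking intuition. The sketch below assumes the definitions discussed above (cut measured on the interaction matrix, volume as the sum of $d^W_i \tau_i$); the function names, the example graph and the delay choice $\tau_i = d_{max}/d_i$ for the scaled graph Laplacian are hypothetical.

```python
from itertools import combinations

# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
A = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]


def parameterized_volume(W, tau, S):
    """Sum of d_i^W * tau_i over the vertices in S."""
    return sum(sum(W[i]) * tau[i] for i in S)


def parameterized_conductance(W, tau, S):
    """cut_W(S, S-bar) divided by the smaller of the two parameterized volumes."""
    inside = set(S)
    outside = set(range(len(W))) - inside
    cut = sum(W[i][j] for i in inside for j in outside)
    return cut / min(parameterized_volume(W, tau, inside),
                     parameterized_volume(W, tau, outside))


def best_bisection(W, tau):
    """Exhaustive search over proper subsets containing vertex 0 (small graphs only)."""
    n = len(W)
    best_h, best_S = float("inf"), None
    for size in range(0, n - 1):
        for rest in combinations(range(1, n), size):
            S = (0,) + rest
            h = parameterized_conductance(W, tau, S)
            if h < best_h:
                best_h, best_S = h, set(S)
    return best_h, best_S


d = [sum(row) for row in A]
h_norm, S_norm = best_bisection(A, [1.0] * len(A))             # W = A, T = I
h_graph, S_graph = best_bisection(A, [max(d) / di for di in d])  # assumed tau_i = d_max / d_i
```

With unit delays the measure reduces to the classic conductance, while the second delay choice turns the denominator into $d_{max}$ times the size of the smaller side, i.e. a scaled ratio cut. The exhaustive search is exponential in the number of vertices and is only meant as a reference for toy examples.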

We can use any transformation to produce new dynamics, and the corresponding parameterized conductance will be redefined according to Equation (21). However, the effect of transformations on the resulting communities is not as obvious when compared

The reweighing transformation is the most complex of all, changing both the numerator and denominator in Equation (21). This trade-off between cut and balance can oftentimes be very complicated to analyze (as will be seen with real world networks).

Finally, we summarize the induced special cases.

(Scaled) Graph Laplacian $\mathbf{W} = \mathbf{A}$ and $\mathbf{T}$
This is the ratio cut scaled by $1/d_{max}$.

Replicator $\mathbf{W} = \mathbf{V}_A\mathbf{A}\mathbf{V}_A$ and $\mathbf{T} = \mathbf{I}$. Recall $\vec{v}_A$ is the eigenvector of $\mathbf{A}$ associated with the largest
Since the degree of a vertex in the interaction graph $\mathbf{W}$ is $d^W$

Notice that here the parameterized conductance for graph Laplacian and unbiased Laplacian share the same denominator even though they are related through both reweighing and scaling transformations. This is a result of their scaling cancelling out the reweighing effect on volumes (centralities). This is part of the motivation behind our design of the unbiased Laplacian operator for easier comparisons. Another simple observation is that the graph Laplacian shares the same numerator with its normalized counterpart. We will be using these relationships for analyzing experimental results in the next section.

Given the parameterized conductance measure, finding the best community bisection is still a combinatorial problem, which quickly becomes computationally intractable as the network grows in size. In this subsection we will extend the theorems for the classic Laplacian to our parameterized setting, ultimately leading to efficient approximate algorithms with theoretical guarantees. For mathematical convenience we will use the symmetric formulation and assume that ρ = 0 for $\mathcal{L}$. The Cheeger inequality (Cheeger, 1970) states that
$$\frac{\phi(G)^2}{2} \le \lambda_2 \le 2\,\phi(G)$$
where $\lambda_2$ is the second smallest eigenvalue of the symmetric normalized Laplacian and $\phi(G)$ is the conductance of the graph.

enables the use of its eigenvectors for partitioning graphs, particularly the nearest-neighbor graphs and finite-element meshes (Spielman and Teng, 1996).

In this section, we generalize Cheeger's inequality to any spreading operator under our framework and its associated parameterized conductance of the graph (given by Eq. 22).

Cheeger's inequality to accommodate the asynchronous delay factors in $\mathbf{T}$. It also comes with algorithmic consequences, leading to spectral partitioning algorithms that are efficient in finding low conductance cuts for a given operator.

where $\phi_{\mathcal{L}}(G)$ is given by Eq. 22.

Proof. We prove the theorem by following the approach for proving the classic Cheeger inequality (see Chung, 1997). Instead of sweeping the vertices of G according to the eigenvector f itself, we sweep the vertices of the graph G according to g by ordering the vertices of G so that
$$g[v_1] \ge g[v_2] \ge \cdots \ge g[v_n]$$
and consider sets $S_i = \{v_1, \cdots, v_i\}$ for all 1 ≤ i ≤ n.

Similar to (Chung, 1997), we will eventually only consider the first "half" of the sets $S_i$ during the sweeping: Let r denote the largest integer such that $\mathrm{vol}_{\mathcal{L}}(S_r) \le \mathrm{vol}_{\mathcal{L}}(V)/2$. Note that where the first equation follows from $\sum_v g[v]\, d_v \tau_v = 0$. We denote the positive and negative part of $g - g[v_r]$ as $g_+$ and $g_-$ respectively:

Without loss of generality, we assume the first ratio is at most the second ratio, and will mostly focus on the vertices $\{v_1, ...., v_r\}$ in the first "half" of the graph in the analysis below. Thus, which follows from the Cauchy-Schwarz inequality.

We now separately analyze the numerator and denominator. To bound the denominator, we will use the following property of $\tau_i$: Because $\mathcal{L}$ is properly scaled, $\tau_i \ge 1$ for all i ∈ V. Therefore, Hence, the denominator is at most To bound the numerator, we consider subsets of vertices $S_i = \{v_1, \cdots, v_i\}$ for all 1 ≤ i ≤ r and define By the definition of $\phi_{\mathcal{L}}(G)$, we know $\phi_{\mathcal{L}}(G) \le \min_i h_{\mathcal{L}}(S_i)$ for all 1 ≤ i ≤ r, where recall the function By orienting vertices according to $v_1, ..., v_n$, we can express the numerator Rewrite the difference as a telescoping series By Eqn. 28 By Eqn. 27 and $g_+(v_n) = 0$ Combining the bounds for the numerator and the denominator, we obtain $\lambda_2 \ge \phi_{\mathcal{L}}^2/2$ as stated in the theorem. The upper bound of $\lambda_2$ follows from the same argument for the standard Cheeger inequality.
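The two-sided bound can be checked numerically on a toy graph. The sketch below assumes the inequality takes the standard form $\phi^2/2 \le \lambda_2 \le 2\phi$ and uses the classic setting W = A, T = I. The second smallest eigenvalue is approximated by deflated power iteration on $2\mathbf{I} - \mathcal{L}$ (a technique swapped in here to stay within the standard library, not the paper's method); all function names and the graph are hypothetical.

```python
import math
from itertools import combinations

# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
A = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
tau = [1.0] * len(A)  # unit delays; any tau_i >= 1 could be tried instead


def symmetric_operator(W, tau):
    """Symmetric formulation: L_ij = (d_i*[i==j] - w_ij) / sqrt(s_i s_j), s_i = d_i^W tau_i."""
    n = len(W)
    d = [sum(row) for row in W]
    s = [d[i] * tau[i] for i in range(n)]
    L = [[((d[i] if i == j else 0.0) - W[i][j]) / math.sqrt(s[i] * s[j])
          for j in range(n)] for i in range(n)]
    return L, s


def second_smallest_eigenvalue(L, s, iters=500):
    """Power iteration on 2I - L, projecting out the known null vector sqrt(s)."""
    n = len(L)
    total = sum(s)
    v1 = [math.sqrt(si / total) for si in s]  # unit eigenvector for eigenvalue 0
    x = [float(i + 1) for i in range(n)]
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(x, v1))
        x = [a - dot * b for a, b in zip(x, v1)]
        y = [2.0 * x[i] - sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
        scale = math.sqrt(sum(a * a for a in y))
        x = [a / scale for a in y]
    Lx = [sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(a * b for a, b in zip(x, Lx))  # Rayleigh quotient of the unit vector x


def graph_conductance(W, s):
    """Exhaustive minimum of cut / min(volume) over proper subsets (small graphs only)."""
    n, total, best = len(W), sum(s), float("inf")
    for size in range(1, n):
        for S in combinations(range(n), size):
            inside = set(S)
            cut = sum(W[i][j] for i in inside for j in range(n) if j not in inside)
            vol = sum(s[i] for i in inside)
            best = min(best, cut / min(vol, total - vol))
    return best


L, s = symmetric_operator(A, tau)
lam2 = second_smallest_eigenvalue(L, s)
phi = graph_conductance(A, s)
```

On this graph the conductance is 1/7 and the second eigenvalue is roughly 0.205, which sits between the two bounds; the shift by 2I relies on the eigenvalues of the properly scaled operator not exceeding 2.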

The parameterized Cheeger inequality is essential for providing theoretical guarantees for greedy com-

• Let vector g be $g = (\mathbf{D}_W\mathbf{T})^{-1/2} f$
• Sweeping: For each $S_i = \{v_1, \cdots, v_i\}$, compute $h_{\mathcal{L}}(S_i)$
• Output the $S_i$ with the smallest $h_{\mathcal{L}}(S_i)$.

Before stating the quality guarantee of the above algorithm, we quickly discuss its implementation and
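The sweeping procedure can be sketched in pure Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the second eigenvector f of the symmetric operator is approximated by deflated power iteration, the rescaled vector is taken to be $g = (\mathbf{D}_W\mathbf{T})^{-1/2} f$, and vertices are swept in decreasing order of g. The function name `sweep_cut` and the example graph are hypothetical.

```python
import math


def sweep_cut(W, tau, iters=500):
    """Spectral sweep: order vertices by g and return the best prefix set and its conductance."""
    n = len(W)
    d = [sum(row) for row in W]
    s = [d[i] * tau[i] for i in range(n)]  # diagonal of D_W T (parameterized centralities)
    L = [[((d[i] if i == j else 0.0) - W[i][j]) / math.sqrt(s[i] * s[j])
          for j in range(n)] for i in range(n)]
    total = sum(s)

    # Approximate the second eigenvector f by power iteration on 2I - L,
    # projecting out the known null vector sqrt(s) at every step.
    v1 = [math.sqrt(si / total) for si in s]
    f = [float(i + 1) for i in range(n)]
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(f, v1))
        f = [a - dot * b for a, b in zip(f, v1)]
        y = [2.0 * f[i] - sum(L[i][j] * f[j] for j in range(n)) for i in range(n)]
        scale = math.sqrt(sum(a * a for a in y))
        f = [a / scale for a in y]

    # Rescale and sort vertices in decreasing order of g.
    g = [f[i] / math.sqrt(s[i]) for i in range(n)]
    order = sorted(range(n), key=lambda i: g[i], reverse=True)

    # Sweep over prefixes S_k = {v_1, ..., v_k} and keep the smallest conductance.
    best_h, best_S = float("inf"), None
    for k in range(1, n):
        S = set(order[:k])
        cut = sum(W[i][j] for i in S for j in range(n) if j not in S)
        vol = sum(s[i] for i in S)
        h = cut / min(vol, total - vol)
        if h < best_h:
            best_h, best_S = h, S
    return best_h, best_S


# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
A = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
h, S = sweep_cut(A, [1.0] * len(A))
```

Only n − 1 candidate sets are evaluated, in contrast to the exponential number of subsets in an exhaustive search. On the example graph the sweep recovers one of the two triangles; which one depends on the sign of the computed eigenvector.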

The following proposition follows directly from the algorithm and Theorem 7.2 of (Spielman and

To use this spectral approximation algorithm (and in fact any numerical approximation to the second eigenvector of $\mathcal{L}$) in our spectral partitioning algorithm for the dynamics, we will need a strengthened version of Theorem 1.

Theorem 2. (Extended Cheeger Inequality with Respect to Rayleigh Quotient) For any interaction graph $G = (V, E, \mathbf{W})$ and vertex scaling factor $\mathbf{T}$ (whose diagonals are $(\tau_1, ..., \tau_n)$), for any vector u such that u ⊥ $\mathbf{D}$

Proof. The theorem follows directly from the proof of Theorem 1 if we replace vector f (the eigenvector associated with the second smallest eigenvalue of $\mathcal{L}$) by u. This theorem is the analog of a theorem by Mihail (Mihail, 1989) for Laplacian matrices.

properties. We treat all as undirected networks.

To compare the different perspectives on network structure obtained under the parameterized Laplacian

The first network we study is a social network consisting of 34 members of a karate club in a university, where undirected edges represent friendships (Zachary, 1977). This well-studied network is made up of

profiles are very similar as well (Figure 3a, Figure 3b). This is an excellent example showing that most good community measures capture the same fundamental idea of communities: well-interacting subsets of vertices with relatively sparse connections in between. They do differ, however, in the finer details of their mathematical definitions, as we will see in more complicated networks in the following subsections.

bisections under each operator are given below.

The "House of Representatives" network is an excellent example of how centralities and communities are closely related under our framework. First, the centrality profile of this network looks similar to that of the College Football, but quite different from the other networks in Table 3. Because we have taken into account all votes, this network is very densely connected, and its degree distribution also has a heavy tail as demonstrated by the red curve in Figure 5a.

Since the degree distribution is relatively uniform, we expect the change of the cut size (numerator) in Eq. 21 to be relatively small. The exception here is the optimal bisection produced by the regular Laplacian (Figure 5d), which is most prone to "whiskers", leading to a low accuracy of 38.5%. For the other three special cases, the volume balance (denominator) is the determining factor in community measures, and all produce fairly "balanced" bisections according to their own parameterized volume measures.

Another observation is that centrality measures disagree about the importance of vertices. In particular, centralities given by the normalized Laplacian might differ from those of the unbiased Laplacian by the degree, but given the relatively uniform degree distribution, the two lead to almost identical optimal bisections (Figure 5c, Figure 5f). The replicator, on the other hand, scales vertex centrality according to eigenvector centralities, which places more volume on the high-degree vertices in the cyan cluster. The resulting optimal bisection is thus shifted to the right to balance volumes (Figure 5e). In this case, the ground truth aligns more closely with the former two, which reach accuracies of over 90%, as Democrats dominated the 98th Congress.
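To illustrate how reweighting by eigenvector centralities shifts volume toward well-connected vertices, the following is a minimal sketch rather than the implementation used in the paper. It assumes the reweighting takes the symmetric form w_ij = v_i a_ij v_j, where v is the leading eigenvector of the adjacency matrix; this form, the toy graph, and the function names `leading_eigenpair` and `reweighted_degrees` are assumptions made for illustration. The weighted degree is used here as a simple proxy for the volume a vertex contributes.

```python
import math

def leading_eigenpair(adj, iters=1000):
    """Power iteration on A + I (the shift avoids oscillation on bipartite graphs)."""
    n = len(adj)
    v = [1.0] * n
    for _ in range(iters):
        w = [v[i] + sum(v[j] for j in adj[i]) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T A v of the unit vector v: the top eigenvalue of A.
    lam = sum(v[i] * sum(v[j] for j in adj[i]) for i in range(n))
    return lam, v

def reweighted_degrees(adj, v):
    """Weighted degrees under the ASSUMED weights w_ij = v_i * a_ij * v_j."""
    return [v[i] * sum(v[j] for j in adj[i]) for i in range(len(adj))]

# Hypothetical toy graph: a 4-clique core {0, 1, 2, 3} with a whisker 3-4-5.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
adj = [[] for _ in range(6)]
for a, b in edges:
    adj[a].append(b)
    adj[b].append(a)

lam, v = leading_eigenpair(adj)
plain = [len(nbrs) for nbrs in adj]        # ordinary degrees
reweighted = reweighted_degrees(adj, v)    # degrees after reweighting

core = [0, 1, 2, 3]
plain_core_share = sum(plain[i] for i in core) / sum(plain)
reweighted_core_share = sum(reweighted[i] for i in core) / sum(reweighted)
print(plain_core_share, reweighted_core_share)
```

Under this assumed form, the weighted degree of vertex i equals λ v_i², so the share of total volume held by the core grows relative to its share of plain degree (13/16 in this toy graph), which is the qualitative effect described above.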

The Political Blogs network demonstrates a pitfall of many commonly used community quality measures. Many real-world networks have skewed degree distributions, which often correspond to a "core-whiskers" (also known as core-periphery) structure. As shown in Leskovec et al. (2008), such structures have "whisker" cuts that are so cheap that balance constraints can be effectively ignored. The same happened here for three of our special cases, whose optimal bisections are highly unbalanced. Their accuracies are below 50% when compared to the ground truth.
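The effect of a cheap whisker cut can be seen on a small example. The sketch below uses the classical conductance, cut(S) / min(vol(S), vol(V \ S)), which is the traditional measure that the parameterized quality function generalizes; it is not the parameterized measure of Eq. 21. The toy graph and the helper names `conductance` and `clique` are hypothetical.

```python
def conductance(edges, subset):
    """Classical conductance: cut(S) / min(vol(S), vol(complement of S))."""
    subset = set(subset)
    cut = 0
    vol_in = 0
    vol_total = 0
    for a, b in edges:
        vol_total += 2                                # each edge adds 2 to total volume
        vol_in += (a in subset) + (b in subset)       # endpoints that lie inside S
        if (a in subset) != (b in subset):
            cut += 1                                  # edge crosses the boundary of S
    return cut / min(vol_in, vol_total - vol_in)

def clique(nodes):
    """All edges of a complete graph on the given vertices."""
    return [(a, b) for i, a in enumerate(nodes) for b in nodes[i + 1:]]

# Hypothetical toy graph: two 5-cliques joined by 4 edges (the "core"),
# plus a triangle {10, 11, 12} attached to vertex 0 by one edge (the "whisker").
edges = (clique([0, 1, 2, 3, 4]) + clique([5, 6, 7, 8, 9])
         + [(i, i + 5) for i in range(4)]
         + clique([10, 11, 12]) + [(0, 10)])

whisker_cut = conductance(edges, {10, 11, 12})        # cut 1, volume 7
balanced_cut = conductance(edges, {5, 6, 7, 8, 9})    # cut 4, volume 24
print(whisker_cut, balanced_cut)
```

Here the whisker cut scores 1/7 while the balanced cut between the two cliques scores 1/6, so a measure of this kind prefers the highly unbalanced split even though it isolates only three vertices.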

Unlike the House of Representatives, the community measure in Political Blogs is dominated by the cut size (numerator). In particular, the normalized Laplacian and the regular Laplacian share the same cut size measure and give the same solution (Figure 6c, Figure 6d), despite their differences in volume/centrality measures (see curves in Figure 6a). The unbiased Laplacian produces a different whisker cut, because it has a reweighted cut size measure (Figure 6f). Further investigation reveals that the unbiased Laplacian cuts off a whisker from two highly connected vertices, which according to Eq. 21 greatly reduces the cut size.

The exception here is the replicator operator (Figure 6e). By reweighting the adjacency matrix by eigenvector centralities, the parameterized volume measure now considers highly connected vertices near the core to be even more important (see the red curve in the centrality profile). The difference in parameterized volume is now too drastic to be ignored. As a result, the replicator does not fall for the "whisker" cuts and produces balanced communities with a respectable accuracy of 95.3%.

[...] States power grid (Watts and Strogatz, 1998). Among the six datasets in Table 3, Power Grid is the largest network in terms of the number of vertices. However, it is extremely sparse, with an average degree of 2.67, leading to a homogeneous connecting pattern across the whole network without core-periphery structure. Its centrality and sweep profiles and visualizations of optimal bisections are given below.

The long tails of the centrality profiles indicate the existence of high-degree vertices, or hubs (Figure 8a). However, as the visualizations of the network bisections show, these hubs do not usually link to each other directly, resulting in negative degree assortativity (Newman, 2003). This is consistent with the geographic constraints on designing a power grid, as the final goal is to distribute power from central stations to end users. This important difference in overall structure prevented a core or whiskers from appearing, and [...] to the homogeneous connecting pattern. The normalized Laplacian shares the same cut size measure with the regular Laplacian, and its volume balance is usually more robust on social networks with core-whisker structures. On Power Grid, however, it opts for a smaller cut size at the cost of volume imbalance (Figure 8c). It turns out the volume of the cyan cluster is compensated by its relatively high average degree.
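The link between hubs that avoid each other and negative degree assortativity, noted above for Power Grid, can be checked on a toy graph. The sketch below computes the degree assortativity as the Pearson correlation between the degrees at the two ends of each edge, which is the standard definition for undirected graphs; the function name `degree_assortativity` and the example graph are hypothetical, and the function assumes the graph is not regular (otherwise the variance is zero).

```python
def degree_assortativity(edges):
    """Pearson correlation of the degrees found at the two ends of each edge."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    # Count every undirected edge in both directions so the measure is symmetric.
    pairs = ([(deg[a], deg[b]) for a, b in edges]
             + [(deg[b], deg[a]) for a, b in edges])
    m = len(pairs)
    mean = sum(x for x, _ in pairs) / m
    cov = sum((x - mean) * (y - mean) for x, y in pairs) / m
    var = sum((x - mean) ** 2 for x, _ in pairs) / m
    return cov / var

# Hypothetical toy graph: two hubs (0 and 4), each serving three end users,
# linked only through a low-degree connector vertex 8.
edges = [(0, 1), (0, 2), (0, 3), (4, 5), (4, 6), (4, 7), (0, 8), (4, 8)]
r = degree_assortativity(edges)
print(r)
```

Because every edge in this graph joins a hub to a low-degree vertex, the coefficient is strongly negative.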

The parameterized Laplacian framework presented in this paper can describe a variety of dynamical processes taking place on a network, including random walks and simple epidemics, but also new ones, such as the one captured by the unbiased Laplacian. We extended the relationships between centrality, community-quality measures and properties of the Laplacian operator to this more general setting. Each dynamical process has a stationary distribution that gives the centrality of vertices with respect to that process. In addition, we show that the parameterized conductance with respect to the dynamical process is related to the eigenvalues of the operator describing that process through a Cheeger-like inequality. We used these relationships to develop an efficient algorithm for spectral bisection.
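As a rough illustration of spectral bisection by a sweep, the sketch below handles only the classical special case of the normalized Laplacian with the traditional conductance; it is not the parameterized algorithm of this paper. It approximates the eigenvector of the second smallest eigenvalue by power iteration on the lazy normalized adjacency matrix (a simple stand-in for a proper eigensolver), orders vertices by the rescaled eigenvector, and returns the prefix with the lowest conductance. The function name `sweep_bisection` and the barbell test graph are hypothetical, and the graph is assumed to be connected with no isolated vertices.

```python
import math

def sweep_bisection(n, edges, iters=500):
    """Sweep cut along the second eigenvector of the normalized Laplacian."""
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    deg = [len(nbrs) for nbrs in adj]      # assumes every degree is positive
    vol_total = sum(deg)
    sqrt_d = [math.sqrt(d) for d in deg]
    # Unit-length trivial eigenvector D^{1/2} 1 of the normalized adjacency matrix.
    top = [s / math.sqrt(vol_total) for s in sqrt_d]

    def deflate(y):
        """Remove the trivial component and rescale to unit length."""
        dot = sum(p * q for p, q in zip(y, top))
        y = [p - dot * q for p, q in zip(y, top)]
        norm = math.sqrt(sum(p * p for p in y))
        return [p / norm for p in y]

    # Power iteration on (I + D^{-1/2} A D^{-1/2}) / 2, orthogonal to `top`.
    y = deflate([float(i) for i in range(n)])
    for _ in range(iters):
        y = deflate([0.5 * (y[i] + sum(y[j] / (sqrt_d[i] * sqrt_d[j])
                                       for j in adj[i]))
                     for i in range(n)])

    x = [y[i] / sqrt_d[i] for i in range(n)]     # rescale back by D^{-1/2}
    order = sorted(range(n), key=lambda i: x[i])

    in_set = [False] * n
    cut = vol = 0
    best_phi, best_size = float("inf"), 0
    for k, v in enumerate(order[:-1], start=1):  # every proper prefix
        in_set[v] = True
        vol += deg[v]
        for u in adj[v]:
            cut += -1 if in_set[u] else 1        # edge becomes internal or cut
        phi = cut / min(vol, vol_total - vol)
        if phi < best_phi:
            best_phi, best_size = phi, k
    return set(order[:best_size]), best_phi

# Hypothetical test graph: two 4-cliques joined by the single edge 3-4.
edges = [(a, b) for a in range(4) for b in range(a + 1, 4)]
edges += [(a, b) for a in range(4, 8) for b in range(a + 1, 8)]
edges += [(3, 4)]
cluster, phi = sweep_bisection(8, edges)
print(sorted(cluster), phi)
```

On this barbell graph the sweep recovers one of the two cliques, with conductance 1/13. Which clique is returned depends on the sign of the computed eigenvector.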

The parameterized Laplacian framework also provides a tool for comparing different dynamical [...] the evolution of the dynamic process. In the analysis of massive networks, it is also desirable to identify subsets of vertices whose induced sub-graphs have "enough" community structure without examining the entire network. Chung (Chung, 2007, 2009) derived a local version of the Cheeger-like inequality to identify random walk-based local clusters. Similarly, our framework can be adapted to such local clustering procedures.

While our framework is flexible enough to represent several important types of dynamical processes, it does not represent all possible processes, for example, those that do not conserve the total volume even after a change of basis. In order to describe such dynamics, an even more general framework is needed. We speculate, however, that the more general operators will still obey the Cheeger-like inequality, and that other theorems presented in this paper can be extended to these processes.