Algebraic Algorithms for Betweenness and Percolation Centrality

In this paper, we explored different ways to write the algebraic version of betweenness centrality algorithm. Particularly, we focused on Brandes’ algorithm [8]. We aimed for algebraic betweenness centrality that can be parallelized easily. We proposed 3-tuple geodetic semiring as an extension to the usual geodetic semiring with 2-tuples [6]. Using the 3-tuple geodetic semiring, Dijkstra’s and Brandes’ algorithm, we wrote more concise and general algebraic betweenness centrality (ABC) algorithm which is valid for weighted and directed graphs. We also proposed an alternative version of ABC using the usual geodetic semiring with 2-tuple where we used a simple way to construct shortest path tree after computing shortest path distances in the usual geodetic semiring. This allows us to avoid computational complexity of ABC implementation using 3-tuple geodetic semiring. We used numba [18] to optimize and parallelize ABC. We evaluated the performance of ABC using 2-tuple geodetic semiring as compared to NetworkX [16], a common python package for graph algorithms. We did scalability experiments on parallel ABC and showed its total speedup. We also showed that with small modification, ABC can be adapted to algebraicly compute other centrality measures such as percolation centrality.


Introduction
Algorithms for computing centrality measures on graphs have received a great deal of attention in recent decades due to problems of interest across fields in which data has a graph structure. Centrality measures often help answer domain-specific questions about the importance or relevance of individual vertices in a graph. The use of parallel computing is of interest for calculating graph features when the problem size exceeds the capabilities of classical sequential or small-scale multicore CPUs. One mechanism for implementing such algorithms on parallel computers is to express them in the form of linear algebra operations that naturally map onto a data-parallel programming model. The GraphBLAS [1] and Graph500 [2] are recent notable efforts that adopt this model of graph algorithms via linear algebra for parallel computation. This is particularly relevant with the recent trend towards data parallel accelerators like graphics processing units.
In this paper we introduce and analyze algebraic algorithms for computing betweenness centrality measures using the tools of linear algebra. We demonstrate how this can be applied to other centrality measures that are based on betweenness centrality by defining an algebraic implementation of the percolation centrality algorithm [20].

Contribution
Our work contributes the following results to the existing field of graph algorithms in the language of linear algebra: • Our algorithm for algebraic betweenness centrality (ABC) is valid for weighted and/or directed graphs, • ABC is fully algebraic via a geodetic semiring, avoiding non-algebraic auxiliary functions, • ABC forms the basis of other centrality measures expressed in terms of linear algebra, such as percolation centrality.
• We propose a 3-tuple geodetic semiring, extending a more commonly used geodetic semiring by adding parent information.
Betweenness centrality is a commonly studied graph algorithm and has been approached with the tools of linear algebra in the past. Early work focused on the unweighted case: [21] first formulated algebraic algorithm for betweenness centrality on unweighted graphs using linear algebra primitives and [9] implemented it as the distributed-memory betweenness centrality algorithm. As [9] noted their implementation can scale beyond thousands of processors, but only addresses the unweighted case. Shortly afterwards, [12] proposed a new parallelizable algorithm for the general case with low spatial complexity scaling to 64 processors on both weighted and unweighted graphs, but it wasn't algebraic.
The closest work to ours appears in [23], which describes the Maximal Frontier Betweenness Centrality (MFBC) method based on sparse matrix multiplication for both unweighted and weighted graphs using a parallel (distributed-memory) numerical library for multidimensional arrays called Cyclops [24]. MFBC is based on Bellman-Ford and Brandes' algorithms [8] and its overall logic is similar to the algorithm ABC-DB that we propose. Unlike MFBC, ABC-DB is based on Dijkstra's and Brandes' algorithms, and fully formulated based on linear algebra primitives. For example, MFBC uses a combination of different algebraic monoids and auxiliary functions to define matrix-vector and matrix-matrix multiplication. ABC-DB works with matrices and vectors from a single domain, referred to as the 3-tuple geodetic semiring. The semiring operations are used to implement the necessary logic of the algorithm which results a simple algebraic implementation of Brandes' algorithm for weighted and unwieghted graphs. MFBC uses matrix-vector multiplication with a monoid and an auxiliary function to find frontiers for back-propagation and centrality accumulation because it is based on Bellman-Ford. Tracking frontiers is not necessary with ABC-DB due to the order of traversal from Dijkstra's algorithm and parent information in the shortest path tree from the geodetic semiring. This allows us to back-propagate centrality scores simply. Thus, ABC-DB only uses vector operations to back-propagate and accumulate centralities.

Graph notation and background
Let G = (V, E), E ⊆ V × V be a graph with the set of nodes and the set of edges denoted as V and E respectively where each edge has a weight w : E → R = R ∪ (∞). We denote |V | = n and |E| = m. The following concepts on a graph such as a path, shortest path and shortest path distance are briefly defined as follows. These will be used later in this paper to understand the graph centrality algorithms.
and the weight of a path is defined as the sum of the weights of the edges which constitutes the path.

Betweenness Centrality
In complex networks, we need some measure to tell which nodes play more important roles. One such measure is betweenness centrality. The betweenness centrality of a vertex represents what portion of all shortest paths in the graph go through the vertex. Let σ st be the number of shortest paths between vertices s and t and σ st (v) be the number of shortest paths between s and t that includes the vertex v. Then betweenness centrality is defined as Intuitively, betweenness centrality of a vertex represents the change of the number of shortest paths in the network if we remove a vertex from the graph. The larger betweenness centrality of a vertex, the bigger change in total shortest paths. Another interpretation of betweenness centrality of a vertex is the probability that communication between any two vertices in a network goes through v, assuming all communications in the network happen through shortest paths. This assumption could become disadvantageous depending on a context. There are several terms related to betweenness centrality including Bellman criterion, pair dependency and dependency of a vertex which are defined as follows [8].
Lemma 1 (Bellman criterion) A vertex v ∈ V lies on a shortest path between vertices s, t ∈ V , if and only if d(s, t) = d(s, v) + d(v, t).
By the Bellman criterion the shortest path counts between s, t ∈ V passing through v ∈ V is which can be used to find the pair dependency of s, t on v, δ st (v).
Definition 1.4 (pair dependency) The pair dependency of s, t ∈ V on an intermediary vertex, v ∈ V is the portion of shortest paths between s and t that pass through v, defined as: If we sum all pair dependencies on v, we get the betweenness centrality of v: Definition 1.5 (dependency) The dependency of a vertex s ∈ V on a single vertex v ∈ V is the portion of shortest paths starting from s and going through v, that is defined as follows.
If we sum dependencies of all s on v, we get betweenness centrality of v as follows.
Brandes proved that the dependency of s on v follows a recursive relation.
Theorem 1 (Recursive relation of dependency [8]) The dependency of s ∈ V on any v ∈ V obeys where P s (w) is the set of parents of a vertex w on shortest paths from s to w.
• The commutativity, associativity and distributivity holds for countable cases as well.
A semiring (S, ⊕, ⊗, 0, 1) is closed iff a = 1 + a ⊗ a = 1 + a ⊗ a where is said to be the closure operation. A complete semiring is closed for the transitive closure defined as A matrix semiring over a complete semiring is also complete and closed for Therefore, we can compute the transitive closure of the adjacency matrix A of a graph over a given complete semiring (S, ⊕, ⊗, 0, 1) via Equation 10. If we say transitive closure of a graph, it means the transitive closure of the adjacency matrix of a graph. There are well-known works [4,14,17,6,10] on semirings and [4,14] present theorems for path problems in graphs including the transitive closure of a graph.

Related works
Computing betweenness centrality has two parts, to compute All Pairs Shortest Paths (APSP) and to sum all pair-dependencies. The second part has Θ(n 3 ) time complexity and takes up a significant amount of computation time and storage space. In 2001, Brandes [8] introduced a faster algorithm which uses Single Source Shortest Paths (SSSP) and the recursive relation in dependency of a vertex to update centrality scores. By doing so, it eliminates computing APSP and calculates centrality scores without storing all shortest paths. As a result, Brandes' algorithm has time complexity of O(m + n log n), O(nm + n 2 log n) for unweighted and weighted graphs respectively. Since Brandes' algorithm is based on a priority queue data structure, its space complexity is O(m) and O(nm) for unweighted and weighted graphs respectively. The use of a priority queue limits parallelization and vectorization opportunities, especially those based on formulating the algorithm algebraically.
Many different semirings have been defined in the past for expressing graph algorithms in algebraic form, such as the tropical semiring for shortest path problems [17]. In 1974, Aho and Hopcroft [4] first proposed using semirings for path finding algorithms such as computation of costs between vertices, the transitive closure of a graph and shortest paths. These use different semirings or ways to define the adjacency matrix of the graph, but the algorithms themselves were not written in algebraic form. While [4] defined a closed semiring with a idempotent property, in 1980 [13] redefined a closed semiring without the idempotent property and proposed a general algorithm for computing costs between vertices. In 1994, Batagelj [6] introduced the geodetic semiring, a nonidempotent closed semiring and computed the transitive closure of a graph using Fletcher's algorithm [13]. Computing the transitive closure of a graph in the geodetic semiring produces the shortest distance and number of shortest paths between each pair of vertices. All these algorithms take O(n 3 ) time and they are not written in algebraic form, but they are implicitly algebraic. Given that a matrix formulation is possible for them, one can always write an algebraic version of the algorithms. These foundational algorithms are related to the work presented here because shortest path identification is core to betweenness centrality algorithms.
In the last two decades work on graph algorithms expressed in terms of linear algebra has been an active research area, especially with the rise of data parallel compute accelerators. In 2008, Robinson [21] formulated betweenness centrality based on matrix operations for unweighted graphs. Their work is based on the breadth first search (BFS) linear algebra primitive and included in Kepner's text that summarizes early work in the field [17]. Robinson's algorithm has time complexity of O(n 2 + nm) and space complexity of O(n). Robinson's algorithm implements Brandes' algorithm for unweighted graph and uses algebraic BFS (matrix-vector products) for SSSP and keep track of the level of the node in the tree with the source node as a root. To update centrality scores, the tree is traversed up from leaf nodes or from the bottom level and the parents of a node in the shortest path tree is determined from the adjacency matrix when back track. The CombBLAS [9] is an extensible distributed-memory parallel graph library that implements Robinson's algorithm. Most parallelizations of betweenness centralty is based on the BFS primitive [5,19,26,25,15]. Most recently, Solomonik [23] proposed an algebraic betweenness centrality algorithm called Maximal Frontier Betweenness Centrality (MFBC) for both unweighted and weighted graphs. MFBC is the closest to our work as discussed earlier.
Many algebraic graph algorithms are presented in [17] written in terms of matrix-vector products. Among them, a 3-tuple shortest path semiring is defined and used to express an algebraic Bellman-Ford algorithm. This 3-tuple consists of a path weight, path size and the penultimate vertex. The 3-tuple geodetic semiring that we defined for our work is similar to this semiring.

3-tuple Geodetic Semiring
Given u, v ∈ V , we want to represent the shortest path distance, number of shortest paths and a set of penultimate vertices (parents of end nodes of the shortest paths) as a tuple of a form (d, σ, π).
and (∞, 0, ∅), (0, 1, N IL) correspond to nonexistence of shortest paths (no path) and a path from a vertex to itself (self loop) respectively. We want to define addition ⊕ = min and multiplication = + rhs operations in S as follows.
Intiutively, + rhs concatenates paths represented by 3-tuples and min picks the shortest paths from given paths. If we concatenate no path with any path, the result is no path. If we concatenate a self loop with a valid path ( not a self loop or no path), the result is the valid path itself. If we concatenate two valid paths, their weights will be combined and the parent information from the rightmost path will be that of the resulting path(s) (tuple). Since 3-tuple represents number of shortest paths from one vertex to another through σ, the resulting σ will be the multiplication of sigmas of the two paths concatenated. Proof: The first part is a similar to the proof of Lemma 5.2 from [17] 1. min operation is clearly associative and commutative.

Consider
and do the operation in any order. The result is Since + rhs defined Equation 12 is a path concatenation operation, • if any π i = ∅, then π = ∅.
• if all π i = ∅. Then π = π i , where i is the largest value that π i = N IL.
Now we need to check the case when d 1 = d 2 such that (ii) Suppose p = (0, 1, N IL) = 1. Then Similarly for right distributivity.
We eliminated the case, (d + d * = 0, σ = 0) since the only element with σ = 0 is the non existing path, (∞, 0, ∅). Now we present algebraic versions of shortest path algorithms including Dijkstra, Bellman-Ford in the 3-tuple geodetic semiring. Let us denote this semiring as GS3 in short. Let G = (V, E) be a weighted, directed graph with edge weights w : E → R. We define an n × n adjacency matrix A of a graph as So that, elements of the adjacency matrix are from 3-tuple geodetic semiring. If the graph have self-loops, the self-loops are eliminated in the process of constructing the adjacency matrix.
Algorithm 1 Algebraic Dijkstra In Algorithm 1, the inputs are an n × n adjacency matrix A in 3-tuple geodetic semiring as defined in 14 and a source node s. Q is a n-dimensional vector used to indicate which vertices have been traversed so far. For example, if a vertex v is traversed, then Q[v] = ∞. Otherwise, Q[v] = 0. In Line 1, Q is initialized with 0 and Q[s] = ∞ which indicates that the source node is already traversed. In Line 2, d is the shortest path distance vector and we initialize it with s th row of A since A[s, :] contains shortest path distances from s to all other vertices after the level 1 traversal. The while loop between Line 3 -7 iterates until all the vertices are traversed. In Line 4, d d means the first (distance) component of 3-tuple shortest path distance vector and u is the next node closest to s in the shortest path tree where + operation is the usual plus operation. In Line 4, we also indicate that u is traversed. In Line 5, we get the new distance vector d when we explore the resulting paths to the next level through the vertex u. In Line 6, we take element-wise min between d and d and assign its result as the shortest path distance vector. + and min operations in Line 5 and 6 are 3-tuple geodetic semiring operations. At the end of the while loop, d contains the shortest path distance vector in the 3-tuple geodetic semiring, but d[s] will be incorrect. So we need to assign d[s] = (0, 1, N IL) in Line 8.
In Algortihm 2, the inputs are the same as the inputs to the Algebraic Dijkstra's algorithm 1. But as compared to Algebraic Dijkstra's algorithm 1, Algebraic Bellman-Ford algorithm 2 is much concise and straight-forward. In Line 1, the shortest path distance vector, d is initialized as s th row of A since A[s, :] contains shortest path distances from s to all other vertices after the level 1 traversal. In Line 2, we declared another temporary vector, d k which is used to keep track of the paths with k-hops in the k th level. In the for loop between Line 3 -6, Line 4 further explores k-hop paths and Line 5 takes the element-wise minimum between the shortest path distance vector d and k-hop path distance vector d k . As a result, after n − 1 iterations d contains the shortest path distances where n − 1 iteration is enough because any shortest path in a graph has maximum n − 1 length (hops).

Algebraic Algorithms for Betweenness Centrality
Algorithm 3 Brandes's Algorithm in unweighted graphs [8] 1:  In Section 1, we briefly discussed different methods to compute betweenness centrality and there are mainly two ways. One is the traditional method to compute betweenness centrality based on APSP and the another is Brandes' algorithm based on SSSP. Since computing betweenness centrality based on APSP is very straight forward, the discussion in Section 1 is sufficient. Brandes' algorithm is more interesting because it enables computational efficiency. We will include Brandes' algorithm below to refer to when explaining its algebraic version.
Let G = (V, E) be a weighted, directed graph with edge weights w : E → R and let A be the adjacency matrix of G in 3-tuple geodetic semiring defined the same as Equation 14. We want to write algebraic betweenness centrality algorithms based on some of the shortest path algorithms in 3-tuple geodetic semiring presented in Section 2. We are going to present two algebraic betweenness centrality algorithms named Algebraic BC -Dijkstra-Brandes (ABC-DB) and Algebraic BC -Bellman-Ford (ABC-BF). To compute betweenness centrality using Brandes algorithm, we need to choose a SSSP algorithm. Even though other semirings such as the tropical semiring for shortest paths can be used with Bellman-Ford algorithm to compute the shortest path distances, they don't count the number of shortest paths that we need in order to update BC scores. But shortest path algorithms (Dijkstra's 1 or Bellman-Ford 2) in 3-tuple geodetic semiring can count shortest path distances, number of shortest paths and penultimate vertices. Since Dijksra's algorithm can give the order of traversal, we used Dijkstra's algorithm in Brandes's algorithm with 3-tuple geodetic semiring as follows.

Algebraic BC -Dijkstra-Brandes (ABC-DB)
Algorithm 4 ABC-DB GS3 Input: A (n × n adjacency matrix) Output: C B (Betweenness Centrality) In Algorithm 4, Algebraic BC -Dijkstra-Brandes with GS3, in Line 1, we first initialize betweenness centrality for all vertices with 0. Then in the for loop between Line 2 and 23, we iterate over each vertices. Firstly, we find the shortest path tree rooted at the corresponding vertex s using Dijkstra's algorithm 1 where S is an array with length n − 1 for saving vertices in the order of traversal and i is a counter variable to indicate the order of traversal. Secondly, we updated betweenness centrality score using the shortest path tree rooted at s. Line 17, initializes the dependency of s ∈ V on every other vertex as 0. Dependency of a vertex is defined in Definition 1.5. In the for loop at Line 18, we iterate S in reverse direction, from back to front. As a result, we will update betweenness centrality scores starting from the bottom of the shortest path tree rooted at s. In Line 20 and 21, the dependency of s on the parents of a vertex w on shortest paths from s is accumulated using Theorem 1. Line 21 updates betweenness centrality of w by adding the dependency of s on w to betweenness centrality of w. These two parts are repeated for each all different source vertex s and the dependency of s for any v ∈ V is accumulated for each iteration of loop between Line 2 and 24 and corresponding updates for betweenness centrality are done. At the end of algorithm, all betweenness centrality scores are correctly calculated. It is clear to compare this algorithm with Brandes' algorithm 3. So we exclude the first part for computing SSSP in both algorithms and explain the second part for accumulating betweenness centrality scores. The while loop between Line 26-34 in Brandes' algorithm and the for loop between Line 18-23 in ABC-DB are both for traversing the vertices in non-increasing order of their distance from s [8]. Line 19 of this algorithm corresponds to Line 27 in Brandes' algorithm. Line 20-21 corresponds to Line 28-30 in Brandes algorithm. In Line 20, α is a constant number for the dependency of s ∈ V on any v ∈ V in Theorem 1 and d σ [w] is the σ component of the shortest path distance vector from s to w (w th entry of shortest path distance vector d). Line 21 accumulates dependency of s where d π [w] is a set of penultimate vertices, but we vectorized it into a n-dimensional vector whose entry is 1 for the penultimate vertices(parents) of a vertex w on shortest paths from s to w and 0 for all other vertices. d σ is a n-dimensional vector containing the σ component of d. Then d π [w] and d σ are multiplied element-wise. So that, for each parents of w on shortest paths, dependency of s is accumulated without for loop in Brandes' algorithm. Then the result is multiplied by α constant and added to δ. With these operations, dependency of s is accumulated as described in Theorem 1. Lastly, Line 22 corresponds with Line 31-33 of Brandes' algorithm. w in Line 19 is never going to be the same as s. Hence we don't need to check the condition in Line 31 in Brandes' algorithm.
Alternatively, we can use the usual geodetic semiring with 2-tuples [6] for writing Algebraic BC -Dijkstra-Brandes algorithm. Let us denote it as ABC-DB GS2. We can find the penultimate vertices set, π, the third component of 3-tuple geodetic semiring based on shortest path distances resulted by Dijkstra's algorithm using the usual geodetic semiring. Finding π is the same as computing the shortest path tree since shortest paths are usually represented by parent information in the shortest path tree. Algebraic BC algorithms works with graphs without self-loops or 0weight cycles. In this case, computing the shortest path tree is simple. Let d be the shortest path distance vector from source node s and u, v ∈ V such that u = v. By Bellman criterion, Lemma 1, if d(u) + w(u, v) = d(v), then u ∈ π, that is, u is the penultimate vertex on a shortest path from s to v. Similarly, we can find all penultimate vertices for v. For all u ∈ V , π(v) = argmin u =v {d(u) + w(u, v)}. This can be expressed algebraicly as which means, add d elementwise to every column of A and take argmin for each column [17]. ABC-DB GS2 is given in Algorithm 5.

Algebraic BC -Bellman-Ford (ABC-BF)
We want to write algebraic betweenness centrality algorithm based on algebraic Bellman-Ford shortest path algorithm (Algorithm 2) in geodetic semiring. Then we need to back propagate centrality scores. But it is not possible to know the order of node traversal when using algebraic Bellman-Ford. Hence, we need to back propagate betweenness centrality scores differently than the method used in Brandes's algorithm. Methods such as keeping track of the levels of shortest path tree, similar to the betweenness centrality matrix formulation in [17] don't work on weighted graphs. We used the method from [23] that is based on partial centrality factors from [22]. Basically we are reproducing their work, but our algorithm details differ from theirs in terms of algorithm details. For example, we used the usual geodetic semiring for Bellman-Ford instead of a monoid, employed mask to distinguish current frontiers. Partial centrality factors is defined as follows.
Then, we can rewrite betweenness centrality in terms of parial centrality factors as below.
Equation 16 is a restructure of the recursive relation of dependency of the source node, s. When back propagating betweenness centrality, we need to make sure that only finalized partial centrality factors can be back propagated and repeat the process till partial centrality factors are finalized for all vertices. Once partial centrality factors of the source node, s is computed, we accumulate betweenness centrality as in Equation 17. This process doesn't use the parent information from 3tuple geodetic semiring. So we write the algorithm using usual geodetic semiring with 2-tuples [6]. Let (d, p, c) be a 3-tuple representing shortest path distance, partial centrality factor and number of children on the shortest path tree. [23] defined addition and multiplication operations on such 3-tuples and used it for back propagating centrality scores.
where ⊗ is not associative. Hence, it is not a semiring. [20] proposed a new centrality measure called percolation centrality which is more useful in network percolation scenarios such as disease transmission in a network of cities, spreading of infection or a virus over a social network of individuals. Percolation centrality is defined based on shortest paths counts, similar to betweenness centrality, but it accounts for the changing percolation states of vertices. Hence, our ABC-DB algorithm 4 can be easily extended to compute percolation centrality. Time slices of a static network with a varying node property, called percolation state is used in percolation centrality and as defined in [20], x t i is the precolation state of a node i at time t where x t i ∈ [0, 1] and its value indicates different percolation states. If x t i = 0, it is a non-percolation state at time t. If x t i = 1, it is a fully percolated state at time t. If 0 < x t i < 1, it is a partially percolated state. Then based on percolation status of nodes, a concept called a percolated path is introduced. A path starting from a percolated node and ending at a percolated or non-percolated, or partially percolated node is called a percolated path. The percolation centrality of a node v at a given time t is defined as the proportion of percolated paths passing through that vertex and it is formally defined as  while flag is true do 24:

Algebraic Algorithm for Percolation Centrality
where σ sr is the number of shortest paths between s, r ∈ V , σ sr (v) is the number of shortest paths between s, r ∈ V that go through v and the fraction is called the relative contribution of a percolated path starting from s to the percolation centrality (it is denoted as w t sv in [20]). The sum [ x t i ] is the total percolation in the network and the percolation state of v at time t, is subtracted from the sum for normalization purpose. Now we want to modify dependency of a vertex, Definition 1.5 and define percolation dependency of a vertex.
is independent of r. i.e the portion of percolated shortest paths starting from s and going through v such that Then by Theorem 1 where P s (w) is the set of parents of a vertex w on shortest paths from s to w. Hence, the percolation dependency of a vertex also follows a recursive relation and we can compute percolation centrality at a given time using Brandes' algorithm. Algebraic Percolation Centrality (APC) algorithm is presented below and the blue lines are the extension parts that differs from or added to ABC-DB, Algorithm 4.

Algorithm 7 Algebraic Percolation Cenrality (APC)
Input: A (n × n adjacency matrix); x 1 , x 2 , . . . , x λ (Percolation states of nodes at each time step) Output: C P (Percolation Centrality, an n × λ matrix) 1: C P = 0 2: for s ∈ V do 3: for t = 1 to λ do  end for 27: end for 28: C P = 1 n − 2 C P In Algorithm 7, we compute percolation centrality of each node at each time slice. Since the network is static but percolation states vary over time, a single adjacency matrix A and percolation states if each nodes at different time steps (can be inputed as a n × λ matrix) are given where λ is the number of time slices. In Line 1, C P is a n × λ matrix for storing percolation states and its j th column corresponds to percolation centralities of all n nodes at time j th time. For each source vertex s and for each time step, we compute the shortest path tree rooted at s at time t and we start from leaf nodes of the tree and back track vertices in the reverse order of traversal. During this process, we accumulate percolation dependency of s (modified with the relative contribution) on all relevant nodes and update the corresponding percolation centrality of the current node at the corresponding time. Most part of the algorithm is the same as ABC-DB 4, we focus on explaining the extension parts in blue. Line 4 computes the relative contribution of a percolated path starting at s as described in Equation 21 where x t is the percolation states of nodes at time t and c t s is a n-dimensional vector whose v th entry is c t sv , the relative contribution of a percolated path starting at s and going through v, to the percolation centrality of v at time t. Line 24 accumulates percolation centrality according to Equation 23.

Implementation
The algebraic algorithms introduced in this paper all have time complexity bounded by that of matrix products. In many cases there exist algorithms that perform better on serial computers that are not expressed algebraically. These algorithms often exploit the sparsity of the graph adjacency matrix and avoid the cubic complexity of direct operations expressed in terms of products of the dense adjacency matrix. On the other hand, these serial algorithms often are difficult to parallelize due to data structures or algorithmic choices that induce operational dependencies that reduce opportunities for concurrent execution. The use of priority queues are a common example of this. Algebraic graph algorithms are based on matrix-matrix or matrix-vector multiply and can leverage known methods for implementing these algebraic primitives in parallel (and potentially using sparse data structures where appropriate). In this paper we implemented the algorithms that we define to study their correctness, performance, and scalability on small-scale shared memory parallel systems.
The starting point of the implementation is the fundamental algebraic structure, the 3-tuple semiring. Early experiments in implementation used the relatively new Julia language [7] because it allows us to define a user-defined type for the semiring and overload the usual +, * operations to use our custom semiring operations ⊕, ⊗ respectively. This has the appealing property that the implementation of the algorithms will be concise and will closely resemble the pseudo-code used to define the algorithm in the paper. Unfortunately, we encountered a number of issues in achieving high performance in Julia that often required the addition of explict types, manually specialized functions, and other change that ultimately lowered the abstraction level of the code. We found that implementing the algorithms in Python with Numpy for arrays and Numba for just-in-time compilation to parallel code was preferable.
One example of an efficiency issue that we encountered was mapping the 3-tuple geodetic semiring to a specific data structure. In particular the third component, π of 3-tuple geodetic semiring is a set structure which requires set union operation. We need to have highly optimized, set-like operation to get good performance. The overhead of a tree-based set data structure proved to be a major performance bottleneck. We implemented Algebraic BC -Dijkstra-Brandes (ABC-DB GS3), Algorithm 4 using dense arrays. For example, adjacency matrix of a graph in the 3-tuple geodetic semiring is split into three separate dense arrays where A d is the usual adjacency matrix of edge weights, A σ is a boolean matrix representing the number of shortest paths between edges, A π is also a boolean matrix such that nonzero entries of i-th row vector represents the penultimate vertices. This representation trades space for efficiency in performing the set operations for the π component.
For a shortest path distance vertor d in 3-tuple geodetic semiring, it is represented by 2 separate vectors and a boolean matrix where d is the shortest distance vector from s to other vertices, σ is a vector of the number of shortest paths from s to other vertices and π is a boolean matrix in which nonzero entries of i-th row vector represents the penultimate vertices of the shortest paths from s to the vertex i. We defined functions corresponding to operations in 3-tuple geodetic semiring such as elementwise min (⊕ = min) and elementwise add ( = + rhs ) required in ABC-DB.
Similarly, we also implemented ABC-DB GS2, Algorithm 5 using dense arrays. It differs from ABC-DB GS3 by working with 2-tuple geodetic semiring and finding penultimate vertices after finding shortest path distances in geodetic semiring.

Experiment
We choose Algebraic BC -Dijkstra-Brandes algorithm for our experiment since it is relatively simpler than Algebraic BC -Bellman-Ford that we don't need to do extra computation for finding the frontiers when back propagating centraliy scores. Particularly, we choose ABC-DB GS2, Algorithm 5 for our experiment. This eliminates the need to implement set operations required by GS3, thus simplifying the storage and time requirements of the algorithm. Our experiments showed that the GS3-based algorithms performed very poorly compared to those using GS2, and profiling results pointed at the set data structures and corresponding operations as a cause. We implemented ABC-DB in Python and optimized it with Numba, an open source JIT compiler that translates a subset of Python and NumPy code into fast code via the LLVM compiler [18]. Numba can produce optimized serial code, multicore parallel code, as well as kernels for execution on GPU accelerators. We run serially ABC-DB with Numba's nopython option and we also parallelized it with Numba's parallel option with 20 cores. We used the Washington State University Kamiak HPC cluster provided by Center for Institutional Research Computing (CIRC) [3]. We experimented on a single node which has 256 GB memory and two 10-core Intel Xeon CPUs (Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz) running the CentOS Linux 7 operating system. We compare the performace of ABC-DB GS2 with NetworkX [16], a commonly used python package for network algorithms. We used Erdös-Rényi graph randomly generated by NetworkX with 0.5 probablity of edge creation. Both serial and parallel ABC-DB GS2 is faster than NetworkX for Erdös-Rényi graphs refer to Figure 1. However, when we experiment with GRENOBLE set of sparse unsymmetric graphs from SuiteSparse Matrix Collection [11], serial ABC-DB GS2 is slower than NetworkX and NetworkX performs almost as good as parallel ABC-DB GS2 shown in Figure 2. ABC-DB GS2 scales well in Figure 3 and its scability is stable as graph size increases in Figure 4. The performance of NetworkX versus our Numba accelerated algorithms shows what we expect: • For very sparse graphs, the Python code with better asymptotic complexity outperforms the linear algebraic algorithms even when they are compiled. This is due to the use of dense matrices in the NumPy/Numba implementation.
• For less sparse graphs, the optimized linear algebra based methods scale better as the graph size grows. This likely is due to the overhead of the interpreted Python code in NetworkX and less wasted work in the dense matrix methods used by our code.
• The benefits of parallel execution of the algebraic methods is a function of the graph size. For small graphs the benefit of parallelism is exhausted with small core counts, but as the graphs grow we see speedups that benefit from ever larger core counts. As showin in Figure 4, for small graphs the scaling stops at 4 cores, but continues up to and beyond the maximum core count of our testbed for larger graph sizes.      These basic experiments show that our methods implementing algebraic BC show the same benefits as other graph algorithms expressed via linear algebra: they map easily onto parallel architectures using known parallelization methods, and thus can exploit parallel architectures for processing large graphs that are beyond the capability of single processor systems. We expect that implementing them using existing parallel sparse linear algebra libraries will allow our BC algorithms to scale further for large sparse graphs.

Conclusion
We presented a 3-tuple geodetic semiring (GS3) and used it to write algebraic betweenness algorithms (ABC). Among different variants, we find ABC based on Dijkstra's and Brandes' algorithm (ABC-DB) is better choice than the other options such as ABC based on Bellman-Ford and back tracking frontiers. We revealed the problem of implementing ABC based on GS3 due to the set nature of parent information in GS3. Hence, we proposed an alternative version of ABC-DB algorithm using GS2. We implemented it using dense arrays to use numba's nopython option and auto parallelization. Our experiment result shows that ABC-GS2 is suitable for medium size denser graphs and it scales well as we increase the number of cores. The future work is to implement ABC based on sparse matrices. Furthermore, designing a specific data structure to efficiently handle complex structures such as GS3 is an open problem. Since we can use GS2 and compute parent information at the end in ABC-DB, GS3 may not be the first choice for implementing ABC, but there could be different graph algorithms which utilizes GS3 better using parent information carried by GS3 in every iterations or one could propose a new graph algorithm based on GS3.