Block partitioning of sparse rectangular matrices

We present a means of reordering large, sparse rectangular matrices such that their nonzeros are closer to the diagonal. This enables a block‐partitioning which is useful in parallel contexts. We use the Reverse Cuthill‐McKee (RCM) algorithm on the adjacency matrix of the associated bipartite graph. The resulting, reordered matrix has a block bidiagonal structure.


Introduction
In large scale simulations, sparse matrices appear frequently as the result of discretizations of the underlying PDE. The associated system of linear equations can be solved using the Augmented Block Cimmino Distributed method [2] which adds columns to the matrix such that the row-blocks are mutually orthogonal. To minimize the number of these additions, it is important to have the nonzeros close to the diagonal. We will describe a procedure to achieve this goal.
Let A ∈ M m×n be a sparse rectangular matrix. We can attach to it the following undirected bipartite graph G = (E, R, C), with R = {1, 2, ..., m} the set of nodes denoting row indices, C = {1, 2, ..., n} the set of nodes denoting column indices and E = {(i, j) | A i,j = 0, i ∈ R, j ∈ C} the set of edges, corresponding to matrix nonzeros. We note that the following considerations assume the graph is connected. If that is not the case, the procedure can be applied to every connected component of the graph.
The same information can be represented by the graph G = (E, V) with V = {1, 2, ..., m, m + 1, m + 2, ..., m + n}, the set of nodes denoting both row and column indices, i.e. V = R ∪ C, C = {c + m, c ∈ C} and E = {(i, j) | , i ∈ R, j = j + m, j ∈ C, A i,j = 0} the set of edges. The adjacency matrix for G is the symmetric matrix

Cuthill-McKee
We seek to reorder A such that its nonzeros are closer to the diagonal. To this end, we apply the Cuthill-McKee (CM) algorithm (see [1,Chapter 8] ) toÂ and use the result to reorder A, thus reducing its bandwidth, and giving it a particular structure. Reversing this order leads to Reverse Cuthill-McKee, a variant of the algorithm where the matrix is better suited for direct solving , as it generates less fill-in. The CM algorithm works by relabeling the nodes of an undirected graph, so it can be used on the associated, symmetric adjacency matrix. This algorithm can be thought of as a particular form of Breadth First Search, where the neighbors of each node are visited according to an increasing order of their degree, and the starting node is the one having the minimum degree. Visiting a node's neighbors implies relabeling them with the smallest unused label. For example, given a node i with neighbors p, t, q, CM will sort them in increasing order of degree, labeling them as i + 1, i + 2, i + 3, if these labels are still available. A level set is the set of nodes neighboring at least one node of the previous level set. The first level set contains only the starting node.

CM on rectangular matrices
The particular form of adjacency defined by the bipartite graph also implies that level sets alternate between sets of row indices and sets of column indices. The reordered matrixÂ isÂ R , with every row containing indices of the relabeled neighbors of the node on the diagonal. Every level set is represented by a square block on the diagonal, of size equal to the number of nodes in the set. Without loss of generality, in what follows we will assume that the first node chosen by the CM reordering corresponds to a row index. If that is not the case, the initial matrix should be transposed before and after the procedure.
Let A R be the matrix generated from the blocks ofÂ R as shown below (see also [3]), with n i , the sizes of the blocks corresponding to level sets. The matrix A R is the reordered form of A, having its nonzeros closer to the diagonal.

Results
In Fig. 1, we show a case where we reordered a random matrix generated with Matlab. After constructingÂ as in Eq. 1 and applying RCM, we can see the reordered initial matrix in the 3rd image, with its blocks in red borders. Notice the significantly narrower bandwidth. In Fig. 2, we illustrate the same procedure on a matrix from the Suite Sparse Matrix Collection 1 related to a combinatorial problem. Again, the nonzeros were brought significantly closer to the diagonal.

Conclusion and future work
In the above, we have described a procedure for reordering large, sparse, rectangular matrices such that they have a structure suitable for block partitioning. We have achieved this by applying the Reverse Cuthill-McKee algorithm on the adjacency matrix of the associated bipartite graph. Although the results are encouraging, we also notice that there is a significant difference in the size of the blocks, which might have a negative impact on performance in a parallel computing setting. Thus, a potential avenue for future developments would be to find a way of obtaining similarly sized blocks. Also, it would be desirable to implement this algorithm in an optimized fashion, such that it can be used in large scale computations.