Multilayer Analysis and Visualization of Networks

Multilayer relationships among and information about biological entities must be accompanied by the means to analyze, visualize, and obtain insights from such data. We report a methodology and a collection of algorithms for the analysis of multilayer networks in our new open-source software (muxViz). We demonstrate the ability of muxViz to analyze and interactively visualize multilayer data using empirical genetic and neuronal networks.

In the last two decades, scientific understanding of life and disease has benefited significantly from the modeling of biological interactions in terms of networks [1][2][3][4][5][6]. Connections among genes, proteins, neurons, and other biological entities can indicate that they are part of the same biological pathway or exhibit similar biological function. The formalism of networks focuses on connectivity and has now become a paradigmatic way to investigate the organization and functionality of cells, synaptic connectivity, and more [7][8][9][10][11].
In parallel, a large variety of computational techniques have been developed to analyze (and visualize) networks and the biological information that they encode. Such methods have become important tools for attempts to understand and represent cell functionality, and they have repeatedly yielded meaningful biological insights [12]. However, although the standard network paradigm has been very successful, it has a fundamental flaw: it forces the aggregation of multilayer information to construct network representations that include only a single type of connection between pairs of entities. This can lead to misleading results, and it is becoming increasingly apparent that a more complicated representation is necessary (see the review by Kivelä et al. [13] and references therein).
The increasing use of more complicated network representations has yielded a new set of challenges: how should one visualize, analyze, and interpret multilayer biological data? For instance, in a recent study, the genetic and protein-protein interaction networks of Saccharomyces cerevisiae were investigated simultaneously [11] to uncover connection patterns. In another example, Costanzo et al. [11] reported that genetic interactions have an overlap of 10-20% with protein-protein interaction pairs, which is significantly higher than the 3% overlap that they expected based on a random null model. This suggests that many positive and negative interactions occur between (rather than within) complexes and pathways [11] and thereby gives an important example of how exploiting multilayer information might improve understanding of biological structure and functionality.
Although the aforementioned overlap is an indication of correlation between a pair of networks, the analysis of multilayer biological data would benefit greatly from techniques and diagnostics that are able to exploit multiplexity (i.e., multiple different ways to interact) in available information. Recently, a novel mathematical framework to model and analyze multilayer relationships and their dynamics was developed [14,15]. One represents the underlying network topology and interaction weights as a multilayer network, in which entities can exhibit different relationships simultaneously and can exist on different "layers". Multilayer networks can encode much richer information than what is possible using the individual layers separately (which is what is usually done). This, in turn, provides a suitable framework for versatile, sophisticated analyses that have already been used successfully to reveal multilayer community structure [14] and to measure important nodes and the correlations between them [15][16][17]. However, to meet the requirements of an operational toolbox to be applied to the analysis of biological systems, it is of paramount importance to also develop a system to visualize multilayer networks and represent the results of analysis in a meaningful way.
The primary contribution of the present work is to address the computational challenge of analysis and visualization of multilayer information by providing a practical methodology, and accompanying software that we call muxViz, for the analysis and the visualization of the salient features of multilayer networks. In Supplementary Notes 5 and 6, we give technical details about the muxViz software and its graphical user interface, together with a few representative examples of analysis of multilayer biological networks (see Supplementary Note 4). In multilayer networks, nodes can exist on several layers simultaneously, and counterpart nodes from different layers are connected to each other via inter-layer edges. One can visualize a multilayer network in muxViz either using explicit layers or as an edge-colored multigraph [13,18], in which edges are "colored" according to the type of relationship that they represent (see Supplementary Figure 1 for applications to genetic and neuronal multilayer networks).
To demonstrate the ability of muxViz to analyze and visualize multilayer networks, let's consider different types of genetic interactions for organisms in the Biological General Repository for Interaction Datasets [19] (BioGRID, thebiogrid.org), a public database that archives and disseminates genetic and protein interaction data from humans and model organisms. BioGRID currently includes more than 720,000 interactions that have been curated from both high-throughput data sets and individual focused studies using over 41,000 publications in the primary literature. We use BioGRID 3.2.108 (updated 1 Jan 2014). In the present paper, we focus on Xenopus laevis and show a network visualization. One can examine the global organization of nodes into modules (i.e., "communities") through an algorithmic calculation of community structure [14]. For example, one can obtain dense communities in multilayer networks by optimizing a multilayer generalization of the modularity quality function [14]. To do this, one takes into account both intra-layer and inter-layer edges, and one seeks densely connected sets of nodes (i.e., communities) that are sparsely connected to each other as compared to some multilayer random-graph model [14]. See Supplementary Note 4 for a visualization of communities in Xenopus laevis and other organisms.
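To make the idea of a multilayer modularity quality function concrete, the following minimal sketch evaluates (but does not optimize) one commonly used form of it. It is an illustration under stated assumptions, not the muxViz implementation: we assume undirected layers that each contain at least one edge, a per-layer configuration-model null term with resolution parameter `gamma`, and uniform categorical inter-layer coupling of weight `omega`. The function and parameter names are hypothetical.

```python
def multilayer_modularity(layers, communities, omega=1.0, gamma=1.0):
    """Evaluate a multilayer modularity for a given partition.

    layers:      list of symmetric adjacency matrices (lists of lists),
                 all on the same n nodes; each layer needs >= 1 edge.
    communities: communities[s][i] is the label of node i in layer s.
    omega:       assumed uniform weight of categorical inter-layer edges.
    gamma:       assumed resolution parameter of the null model.
    """
    n_layers, n = len(layers), len(layers[0])
    # Intra-layer strength of every node and total strength 2*m_s per layer.
    k = [[sum(adj[i]) for i in range(n)] for adj in layers]
    two_m = [sum(k_s) for k_s in k]
    # Normalization: all intra-layer strength plus all inter-layer strength.
    two_mu = sum(two_m) + omega * n * n_layers * (n_layers - 1)

    total = 0.0
    # Intra-layer term: observed weight minus null-model expectation.
    for s, adj in enumerate(layers):
        for i in range(n):
            for j in range(n):
                if communities[s][i] == communities[s][j]:
                    total += adj[i][j] - gamma * k[s][i] * k[s][j] / two_m[s]
    # Inter-layer term: reward a node for keeping its label across layers.
    for j in range(n):
        for s in range(n_layers):
            for r in range(n_layers):
                if s != r and communities[s][j] == communities[r][j]:
                    total += omega
    return total / two_mu


def adjacency_from_edges(n, edges):
    """Build a symmetric 0/1 adjacency matrix from an edge list."""
    adj = [[0] * n for _ in range(n)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1
    return adj


# Toy layer: two triangles joined by a single edge.
toy_layer = adjacency_from_edges(
    6, [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
)
toy_partition = [0, 0, 0, 1, 1, 1]

q_single = multilayer_modularity([toy_layer], [toy_partition])
q_two_layers = multilayer_modularity(
    [toy_layer, toy_layer], [toy_partition, toy_partition], omega=1.0
)
```

With a single layer, the inter-layer term vanishes and the function reduces to ordinary modularity. A community-detection method would search over `communities` to maximize the returned value.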
One can quantify the importance of a node by using various diagnostics to measure "centrality". One calculates such a centrality (and a corresponding rank order) for each node by using multilayer generalizations of centrality measures [13,15,16]. The software muxViz has tools for calculating several different types of centrality (e.g., degree, eigenvector, hub and authority, PageRank, and Katz) either for an entire multilayer network or for each layer separately. As we illustrate in Figure 1B, centrality values (as well as other network measures) can be very different in multilayer networks than in their corresponding aggregations. Such results influence how one should interpret calculations of network measures for, e.g., which genes or proteins are most important for activating or suppressing a given biological process. The data in question is multilayer, so the analysis of such data must take multilayer features into account.
Researchers are often also interested in considering a "reduced version" of a multilayer data set that preserves as much information as possible without altering the primary descriptors. For such scenarios, it is possible to use dimensionality reduction to identify the layers of a multilayer network that are providing redundant information [20]. For example, one can calculate a pairwise distance between layers and can in turn hierarchically cluster the layers using this distance. (As explained in Supplementary Note 2, users can choose among several clustering methods.) One then merges the layers and represents the merging process as a "reducibility dendrogram" like the one in Figure 1C. The muxViz software controls this procedure via a quality function that guarantees the merging of redundant layers with minimum loss of information with respect to the full multilayer representation (see Supplementary Note 2). Naturally, one can also use other ways of preserving information in such a reduction process.
In Figure 1D, we show degree-degree Spearman correlation coefficients between layers to quantify the tendency of nodes to be hubs in different layers simultaneously. The muxViz software also includes additional correlation measures (see Supplementary Note 4), and it is easy for users to implement other indicators [17].
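As a minimal sketch of a degree-degree Spearman correlation between two layers, the following code computes node degrees in each layer, converts them to ranks (ties receive their mean rank), and takes the Pearson correlation of the ranks. It assumes undirected, unweighted layers on the same node set, and the helper names are hypothetical rather than part of muxViz.

```python
from math import sqrt


def degrees(adj):
    """Degree of each node in an undirected, unweighted layer."""
    return [sum(1 for w in row if w != 0) for row in adj]


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy layers on four nodes: node 0 is a hub in the first layer
# and isolated in the second layer.
star_layer = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
triangle_layer = [[0, 0, 0, 0], [0, 0, 1, 1], [0, 1, 0, 1], [0, 1, 1, 0]]

rho_same = spearman(degrees(star_layer), degrees(star_layer))
rho_opposite = spearman(degrees(star_layer), degrees(triangle_layer))
```

A value near 1 indicates that hubs in one layer tend to be hubs in the other layer, and a value near -1 indicates the opposite tendency.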
To summarize all of the information that one obtains from calculations like the ones above in a compact figure, we developed an annular visualization that makes it easier to capture patterns and deduce qualitative information about multilayer data. In Figure 1E, we show an example for centrality diagnostics, which measure the importance of nodes in various ways. Each ring indicates a centrality measure, and the angle determines the identity of a node in a network, regardless of the layer(s) in which it exists. In this visualization, we have binned the centrality values, and the color indicates the value. To maximize the readability of the annular plots, we adopted several criteria (see Supplementary Note 3), although users are free to choose custom options. One can use the same principles when fixing some centrality descriptor and letting the rings correspond to the layers in a network, the multilayer network, and an aggregated network (see Figure 1F, where we consider strength centrality). For the case of layers, one calculates centrality for each layer separately without accounting for multilayer structure. For instance, it is evident that rings 3 ("DirInt" layer) and 5 ("PhAssoc" layer) are negatively correlated because nodes tend to have opposite colors, whereas rings 6 (aggregated network) and 7 (multiplex network) are positively correlated, as expected for strength centrality. Our annular representation makes it easy to see similarity (or dissimilarity) in rank orderings according to different diagnostics. For example, the ring patterns reveal that physical association and direct interaction are dominant and determine the multilayer strength. In other cases (see Supplementary Note 4), the ranking by some centrality measure in the multilayer network is poorly correlated to the ranking in either an aggregated network or in individual layers separately. This underscores the importance of using a multilayer framework for the calculation of the most central proteins.
In the current era of "big data", there is now an intense deluge of multilayer data. To avoid throwing away important information or obtaining misleading results, it is increasingly crucial to use methods that exploit multilayer structure. In this paper, we present new software and associated methodology that exploits the new paradigm of multilayer networks, and we illustrate how it can be used to analyze and visualize multi-relational genetic networks. Our software, muxViz, provides an open-source framework for multilayer analysis. Additionally, the modular structure of muxViz (along with its open-source license) makes it easy to add new methods for analysis. Moreover, although we have illustrated the power of muxViz for the analysis of biological networks, it is also useful for multilayer networks from any other setting. As we illustrate in Supplementary Figure 10, a multilayer visualization can even be overlaid on spatial information.
All authors were supported by the European Commission FET-Proactive project PLEXMATH (Grant No. 317614). AA also acknowledges financial support from the Generalitat de Catalunya 2009-SGR-838, the ICREA Academia, and the James S. McDonnell Foundation. MAP acknowledges a grant (EP/J001759/1) from the EPSRC. We thank Serafina Agnello for her support with graphics.

In the last two decades, the analysis of complex systems has benefited significantly from modeling interactions and relationships using networks [21]. This has led to significant improvements in scientific understanding of numerous phenomena, including genetic and metabolic processes [7-9, 11, 22-26], neuronal connectivity of living beings [27][28][29][30][31][32][33][34][35][36], and more.
However, despite these many successes, the framework of ordinary networks (which encompass only one type of connection between each pair of entities and which are concerned with static snapshots or aggregations of networks) leaves out crucial information, and this omission can lead to misleading or even incorrect results. As a result, the much richer framework of multilayer networks [13] has attracted an ever-increasing amount of interest among scientists. Multilayer networks make it possible to analyze much more complicated systems in a realistic way, as now it is possible to encompass both temporal dynamics of relationships and multiple types of relationships in a natural way so that one can throw away much less information than is required in the usual network framework. Multilayer networks have already yielded fascinating insights and are experiencing burgeoning popularity [13]. For example, there have been numerous studies to attempt to understand how interdependencies (e.g., [37, 38]), multilayer structures (e.g., [39, 40]), dynamics (e.g., [41, 42]), and control (e.g., [43]) can improve understanding of complex interacting systems. See the recent review article [13] for extensive discussions and a thorough review of results.
The muxViz software focuses predominantly on multiplex networks, which are networks with multiple types of relationships and are arguably the most important (and prevalent) type of multilayer network. A large variety of real-world systems in the biological, social, information, physical, and engineering sciences can be described as multiplex networks. In such a network, a node can exist on different layers simultaneously, and a node in a given layer is connected to its counterparts on other layers via inter-layer edges. In muxViz, we consider two different types of inter-layer connectivity: ordinal and categorical. In ordinal multilayer networks, inter-layer edges exist only between layers that are adjacent to each other with respect to some criterion (e.g., temporal ordering). By contrast, categorical multilayer networks include inter-layer edges between counterpart nodes from every pair of layers. For the sake of simplicity, we illustrate muxViz using inter-layer edges of weight 1 in this paper. In general, how to choose such weights is an open research question. See the discussion in Ref. [44].
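The difference between ordinal and categorical inter-layer connectivity can be sketched by building a single block matrix (often called a supra-adjacency matrix, a term that we introduce here for illustration) in which diagonal blocks hold the layers and off-diagonal blocks hold inter-layer edges between counterpart nodes. This is a minimal sketch, not muxViz code: it assumes that every node exists on every layer, that the order of `layers` defines layer adjacency, and that all inter-layer edges share one weight `omega`. The function name is hypothetical.

```python
def supra_adjacency(layers, coupling="categorical", omega=1.0):
    """Block matrix of a multiplex network with n nodes and L layers.

    The copy of node i in layer s has index s * n + i.
    coupling="ordinal":     couple counterparts in consecutive layers only.
    coupling="categorical": couple counterparts in every pair of layers.
    """
    if coupling not in ("ordinal", "categorical"):
        raise ValueError("coupling must be 'ordinal' or 'categorical'")
    n_layers, n = len(layers), len(layers[0])
    size = n * n_layers
    supra = [[0.0] * size for _ in range(size)]

    # Diagonal blocks: intra-layer edges.
    for s, adj in enumerate(layers):
        for i in range(n):
            for j in range(n):
                supra[s * n + i][s * n + j] = float(adj[i][j])

    # Off-diagonal blocks: inter-layer edges between counterpart nodes.
    for s in range(n_layers):
        for r in range(n_layers):
            if s == r:
                continue
            if coupling == "ordinal" and abs(s - r) != 1:
                continue
            for i in range(n):
                supra[s * n + i][r * n + i] = omega
    return supra


# Toy multiplex network: two nodes on three layers.
toy_layers = [
    [[0, 1], [1, 0]],
    [[0, 0], [0, 0]],
    [[0, 1], [1, 0]],
]
categorical = supra_adjacency(toy_layers, coupling="categorical")
ordinal = supra_adjacency(toy_layers, coupling="ordinal")
```

In the toy example, the copy of node 0 in the first layer (index 0) is coupled to its copy in the third layer (index 4) only in the categorical case.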
For instance, let's examine the genetic-interaction and profile-correlation networks of a cell, which were aggregated into a single network in Ref. [11], as different layers of a multilayer network. In Supplementary Figure 1A, we show multilayer visualizations that we created using muxViz. Other representations are also possible [13]. For example, when representing this data as an edge-colored multigraph, we "color" edges according to the type of relationship that they represent (see Supplementary Figure 1B). In Supplementary Figure 1C and Supplementary Figure 1D, we show the two visualizations for the connectome of Caenorhabditis elegans. In this example, each layer corresponds to a different type of synaptic connection [10].
[Supplementary Figure 1 (caption fragment): The data comes from Ref. [10]. In panels B and D, we color the nodes according to the layer to which they belong. If a node is part of multiple layers simultaneously, then we use an equal distribution of the corresponding colors for the node.]
In panels (A) and (C) of Supplementary Figure 1, we use a layout in which the positions of the nodes are the same in each layer. We determine the positions of nodes by combining two of the standard force-directed algorithms available in muxViz and applying them to an aggregated network that we obtained by summing the corresponding entries of the adjacency matrices of the individual layers. Specifically, we first apply the Fruchterman-Reingold algorithm [45] to the aggregated network and then use the output of this algorithm as a seed layout for the Kamada-Kawai algorithm [46] to achieve a better spatial separation of nodes in the final layout. The muxViz software also allows other layout choices: for example, the layout of each layer can be independent, or one can use any individual layer or an aggregation of any subset of layers to determine node locations.

Supplementary Note 2: Dimensionality reduction and reducibility dendrogram
An important open question is the determination of how much information is necessary to accurately represent the structure of multilayer systems and whether some layers can be aggregated without incurring loss of information. It was shown recently that it is possible to reduce the number of layers in multilayer networks in a way that minimizes information loss by using an information-theoretical approach [20]. The methodology of Ref. [20], which we implemented in muxViz, amounts to a tradeoff (which is "optimal" in some sense) between accuracy and complexity. Users of muxViz can also implement alternative methods based on different interpretations of "minimal loss of information".
The reduction procedure from Ref. [20] proceeds as follows. For each pair of layers in the original multilayer network, muxViz calculates the quantum Jensen-Shannon divergence [47], which quantifies the dissimilarity between two networks based on their von Neumann entropy [48]. By definition, the quantum Jensen-Shannon divergence is symmetric, and its square root, which is usually called the Jensen-Shannon distance, satisfies the properties of a metric [49]. The Jensen-Shannon (JS) distance can be used to quantify the distance in terms of information gain/loss between the normalized Laplacian matrices associated to two distinct networks [20].
One places the distances between every pair of layers as the components of a matrix, so one can then perform hierarchical clustering [50] using any desired method to produce a dendrogram that indicates the relatedness of the information in the different layers. In muxViz, we have included several methods for hierarchical clustering (e.g., Ward, McQuitty, single, complete, average, median, and centroid linkage). We show an example of such a "reducibility dendrogram" in panel (D) of Supplementary Figure 3. A reducibility dendrogram merges a set of layers in a multilayer network step by step, and we calculate a quality function based on the relative von Neumann entropy to estimate information gain (or loss) at each step [20]. To obtain a reduced version of the original multilayer network, we stop the merging procedure at the level of the hierarchy that maximizes the relative entropy.
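The following sketch illustrates how a Jensen-Shannon distance between two layers can be computed from von Neumann entropies. It is an illustration under assumptions, not the muxViz code: we assume that each undirected layer is represented by its combinatorial Laplacian rescaled to unit trace (our reading of "normalized Laplacian" here), that each layer has at least one edge, and that entropies use base-2 logarithms. NumPy is assumed to be available, and all function names are hypothetical.

```python
import numpy as np


def density_matrix(adj):
    """Combinatorial Laplacian of a layer, rescaled to have unit trace."""
    adj = np.asarray(adj, dtype=float)
    laplacian = np.diag(adj.sum(axis=1)) - adj
    return laplacian / np.trace(laplacian)


def von_neumann_entropy(rho):
    """Entropy -sum(lambda * log2(lambda)) over positive eigenvalues."""
    eigenvalues = np.linalg.eigvalsh(rho)
    positive = eigenvalues[eigenvalues > 1e-12]  # treat 0 * log(0) as 0
    return float(-np.sum(positive * np.log2(positive)))


def quantum_js_divergence(adj_a, adj_b):
    """Entropy of the mixture minus the mean entropy of the two layers."""
    rho, sigma = density_matrix(adj_a), density_matrix(adj_b)
    mixture = 0.5 * (rho + sigma)
    return von_neumann_entropy(mixture) - 0.5 * (
        von_neumann_entropy(rho) + von_neumann_entropy(sigma)
    )


def js_distance(adj_a, adj_b):
    """Square root of the divergence (clipped at 0 against round-off)."""
    return float(np.sqrt(max(quantum_js_divergence(adj_a, adj_b), 0.0)))


# Toy layers on three nodes: a path and a triangle.
path_layer = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
triangle_layer = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]

d_same = js_distance(path_layer, path_layer)
d_different = js_distance(path_layer, triangle_layer)
```

Evaluating `js_distance` for every pair of layers yields the kind of symmetric distance matrix that a hierarchical clustering method can take as input.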

Supplementary Note 3: Annular visualization of multivariate information
It is a challenging problem to represent, visualize, and analyze the wealth of information encoded in the multilayer structure of networks in a compact way. Preserving more information by using multilayer networks rather than ordinary networks then complicates the visualization and analysis even further. However, this complication is necessary, because otherwise one might end up with misleading or even incorrect results [13]. We developed the muxViz software to help address these challenges. To summarize all of the information obtained from multilayer-network calculations in a compact figure, muxViz includes an annular visualization that facilitates the ability to capture patterns and deduce qualitative information about multilayer data.
To give a concrete example, many researchers are interested in ranking the relative importance of nodes (and other network structures), which traditionally is accomplished using various "centrality" measures. Centralities have been calculated for single-layer networks for several decades, and numerous notions of centrality are now also available for multilayer networks [13,16]. It is therefore necessary to develop visualization tools that make it possible to compare such a wealth of diagnostics to each other in a compact, meaningful way. For example, it is often of interest to focus attention on one descriptor and compare the values obtained in each layer separately to the values obtained from the multilayer network and its aggregations. The muxViz software makes it easy to do this.
We will now illustrate the annular visualization (e.g., see Supplementary Figure 5) using the example of multilayer centrality measures. Suppose that we have different arrays of information, where one should think of each array as having resulted from the calculation of some centrality diagnostic on a multilayer network. We visualize each array in a ring. The angle indicates node identity (regardless of the layer or layers in which it occurs). We bin the centrality values (e.g., either linearly or logarithmically), and we assign a color to each bin to encode its value. Both the type of binning and the color scheme are customizable in muxViz. We place the rings concentrically, and one can determine both their order and their thicknesses according to any desired criteria. For example, in the visualizations in the present paper, we determine the thickness of each ring according to its information content (which we quantify using the Shannon information entropy of the distribution of the values): thinner rings have less information. Users can customize the order of the rings, but as a default it is determined automatically via hierarchical clustering. The muxViz software calculates a measure of correlation (e.g., Pearson, Spearman, or Jensen-Shannon divergence) between each pair of descriptors to obtain a set of pairwise distances, which we then hierarchically cluster to group the rings. This clustering procedure determines the order of the rings to try to maximize the readability of the annular plot.
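The binning step and the entropy-based ring thickness can be sketched as follows. This is a minimal illustration, not the muxViz implementation: we assume equal-width bins on either a linear or a logarithmic scale, and we assume that the Shannon entropy is computed from the fraction of nodes in each bin. The function names and the default of five bins are hypothetical.

```python
from math import log, log2


def bin_values(values, n_bins=5, scale="linear"):
    """Assign each value a bin index in 0..n_bins-1 using equal-width bins."""
    if scale == "log":
        if min(values) <= 0:
            raise ValueError("logarithmic binning needs positive values")
        transformed = [log(v) for v in values]
    elif scale == "linear":
        transformed = list(values)
    else:
        raise ValueError("scale must be 'linear' or 'log'")
    lowest, highest = min(transformed), max(transformed)
    if highest == lowest:
        return [0] * len(values)  # a constant ring occupies a single bin
    width = (highest - lowest) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((t - lowest) / width), n_bins - 1) for t in transformed]


def shannon_entropy(bins):
    """Entropy (in bits) of the distribution of nodes over bins."""
    counts = {}
    for b in bins:
        counts[b] = counts.get(b, 0) + 1
    n = len(bins)
    return -sum((c / n) * log2(c / n) for c in counts.values())


# Toy centrality arrays for four nodes.
spread_ring = bin_values([1.0, 2.0, 3.0, 4.0], n_bins=4)
constant_ring = bin_values([5.0, 5.0, 5.0, 5.0], n_bins=4)
log_ring = bin_values([1.0, 1000.0], n_bins=3, scale="log")

entropy_spread = shannon_entropy(spread_ring)
entropy_constant = shannon_entropy(constant_ring)
```

Under these assumptions, a ring whose values spread evenly over the bins has high entropy and would be drawn thicker than a ring whose values all fall in one bin.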
One can also use the same principles when fixing some centrality descriptor and letting the rings correspond to the layers in a network, the multilayer network, and an aggregated network. Such a plot might help to reveal, for instance, if the centrality of nodes in a multilayer network is primarily due to their centrality in a specific layer or if the aggregated network is a good proxy for the multilayer structure.

Supplementary Note 4: Example analyses of empirical multilayer networks
In this note, we present multilayer analyses of four biological systems to illustrate the power of muxViz. We examine the following examples:
• Xenopus laevis genetic-interaction network (see Supplementary Figures 2 and 3);
• Caenorhabditis elegans connectome (see Supplementary Figures 4 and 5);
• Herpes simplex genetic-interaction network (see Supplementary Figures 6 and 7);
• HIV-1 genetic-interaction network (see Supplementary Figures 8 and 9).
We include two figures for each example. In the first set of figures (see Supplementary Figures 2, 4, 6, and 8), we show the following information:
• Panel A: Multilayer community structure from modularity maximization [14]. The color of each node encodes its community assignment in a multilayer-network visualization. For comparison, we also show the results (and corresponding visualization) of community detection on an aggregated network, which we obtain by summing the corresponding intra-layer edge weights of all layers. (In other words, if $A_{ijs}$ gives the edge weight between nodes $i$ and $j$ on layer $s$, then we obtain an aggregated edge weight $W_{ij} = \sum_s A_{ijs}$ between nodes $i$ and $j$ by summing over $s$.)
• Panel B: Multilayer centrality [16]. We again use a multilayer-network visualization. We label the top five nodes from a ranking according to multilayer PageRank centrality. For comparison, we also show the results of centrality calculations on the aforementioned aggregated network.
• Panel C: Edge-colored multigraph visualization of the network. We color edges according to the layer to which they belong. We color the nodes according to their layer (or layers); if a node exists on multiple layers, then we distribute its corresponding colors evenly.
• Panel D: Dimensionality-reduction analysis and corresponding reducibility dendrogram [20]. We show the distance matrix and the corresponding dendrogram, which we obtain using Ward hierarchical clustering.
• Panel E: Measures of correlation between layers: (left) mean edge overlap, (center) degree-degree Pearson correlation coefficient, and (right) degree-degree Spearman correlation coefficient.
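The aggregation used for comparison in these panels (summing the intra-layer edge weights of all layers) can be sketched in a few lines. The code below is an illustration with hypothetical function names. It also computes node strength (the sum of the edge weights at a node) in each layer and in the aggregated network, which is one simple way to see that a ranking in the aggregate need not match the ranking in any single layer.

```python
def aggregate(layers):
    """Aggregated weights: entry (i, j) is the sum of layer weights at (i, j)."""
    n = len(layers[0])
    return [[sum(adj[i][j] for adj in layers) for j in range(n)]
            for i in range(n)]


def strength(adj):
    """Strength of each node: the sum of the weights of its edges."""
    return [sum(row) for row in adj]


# Toy multiplex network on three nodes with two weighted layers.
layer_one = [[0, 2, 0], [2, 0, 0], [0, 0, 0]]
layer_two = [[0, 0, 0], [0, 0, 3], [0, 3, 0]]

aggregated = aggregate([layer_one, layer_two])
strength_per_layer = [strength(layer_one), strength(layer_two)]
strength_aggregated = strength(aggregated)
```

In this toy example, node 1 has the largest strength in the aggregated network, even though it is tied for the largest strength in each individual layer.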
In the second set of figures (see Supplementary Figures 3, 5, 7, and 9), we show the annular visualization for the centrality descriptors:
• In panels titled "Multiplex", we consider the multilayer network. Each ring corresponds to a different centrality descriptor.
• In the other panels, we consider a specific centrality descriptor (which we specify in the title of the panel). Each ring encodes the values of that descriptor, which we calculate in each layer separately. We also include rings for the calculation of the corresponding centrality diagnostic in the multilayer network and in its aggregation to a single-layer weighted network.
We specify the order of the rings in the list of labels on the right of each plot. In each case (and as in Figure 1 in the main text), the top label refers to the innermost ring and the bottom label refers to the outermost ring.