Selective pressure on metabolic network structures as measured from the random blind-watchmaker network

A random null model termed the Blind Watchmaker network (BW) has been shown to reproduce the degree distribution found in metabolic networks. This might suggest that natural selection has had little influence on this particular network property. We here investigate to what extent other structural network properties have evolved under selective pressure from the corresponding ones of the random null model: The clustering coefficient and the assortativity measures are chosen and it is found that these measures for the metabolic network structure are close enough to the BW-network so as to fit inside its reachable random phase space. It is furthermore shown that the use of this null model indicates an evolutionary pressure towards low assortativity and that this pressure is stronger for larger networks. It is also shown that selecting for BW networks with low assortativity causes the BW degree distribution to slightly deviate from its power-law shape in the same way as the metabolic networks. This implies that an equilibrium model with fluctuating degree distribution is more suitable as a null model, when identifying selective pressures, than a randomized counterpart with fixed degree sequence, since the overall degree sequence itself can change under selective pressure on other global network properties.

adjust, because our null model is defined by a random process. Thus, we can obtain a network with a prescribed assortativity (or some other network structure measure) simply by selecting those networks from the ensemble generated by the random process. This is an appealing feature, since it does not require any ad hoc assumption about the time evolution, such as the preferential attachment scheme.

Results
The BW network is the null model for a network, of which one has only limited knowledge [16]: it is the most likely network structure for the given limited information. This network structure is obtained from variational calculus as the maximum entropy solution where the limited information enters as constraints [16]. The constraints for the BW network are, in general, the total number of nodes N and the total number of links M, together with the usual network constraints (no self-loops and no multiple links). In addition, it is assumed that there is neither an a priori preference as to which links are joined nor in which order. This lack of a priori preference defines the a priori randomness inherent in the BW network.
A convenient way of obtaining the variational solution is to devise a numerical algorithm that operates under the same randomness and constraints and hence converges to the same solution. However, it is important to realize that this numerical algorithm has nothing per se to do with any actual network evolution, but is just a numerical device for obtaining the correct network structure.
An algorithm for generating the BW networks is described in [16]. It encodes equal probability for a link-end to rewire to the same node as for another arbitrarily chosen link-end, as well as equal probability for the relative order in which two link-ends arrive at the same node. This can be implemented as follows. Start from a set of links attached to a set of nodes in an arbitrary way [16].
1. Pick two nodes, A and B, randomly, each with a probability proportional to the square of its degree, $p_i \propto k_i^2$.
2. Pick a random link on node A and move it to node B.
3. If the attempted move is forbidden by a constraint, choose another link from the same node A and repeat until a link has been moved or until all links have been proven to violate the constraint.
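The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of [16]: the name `bw_step` and the adjacency-set representation are hypothetical, "moving a link from A to B" is read as re-attaching A's end of the link to B, a draw with A = B is counted as a failed attempt, and only the three default constraints (no self-loops, no multiple links, no zero-degree nodes) are enforced.

```python
import random

def bw_step(adj, rng):
    """Attempt one BW move on an undirected simple graph.

    adj maps each node to the set of its neighbours (hypothetical layout).
    Returns True if a link was moved, False if every candidate was forbidden.
    """
    nodes = list(adj)
    weights = [len(adj[n]) ** 2 for n in nodes]      # p_i proportional to k_i^2
    a, b = rng.choices(nodes, weights=weights, k=2)  # step 1: pick A and B
    if a == b or len(adj[a]) < 2:
        # Assumption: A = B is a no-op, and A may not be left with degree zero.
        return False
    candidates = sorted(adj[a])
    rng.shuffle(candidates)                          # step 2: random link on A
    for c in candidates:                             # step 3: retry on violation
        if c == b or c in adj[b]:                    # self-loop or multiple link
            continue
        adj[a].remove(c)
        adj[c].remove(a)
        adj[b].add(c)
        adj[c].add(b)
        return True
    return False
```

Calling `bw_step` repeatedly on any valid starting network keeps the number of nodes and links fixed, so only the way the links are distributed over the nodes changes.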
The default constraints are (i) no self-loops, (ii) no multiple links between nodes and (iii) no zero-degree nodes. When repeated until a steady state has been reached, the algorithm creates a scale-free network with an exponent around 2. This is illustrated in figure 1(a), where the average result is shown for 107 BW networks together with 107 metabolic networks [18,19]. In the special case of metabolic networks, there is an additional, presumably chemical, constraint on the number of one-degree nodes (nodes with only one link). The BW network including this additional substance constraint is also shown in figure 1(b). In this figure, the expected values are shown (black curve), instead of a scatter plot, together with the curves corresponding to two standard deviations away from the average solution, illustrating the spread generated by the BW algorithm. The expected values and the corresponding spread were calculated from 1000 sets of networks, each containing 107 independent samples. In this comparison, the BW networks in figure 1(a) have the same average size and the same average number of links per node as the metabolic ensemble and, in figure 1(b), also the same average number of nodes with one link. The agreement in figure 1(b) is particularly striking.

In order to further quantify the network structure, we choose the clustering coefficient and the assortativity. Both measures are represented by a single real number and we can thus further characterize a network by a point in a clustering-assortativity space. The clustering coefficient (CC) is a measure of the number of triangles existing in a network, normalized by the possible number of triangles that could exist. For a node $i$ it is

$$C_i = \frac{2 N_\Delta}{k_i (k_i - 1)},$$

where $N_\Delta$ is the number of triangles (three nodes where everyone is connected to everyone) containing node $i$ and $k_i$ is the degree of node $i$. A total average CC can then be calculated as

$$C = \frac{1}{N_{k>1}} \sum_{i:\, k_i > 1} C_i,$$

where $N_{k>1}$ is the number of nodes with a degree larger than one.
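As a concrete illustration of the clustering coefficient, the sketch below computes the local value for each node and the average over nodes with degree larger than one. It assumes the standard local form $C_i = 2N_\Delta / (k_i(k_i - 1))$, with $N_\Delta$ the number of triangles containing node $i$; the function names and the adjacency-set layout are hypothetical.

```python
from itertools import combinations

def local_clustering(adj, i):
    """Local CC of node i, or None if its degree is below two.

    adj maps each node to the set of its neighbours (hypothetical layout).
    """
    k = len(adj[i])
    if k < 2:
        return None
    # A triangle at i is a pair of neighbours of i that are linked to each other.
    triangles = sum(1 for u, v in combinations(adj[i], 2) if v in adj[u])
    return 2.0 * triangles / (k * (k - 1))

def average_clustering(adj):
    """Average of the local CC over the nodes with degree larger than one."""
    values = [c for c in (local_clustering(adj, i) for i in adj) if c is not None]
    return sum(values) / len(values) if values else 0.0
```

For a triangle with one extra node hanging off a corner, the corner with the pendant link has $C_i = 1/3$, the other two corners have $C_i = 1$, and the pendant node is excluded, so the average is 7/9.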
The second measure, the assortativity, is based on the Pearson correlation coefficient, which ranges between the values −1 and 1. It is defined for networks as [21]

$$r = \frac{4\langle jk \rangle - \langle j + k \rangle^2}{2\langle j^2 + k^2 \rangle - \langle j + k \rangle^2},$$

where $\langle \cdots \rangle$ means an average over all links and $j$ and $k$ are the degrees of the nodes on either side of a link. $r = -1$ means perfect disassortative mixing (connected nodes have very different degrees) and $r = 1$ means perfect assortative mixing (connected nodes have the same degree).

The phase space of the BW model is investigated by generating many ($\sim 10^6$) networks and measuring their CC and assortativity. The $C$-$r$ space is then discretized and a two-dimensional histogram, representing the density of occurrences, is created for each interval in $C$ and $r$. The result is then plotted as contour lines corresponding to 30%, 3% and 0.3% of the maximum height, $H_{\max}$, of the histogram surface, and thus enclosing (as measured from the generated data) 66.7%, 96.4% and 99.6% of all the generated networks. The contours are chosen so as to correspond to a standard two-dimensional Gaussian contour plot for 1, 2 and 3 standard deviations away from the mean. Figure 2 shows the result for an ensemble of BW networks of the same size as the average metabolic network, with $N = 640$ and $\langle k \rangle = 5.35$. The points represent the metabolic networks (one point per network), which are included for comparison. The first impression of the figure is perhaps the clear correlation between the values of $C$ and $r$, which seems to be shared by the real networks. Low assortativity is coupled with high clustering. It is interesting to note that this result is opposite to what was found in [20], which considered a system with a fixed degree sequence. In that case, low assortativity was coupled with low clustering. Another interesting feature is that the BW model includes a wide range of $C$ and $r$ values. The overlap of the BW phase space and the metabolic networks suggests that these structural features are not completely random.
In addition, it shows that these nonrandom features can be obtained directly from the BW process by selection. This suggests a selective pressure towards lower assortativity and higher clustering for the metabolism of most organisms.
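The assortativity that places each network along the $r$ axis can be computed directly from a list of links. The sketch below assumes the symmetric form of the Pearson coefficient, $r = (4\langle jk \rangle - \langle j + k \rangle^2) / (2\langle j^2 + k^2 \rangle - \langle j + k \rangle^2)$, with averages taken over all links; the function name and the data layout are hypothetical.

```python
def assortativity(edges, degree):
    """Degree assortativity r of an undirected network.

    edges  : list of (u, v) pairs, one entry per link (hypothetical layout)
    degree : mapping from node to its degree
    """
    m = len(edges)
    mean_jk = sum(degree[u] * degree[v] for u, v in edges) / m
    mean_sum = sum(degree[u] + degree[v] for u, v in edges) / m
    mean_sq = sum(degree[u] ** 2 + degree[v] ** 2 for u, v in edges) / m
    denom = 2.0 * mean_sq - mean_sum ** 2
    if denom == 0:
        # All link-ends see the same degree (regular network): r is undefined.
        return float("nan")
    return (4.0 * mean_jk - mean_sum ** 2) / denom
```

Two small checks of the limiting behavior: a star with four leaves gives $r = -1$ (every link joins the hub to a leaf), and a path of four nodes gives $r = -0.5$.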
What happens with the degree distribution when selecting for lower assortativity? If the similarity between the degree distributions becomes dramatically worse, then the BW null model is in fact inconsistent with the metabolic networks. In order to investigate how the selection affects the BW degree distribution, we pick out realizations within a prescribed narrow assortativity interval. In figure 3, we show the average result of 62 BW networks with an assortativity in the range $-0.21 < r < -0.19$, with an average of $\langle r \rangle = -0.195$, together with the 62 metabolic networks that lie in the range $-0.21 < r < -0.18$ with approximately the same $\langle r \rangle$. Both the BW networks and the average of these metabolic networks have the size $N = 730$ and $\langle k \rangle = 5.5$. As seen from figures 3(a) and (b), the BW degree distribution changes slightly in the tail part. More accurately, figure 3(a) shows that compared to the original BW distribution, the selection for low $r$ creates a dip for intermediate degrees and a bump for high degrees. In fact, figure 3(b) shows that this change makes the BW distribution even more similar to that of the metabolic networks. This suggests that the selection for low assortativity is, in fact, in this case a selection of a degree distribution and not a selection within a given fixed degree distribution.
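The selection step amounts to filtering the generated ensemble by assortativity and averaging the degree distributions of the networks that survive. A minimal sketch, assuming each sample is stored as a pair of a precomputed $r$ value and a degree sequence (a hypothetical layout, as are the function names):

```python
from collections import Counter

def select_by_assortativity(samples, r_min, r_max):
    """Keep the samples whose assortativity lies strictly inside (r_min, r_max).

    samples is a list of (r, degree_sequence) pairs (hypothetical layout).
    """
    return [s for s in samples if r_min < s[0] < r_max]

def mean_degree_distribution(samples):
    """Average the normalized degree histograms P(k) over the given samples."""
    if not samples:
        return {}
    total = Counter()
    for _, degrees in samples:
        n = len(degrees)
        for k, count in Counter(degrees).items():
            total[k] += count / n        # P(k) of this single network
    return {k: total[k] / len(samples) for k in sorted(total)}
```

Because the filter acts on whole networks, the degree sequence is free to differ between the selected and the unselected ensembles, which is the effect examined in figure 3.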
This is further illustrated in figures 4(a) and (b), where the data are plotted in the same contour plot as in figure 2. Figure 4(a) shows the overlap between the BW phase space and 62 metabolic networks with low $r$ (circles). The triangles show the result of randomizing each of the 62 metabolic networks while keeping each node's individual degree [22], averaged over 100 independent randomizations. The small shift between the clouds of circles and triangles again indicates that much of the network structure (represented by the clustering and assortativity) is absorbed into the degree sequence. Figure 4(b) shows the same thing as in (a) but for 62 selected snapshots of the BW networks lying in the small $r$ range. These networks have also been randomized, keeping the degree sequence (triangles), in the same way as in (a), and the BW null model seems to behave very similarly to the metabolic networks. Furthermore, the CC for these selected BW networks is $C = 0.17$ compared to $C = 0.16$ for the corresponding 62 metabolic networks. This means that when selecting for a low assortativity, the BW networks automatically obtain a high clustering.
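The degree-preserving randomization used for the triangles in figure 4 can be sketched as repeated pairwise link swaps. The code below is a generic version of this idea, not necessarily the exact routine of [22]: the function name, the number of swaps and the cap on attempts are assumptions made for illustration.

```python
import random

def degree_preserving_randomize(edges, n_swaps, rng):
    """Return a rewired copy of `edges` in which every node keeps its degree.

    Two links (a, b) and (c, d) are replaced by (a, d) and (c, b) whenever
    this creates neither a self-loop nor a multiple link.
    """
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done = 0
    attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:   # assumed attempt cap
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:
            c, d = d, c                                  # pick one of the two pairings
        if a == d or c == b:                             # would create a self-loop
            continue
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue                                     # would create a multiple link
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges
```

Since every accepted swap removes one link-end from each of the four nodes and gives one back, the degree sequence is unchanged, so any shift in $C$ or $r$ after randomization reflects structure beyond the degrees.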
Metabolic networks of different organisms have very different sizes and average degrees. The number of nodes varies from about 200 for the smallest networks to almost 1200 for the largest, and the smallest average degree is about 4, whereas the largest is close to 6. This fact has not been taken into account in the analysis of figures 2-4. To study the effect on the structural measures, both for the BW null model and the real metabolic networks, we plot the assortativity as a function of $\langle k \rangle$ (figure 5(a)) and as a function of $N$ (figure 5(b)). We only investigate the size dependence of the assortativity, since this measure is highly correlated with the clustering coefficient and will thus give a representative behavior for both measures. Since the metabolic networks possess a wide spread in both $N$ and $\langle k \rangle$, a smaller sampling interval is used for one of them, while sampling a broader interval for the other. In figure 5(a), the assortativity is plotted as a function of the average degree for the BW networks with $N = 640$, together with the metabolic networks in the range $540 < N < 740$. Both the BW and the metabolic networks display a decrease in the assortativity when the average degree is increased, with almost a constant shift between them. This constant shift again signals a selective pressure towards lower assortativity, which is independent of the average number of links in the system. Figure 5(b), on the other hand, shows an increase in the assortativity for the BW networks as a function of the number of nodes, whereas for the metabolic networks it is basically independent of the size. Here, $\langle k \rangle = 5.4$ is used for the BW networks. In this case, the metabolic networks with an average degree in the range $5.2 < \langle k \rangle < 5.5$ show no increase with $N$. This means that the difference between the BW network and the metabolic network increases with $N$.
Thus, the results presented in figure 5(b) suggest that there is essentially very little selective pressure on the assortativity for small metabolic networks and that this pressure increases with the network size.

Discussion
In this paper, we investigate the clustering-assortativity phase space of the BW network model and compare it to the real data of 107 metabolic networks. We show that the structural network properties captured by these two network measures are directly reachable within the BW null model without any assumptions about the evolutionary path. It is also demonstrated that when selecting for the BW networks that possess the same structural properties as the real data, the resulting degree distribution is affected. Furthermore, the direction of this change appears to be towards increased similarity. This implies a coupling between the degree distribution and other structural properties such as assortativity and clustering. Thus, our analysis suggests that the small deviation between the degree distributions of the metabolic networks and the BW null model is caused by a selection towards lower assortativity.
We also found that the clustering and assortativity measures are correlated in the same way both for the BW model and for the real metabolic networks. It was noted that this correlation is the reverse of what was found by Holme and Zhao in [20] when mapping out the possible phase space for a given, fixed, degree sequence. Our results imply that the structural properties of a network can depend on the degree distribution in a crucial way, which limits the usefulness of drawing conclusions on structural interdependences from fixed degree distributions. In this respect, a random null model such as the BW model, where the degree distribution is allowed to fluctuate, gives a better starting point when trying to identify selection pressures. This point is further clarified by randomizing the metabolic networks using the Maslov-Sneppen routine [22], which preserves the individual degrees. This randomization changes the average assortativity by only 3.3%, which on its own would suggest that the metabolic networks are random and void of any significant selection for assortativity. However, our comparison to the BW networks gives a much larger difference (33%), because the selection really causes a change in the actual degree sequence. This implies that it is important to identify an adequate null model that does not have a fixed degree sequence when searching for selective pressure in networks. We suggest that in this respect the BW network is an appropriate null model for metabolic networks.
One should, however, note that a null model such as the BW network does not per se give any information on the precise metabolic evolution: it is just the best guess for the structure you can make provided you have no knowledge of the evolutionary process, except what is imposed by global constraints. From this point of view it is the deviation between the null model and the actual metabolic network that contains the most interesting information. The fact that this deviation is small suggests that whatever the explicit metabolic evolution path might have been, it has had surprisingly little influence on the global metabolic network structure. Our results also imply that a priori identification of additional relevant global constraints for the BW model is one possible way of gaining further understanding of the overall metabolic network structure.