Forbidden versus permitted interactions: Disentangling processes from patterns in ecological network analysis

Abstract Several studies have identified the tendency for species to share interacting partners as a key property for the functioning and stability of ecological networks. However, assessing this pattern has proved challenging in several regards, such as finding proper metrics to assess node overlap (sharing), and using robust null modeling to disentangle significance from randomness. Here, we bring attention to an additional, largely neglected challenge in assessing species’ tendency to share interacting partners. In particular, we discuss and illustrate with two different case studies how identifying the set of “permitted” interactions for a given species (i.e. interactions that are not impeded, e.g. by lack of functional trait compatibility) is paramount to understanding the ecological and co‐evolutionary processes underlying node overlap and segregation patterns.

regardless of specific hypotheses need to be provided. Despite Chen's claims, the criteria provided by SV are correct, and the 'errors' pointed out by Chen derive from his misinterpretation of some basic network properties. The two alternative network configurations presented by Chen in his Fig. 1A and Fig. 1B actually belong to two different networks. In fact, the degree of each node in Fig. 1A is equal to 1, while the degree of each node in Fig. 1B is 2, which doubles the total degree of the network (a self-loop always contributes 2 to the degree of its node; Gross & Yellen 2005, pp. 8-9).
Thus, in this specific case, the adjustment originally suggested by SV is necessary, i.e. the two nodes should be excluded from the computation of overlap probability, and n should be replaced by n-2 (SV, p. 909). Consequently, SV's equation gives the same probabilities as those given by Chen (despite his incorrect representation of the possible configurations of the two-node network). In particular, excluding the two nodes from the computation and adjusting n to n-2 gives (according to SV) p(k|n,d1,d2) = p(k|0,0,0), which is equal to 1 for k=0 and to 0 for k=1.
The situation is different for a directed network such as that in Chen's Fig. 2. SV clearly stated that Ɲij must be evaluated separately for incoming and outgoing links. This means that when considering outgoing links, one needs to account for alternative configurations of the network independently of how these would affect node in-degree. Thus, the probability of node sharing is evaluated by focusing on either the in-degree or the out-degree of nodes, one at a time: when focusing on incoming links, all the configurations that preserve the in-degree (but not necessarily the out-degree) should be considered, and vice versa when considering outgoing links.
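The degenerate case above can be checked numerically. The following is a minimal sketch that assumes the probability of sharing k neighbours takes the standard hypergeometric form (two nodes drawing d1 and d2 neighbours uniformly at random from n candidates); the function name `p_shared` is hypothetical and the expression may differ in form from the one printed in SV.

```python
from math import comb

def p_shared(k, n, d1, d2):
    """Probability that two nodes with degrees d1 and d2 share exactly k
    neighbours when neighbours are drawn at random among n candidates.
    Assumption: standard hypergeometric form, with d1 <= n and d2 <= n."""
    # k is only feasible between the minimum and maximum possible overlap.
    if k < max(0, d1 + d2 - n) or k > min(d1, d2):
        return 0.0
    return comb(d1, k) * comb(n - d1, d2 - k) / comb(n, d2)

# Two-node network after the n -> n-2 adjustment: p(k|0,0,0)
p_k0 = p_shared(0, 0, 0, 0)
p_k1 = p_shared(1, 0, 0, 0)
```

With these inputs the sketch returns 1 for k=0 and 0 for k=1, in line with the values stated in the text.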
Chen briefly mentions in his conclusions that this approach could be a simple solution to the fact that "the probability of observing k shared neighbors for a pair of species in directed networks with self-loops may be more complicated because of the involvement of directions. It should be related to indegree and outdegree of nodes". Indeed, the separate computation of Ɲ for incoming and for outgoing links described in SV is a key aspect of the method, not only from a computational perspective, but also from a conceptual one. It is intuitive that the overlap in plant species used by two pollinator species is fundamentally different from the overlap in pollinators servicing two plant species. But the same is also true for non-bipartite networks, due to the obvious differences in the processes represented by edge direction (e.g., consuming or being consumed in food webs, buying or selling in trade networks, departing or arriving in transport networks) (Newman & Park 2003).
Separating in- and out-degree, the possible configurations of the directed network in Chen's Fig. 2 are those we show here in Fig. S1, which makes the probability of 0.5 given by SV's formula correct.
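The value of 0.5 can be reproduced by brute-force enumeration. The sketch below assumes a two-node directed network in which each node has out-degree 1 and self-loops are allowed (our reading of the configurations discussed here, not a restatement of Chen's figure); the function name `overlap_distribution` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

def overlap_distribution(n, d1, d2):
    """Enumerate every way two nodes can pick d1 and d2 out-neighbours
    among n nodes (self-loops allowed, in-degree left unconstrained)
    and return the exact distribution of the number of shared neighbours."""
    nodes = range(n)
    counts = {}
    total = 0
    for a, b in product(combinations(nodes, d1), combinations(nodes, d2)):
        k = len(set(a) & set(b))          # shared out-neighbours
        counts[k] = counts.get(k, 0) + 1
        total += 1
    return {k: Fraction(c, total) for k, c in sorted(counts.items())}

# Assumed setting: n = 2 nodes, out-degree 1 for both nodes.
dist = overlap_distribution(2, 1, 1)
expected_overlap = sum(k * p for k, p in dist.items())
```

Under these assumptions, two of the four configurations share one node, so both the probability of k=1 and the average overlap equal 1/2.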
Completing the computation of Ɲ for the directed network of Fig. 2, we will have: [equation missing]. Since d1 + d2 − n = 0, then: [equation missing], and hence: [equation missing]. The same applies to Ɲ [subscripts missing]. Therefore, we have Ɲ = -1, which makes sense, given that the network is completely segregated. Note that Ɲ would have been equal to 1 for a perfectly nested network, such as one of the alternative configurations with one node shared, for example {(1,2); (2,2)}.
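The arithmetic behind this result can be sketched as follows. This is an illustration under stated assumptions, not a transcription of SV's expressions: we assume n = 2 and d1 = d2 = 1 for the network of Fig. 2, an observed overlap S = 0, an expected overlap E = 0.5 (Fig. S1), and a standardization that divides the difference between observed and expected overlap by the distance between the expectation and the minimum possible overlap, max(0, d1 + d2 − n), whenever the observed value falls below expectation. The symbols S, E and Ω are introduced here only for this illustration.

```latex
\begin{align*}
\max(0,\; d_1 + d_2 - n) &= \max(0,\; 1 + 1 - 2) = 0, \\
\Omega &= E - \max(0,\; d_1 + d_2 - n) = 0.5 - 0 = 0.5, \\
\frac{S - E}{\Omega} &= \frac{0 - 0.5}{0.5} = -1 .
\end{align*}
```

Under the same assumptions, a configuration sharing one node (S = 1) would be standardized by min(d1, d2) − E = 0.5, giving (1 − 0.5)/0.5 = 1.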
Interestingly, Chen applies the SV method correctly to bipartite networks of species distributed among sites as in his Fig. 3. For the overlap of sites across species, he obtains a probability value of overlap identical to that resulting from SV's approach. However, for species-site networks, focusing only on species constitutes just half of the computation of network structure. Chen seems to consider the overlap in species per site as irrelevant.
As an example to verify the correctness of his equations, Chen computes the probability of sharing 0 or 1 nodes for two nodes having degree 1 and 2 in a network having 4 nodes in total (which he identifies as n). The example, however, is a singular choice, since it produces the same results as the 'incorrect' SV procedure.
The analyses of simulated networks provided by Chen suggest that there could be other serious issues in his implementation of the SV procedure. This is particularly evident in the analyses conducted on 'perfectly segregated' food webs, given that Ɲ should be equal to -1 for all of them, while Chen shows this metric varying from -1 to 0 in his analyses (the same applies to Chen's analysis of perfectly nested food webs).
Chen further claims (incorrectly) that his equation and that of SV give very different quantifications of node overlap in nested networks. Again, his error is possibly due to the same implementation problems we highlighted above. To demonstrate the basic equivalence of Chen's Equation 6 and the SV method, we have replicated the analysis on the same set of food webs used by SV (Cohen 2010), using both the SV method and Chen's method (Python code for both SV's and Chen's implementation of Ɲ is provided as Supporting Information). The two approaches produce very similar results (Fig. S2A; R² = 0.97; intercept = 0.04; coefficient = 0.95). We also found that the (always small) discrepancies between the two methods tend to decrease further with increasing network size (both in terms of nodes and edges) (Fig. S3). Thus, if we remove from the analyses very simple and unrealistic networks (i.e., those having fewer than 30 edges), we find that the two methods produce virtually indistinguishable results (Fig. S2B; R² = 0.999; intercept = 0.02; coefficient = 0.97). We have also compared the two methods on 1000 random (Erdős-Rényi) networks, finding, again, no appreciable difference (R² = 0.99; intercept = -0.004; coefficient = 0.98). The lack of a difference occurs because Equation 6 of Chen is unnecessarily complex; it gives the same probability as the SV probability equation except when d1, d2, k, and n are all very small (e.g., < 10) and thus represent an unrealistically (or uninterestingly) simple network. We also point out that Chen's Equation 6 cannot be solved when k = d1 or k = d2 because this would involve negative numbers in some of the combinatorial terms.
Finally, Chen's analysis of the effect of generalist species on the discrepancy between the SV method and Chen's method raises serious doubts. Leaving aside the lack of ecological realism in generating food webs with 50 nodes (we think Chen refers to this measure when referring to size) and a connectance of 'around' 0.9 (i.e. almost fully connected graphs), we wonder how it is possible to create networks with fixed connectance and size while varying the number of generalists (which Chen defines as nodes connected to "all the remaining species/nodes" in the network).
The SV method, although providing a summary statistic for a given network, relies on and summarizes pairwise comparisons between species and/or sites, as in the case of the popular NODF nestedness metric (Almeida-Neto et al. 2008). This gives it considerable freedom to test complex null hypotheses by identifying permitted and forbidden species interaction links according to different ecological criteria and on a per-species basis. Besides technical issues such as those we highlighted here, the identification of a 'universal' rule controlling all pair comparisons (such as the n adjustment proposed by Chen) goes against this flexibility, which is where the true potential of our method lies.
Fig. S1. Different configurations of the directed network in Chen's Fig. 2 not altering, respectively, out-degree (A) and in-degree (B). k indicates the number of shared nodes. Note that the average node overlap is equal to 0.5 in both cases.
Fig. S3. The discrepancy (measured as difference) between SV's and Chen's implementations of Ɲ rapidly approaches 0 as network size increases.
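The agreement statistics quoted above (R², intercept, coefficient) are those of a simple linear regression of one set of Ɲ values on the other. The following minimal sketch shows how such a comparison can be computed; the function name `ols_fit` is hypothetical, and the input values are placeholders for illustration only, not the food-web results.

```python
def ols_fit(x, y):
    """Ordinary least-squares fit of y on x.
    Returns (coefficient, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    coefficient = sxy / sxx
    intercept = mean_y - coefficient * mean_x
    r_squared = sxy ** 2 / (sxx * syy)   # squared Pearson correlation
    return coefficient, intercept, r_squared

# Placeholder values standing in for per-network scores from two methods.
method_a = [-1.0, 0.0, 1.0]
method_b = [2.0 * v + 1.0 for v in method_a]
coefficient, intercept, r_squared = ols_fit(method_a, method_b)
```

A coefficient close to 1, an intercept close to 0 and an R² close to 1 would indicate that the two sets of values are nearly interchangeable.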