Conditional Random Matrix Ensembles and the Stability of Dynamical Systems

There has been a long-standing and at times fractious debate about whether large, complex systems can be stable. In ecology, the so-called `diversity-stability debate' arose because mathematical analyses of ecosystem stability were either specific to a particular model (leading to results that were not general), or chosen for mathematical convenience, yielding results unlikely to be meaningful for any realistic system of interest. May's work, and its subsequent elaborations, relied upon results from random matrix theory, particularly the circular law and its extensions, which apply only when the strengths of interactions between entities in the system are assumed to be independent and identically distributed (i.i.d.). Other studies have optimistically generalised from the analysis of very specific systems, in a way that does not hold up to closer scrutiny. We show that this debate can be put to rest once these two contrasting views are reconciled, which is possible within the statistical framework developed here. We use a range of illustrative examples of dynamical systems to demonstrate that (i) stability probability cannot be summarily deduced from any single property of the system (e.g. its diversity), and (ii) our assessment of stability depends on adequately capturing the details of the systems analysed. Failing to condition on the structure of dynamical systems will skew our analysis and can, even for very small systems, result in an unnecessarily pessimistic diagnosis of their stability.


1. Introduction
The notion of stability of stationary solutions is central to the study of dynamical systems. For many applied questions, it is pivotal to know whether the solution will return to the stationary values/orbits upon perturbation [1][2][3][4]. Examples include, but are not limited to: ecology and population dynamics [5,6]; (celestial) mechanics; different areas of engineering; banking systems [7]; communication networks [4]; and neuroscience [8]. Different studies have shown that very diverse complex dynamical systems can be modelled using similar mathematical tools [9,10]. Formal aspects of stability have been studied extensively in the analysis of such complex systems, as well as in applied mathematics.
For linear time-invariant systems, the Routh-Hurwitz criterion sets out the conditions for global stability. More generally, the local stability of an equilibrium state of a non-linear ordinary differential equation (ODE) system can be assessed by inspecting the eigenvalues of the system's Jacobian matrix evaluated at the equilibrium [11]. If the real parts of the eigenvalues are all negative, then the equilibrium is (locally) stable. For nonlinear systems, a leading eigenvalue with negative real part implies only local stability. Given our interest in typically non-linear systems, we here consider the spectra of the Jacobians (or, to be precise, the sign of the real part of the leading eigenvalue) directly, but keep in mind that the basin of stability may be finite, and potentially confined to a small region.
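For a three-variable system, this check can even be carried out without computing the eigenvalues themselves: writing the characteristic polynomial of the Jacobian as λ³ + c₂λ² + c₁λ + c₀, the Routh-Hurwitz conditions state that all roots have negative real part if and only if c₂ > 0, c₀ > 0 and c₂c₁ > c₀. The sketch below (our own helper functions, not code from this paper) implements this test for a 3 × 3 Jacobian:

```python
def charpoly_coeffs_3x3(J):
    """Coefficients (c2, c1, c0) of det(lam*I - J) = lam^3 + c2*lam^2 + c1*lam + c0."""
    (a, b, c), (d, e, f), (g, h, i) = J
    trace = a + e + i
    # Sum of the three principal 2x2 minors.
    minors = (a * e - b * d) + (a * i - c * g) + (e * i - f * h)
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return -trace, minors, -det

def is_locally_stable_3x3(J):
    """Routh-Hurwitz: every eigenvalue lies in the left half-plane iff
    c2 > 0, c0 > 0 and c2*c1 > c0."""
    c2, c1, c0 = charpoly_coeffs_3x3(J)
    return c2 > 0 and c0 > 0 and c2 * c1 > c0

# Example: a diagonal Jacobian with negative entries is (of course) stable.
print(is_locally_stable_3x3([[-1, 0, 0], [0, -2, 0], [0, 0, -3]]))  # True
```

For larger systems one would instead compute the leading eigenvalue numerically, as in the Monte Carlo analyses described below.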
In ecology, where the analysis has perhaps been particularly divisive, early studies suggested that complexity (or ecological diversity) was key to stability [12,13]. First Gardner and Ashby [14] (who considered the problem from a wider perspective, with implications for examples as diverse as airport traffic and the human brain), and then May [5,15], investigated the way in which stability changes as complexity increases. In these examples, complexity was defined in terms of the number of state variables (e.g., in the case May considers, the number of species) and the probability of an interaction between two variables.
In order to generalise the analysis of the relationship between complexity and system stability, May avoided focusing on specific examples and instead considered ensembles of Jacobians defined in terms of matrix probability distributions. For suitably defined random matrix ensembles (RME) [16][17][18] he showed that sufficiently large or complex systems have a probability of stability close to zero. Subsequent studies considered different RMEs designed to reflect a variety of features found in real systems, and have drawn different or more nuanced conclusions regarding stability [19][20][21]. Other authors, especially from the ecological sphere, have pointed out the lack of realism of this approach [22] and estimated stability probabilities for specific ODE systems, either through experiments [23,24] or by sampling values for the system's parameters and-for each sample-identifying the equilibrium points (EPs) and determining their stability [25][26][27][28][29][30][31][32]. By repeatedly sampling in this way, Monte Carlo estimates of stability probabilities may be obtained. The advantage of such Monte Carlo approaches is that it is possible to condition on properties such as the feasibility of EPs (i.e. whether or not they are physically meaningful), which can again yield different conclusions regarding stability [25]. These approaches also define RMEs, but do so implicitly and with reference to specific ODE models. Given the variety of conclusions that have been drawn by different authors, the choice of RME is clearly crucial in determining the stability probability [4].
Random matrices also have, of course, a distinguished track-record in different branches of physics. Following Wigner's early work on the fluctuations in the eigenvalue spectra of Hamiltonians describing atomic nuclei [33], they have found use in the analysis of fluctuations in a whole range of application areas: solid state physics, chemical reactions and transition state theory, and quantum chaos [34] are only some of the areas where they have come into use, in particular in the guise of the Gaussian Orthogonal Ensemble and its generalisations. RMEs have also been found useful in pure mathematics [35,36]: for example, the spectral properties of the Gaussian Unitary Ensemble capture the statistical properties of the zeros of Riemann's zeta function; more recently they have also been employed in cryptography.
In all these applications RMEs are used to describe fluctuations that are believed to be separable from the secular dynamics of the underlying system. Here our use is subtly different. Instead of considering RMEs as general descriptors of some system (which has also been the strategy of May and, perhaps to a lesser extent, his followers), we condition the RME on the properties of real systems that determine whether or not their stationary states are stable. This then allows us to calculate the probability that a system becomes unstable upon a small but finite perturbation. So, rather than making general statements about stability, our RMEs, which we refer to as conditional RMEs, are explicitly geared towards use in specific contexts. While the success of traditional RMEs in capturing universal dynamics rests on assumed symmetries and homogeneity in the matrix entries, the stability analysis of specific real-world systems requires our conditional RMEs to exhibit the same heterogeneities that characterise real-world (i.e. problem-derived) Jacobian matrices. We will show below that this is necessary in order to understand when and why a large dynamical system can be stable, but also that such a fully conditioned RME should not be used to draw general conclusions, as any rule derived from this system-specific approach can only describe the behaviour of that system, or of systems with similar dynamics.

2. Stability and random matrix ensembles
For any particular parametric ODE model, the Jacobian matrix will usually exhibit structure and dependency between its entries, and will typically be a function of the model parameters and the state variables. The present work addresses the question of how assessments of stability change when the structure and dependency present in the Jacobian are properly taken into account.
For example, for the Lorenz system of ODEs (see supplementary information for details), the Jacobian is given by

J = [ −σ , σ , 0 ; ρ − z , −1 , −x ; y , x , −β ].
As a consequence of this structure and dependency, and regardless of how we choose the parameters of the system, only a particular family of n × n real matrices will be obtainable as Jacobians of the system. For example, no matter what values we take for the parameters of the Lorenz system, the (1, 3)-entry of the Jacobian matrix will always be zero, and the (1, 1)-entry will always be equal to the negative of the (1, 2)-entry. It follows that if we were interested in assessing the stability probability of one of the Lorenz system's EPs, it would be inappropriate to calculate the stability probability, P_h(stable), using, for example, a matrix probability density function, h, that associates non-zero density with matrices for which the (1, 3)-entry is non-zero or (1, 1) ≠ −(1, 2). Nevertheless, many previous analyses have failed to account for the structure and dependency present in realistic Jacobian matrices (i.e. ones derived from models of real systems), instead restricting attention to matrix probability density functions that yield analytically tractable results and assuming that the results so obtained were general.
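These structural constraints are easy to verify programmatically. The sketch below writes out the Jacobian of the standard Lorenz equations (dx/dt = σ(y − x), dy/dt = x(ρ − z) − y, dz/dt = xy − βz; the supplementary details are not reproduced here) and checks that, whatever the parameter and state values, the (1, 3)-entry is zero and the (1, 1)-entry equals the negative of the (1, 2)-entry:

```python
import random

def lorenz_jacobian(x, y, z, sigma, rho, beta):
    """Jacobian of (sigma*(y - x), x*(rho - z) - y, x*y - beta*z)."""
    return [[-sigma,  sigma,   0.0],
            [rho - z, -1.0,   -x  ],
            [y,        x,    -beta]]

random.seed(0)
for _ in range(1000):
    # Arbitrary states and parameters: the structural constraints hold regardless.
    x, y, z, s, r, b = (random.uniform(-50, 50) for _ in range(6))
    J = lorenz_jacobian(x, y, z, s, r, b)
    assert J[0][2] == 0.0 and J[0][0] == -J[0][1]
```

No choice of (σ, ρ, β) or of the state (x, y, z) can escape these constraints, which is exactly why an unconditioned h misrepresents the system.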

2.1. An illustration
To further illustrate the implications of neglecting Jacobian structure, we consider a number of examples from a family of two-variable ODE systems whose Jacobians have the form

J = [ −1 , a ; b , −1 ].

In this case, the space of all such matrices can be straightforwardly represented as a 2-dimensional Cartesian coordinate plane, in which the abscissa describes the value taken by a and the ordinate the value taken by b (as in figure 1, which shows the regions of the plane corresponding to the Jacobians obtainable for the ODE systems and equilibrium points considered in examples 1-3, together with contours of the random matrix distribution traditionally considered in the literature when assessing stability probabilities). The example systems are parametrised by constants θ1 and θ2, both non-zero, and their Jacobians, evaluated at the EPs that exist for all permitted parameter values, again take the above form. The eigenvalues of J are the solutions of the characteristic equation, and are given by

λ = −1 ± √(ab),

so that the stable region of the (a, b) plane is the set of points with ab < 1, as illustrated in figure 1. We also show the regions representing the Jacobians evaluated at the EPs for the systems considered in examples 1-3. Wherever these regions intersect the stable region, the corresponding EP(s) will be stable. The probability of a particular system being stable around a given EP, x0, is therefore equivalent to the probability of the relevant Jacobian evaluated at x0 falling within one of these intersections. We consider how this probability should be defined in the next section.
However, it is clear that if we ignore the existence of these regions when defining h, the matrix probability density function, and instead choose h in an arbitrary manner for the sake of analytical tractability (as illustrated by the contour lines in figure 1), then the resulting value we obtain for the 'stability probability' will be similarly arbitrary, and hence have little meaning or validity for any specific model.
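This arbitrariness is easy to see numerically. For the Jacobian [ −1 , a ; b , −1 ], stability is exactly the condition ab < 1, so the 'stability probability' is simply the mass that the chosen h assigns to that region, and different choices of h give different answers. A minimal sketch (our own illustration; the distributions below are not those used in figure 1):

```python
import cmath
import random

def leading_real_part(a, b):
    """Eigenvalues of [[-1, a], [b, -1]] are -1 +/- sqrt(a*b)."""
    s = cmath.sqrt(complex(a * b))
    return max((-1 + s).real, (-1 - s).real)

random.seed(1)
N = 50_000
estimates = {}
for width in (0.5, 2.0, 4.0):
    # h = independent N(0, width^2) densities for a and b; a wider h places
    # more probability mass outside the stable region ab < 1.
    estimates[width] = sum(
        leading_real_part(random.gauss(0, width), random.gauss(0, width)) < 0
        for _ in range(N)) / N
print(estimates)  # the estimated 'stability probability' falls as h widens
```

Each choice of h yields a different number, none of which says anything about a specific system unless h was derived from that system.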

2.2. Formal description
For any system of n ODEs, the Jacobian matrix of the system evaluated at a particular EP, x0, will be an element, J, of the set of n × n real matrices, M(n). The EP x0 is locally stable if all of the eigenvalues of J have negative real part. An equivalent criterion is that the real part of the leading eigenvalue (i.e. the one having maximal real part) is negative.
We first consider the set of all n × n real matrices, M(n). The eigenvalues of a matrix J ∈ M(n) are the solutions, λ, of det(J − λ I_n) = 0, where I_n denotes the n × n identity matrix [11]. We define Λ_S ⊂ M(n) to be the set of n × n matrices whose eigenvalues all have negative real part, and refer to this as the stable region of M(n). The choice of a particular RME specifies a probability density function, h, on M(n). The stability probability associated with h is then

P_h(stable) = ∫_{Λ_S} h(J) dJ,    (2)

i.e. it is the total probability mass that falls within the stable region. Crucially, the stability probability is determined by two factors: Λ_S and h. Λ_S is not random: for a given n it is a well-defined region of M(n). For systems of practical interest, however, this region cannot be determined analytically, and the integral needs instead to be evaluated computationally, using e.g. Monte Carlo techniques (which are outlined below in section 2.3). The results of stability analyses will therefore be completely determined by the choice of h, and how it distributes probability mass over the stable and unstable regions (see figure 2). If h is defined through the specification of an ODE model and a distribution for its parameters, then only matrices that can occur as Jacobians for that particular system will have non-zero density. Similarly, if h is defined without reference to a real system, then only some of the matrices in M(n) will be associated with non-zero density. However, for any given real ODE system, these matrices might not be obtainable as Jacobians, and, conversely, not all matrices obtainable as Jacobians will necessarily be associated with non-zero density by such a defined h, therefore limiting its relevance.
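Equation (2) can be approximated for any h from which we can sample, by drawing matrices and recording the fraction that land in Λ_S. The sketch below does this for 2 × 2 matrices, where Λ_S is characterised exactly by trace < 0 and determinant > 0, and shows how two different (purely illustrative, not taken from the paper) choices of h give quite different stability probabilities:

```python
import random

def in_stable_region_2x2(a, b, c, d):
    # For n = 2, Routh-Hurwitz reduces to: trace < 0 and determinant > 0.
    return a + d < 0 and a * d - b * c > 0

def stability_probability(sample_entries, n_draws=50_000):
    """Monte Carlo approximation of the mass that h places in the stable region."""
    return sum(in_stable_region_2x2(*sample_entries()) for _ in range(n_draws)) / n_draws

rng = random.Random(0)
h_centred = lambda: [rng.gauss(0, 1) for _ in range(4)]    # i.i.d. N(0, 1) entries
h_damped = lambda: [rng.gauss(-2, 1), rng.gauss(0, 1),
                    rng.gauss(0, 1), rng.gauss(-2, 1)]     # negative-mean diagonal
p_centred = stability_probability(h_centred)
p_damped = stability_probability(h_damped)
print(p_centred, p_damped)  # the second h concentrates far more mass in the stable region
```

The stable region is the same in both cases; only the placement of probability mass by h differs, and with it the 'stability probability'.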
An appropriate choice of h is thus vital. In particular, choosing h for the sake of mathematical convenience can only provide limited insight if doing so comes at the cost of sacrificing realism. The so-called 'diversity-stability debate' [6] arose because general conclusions about stability were drawn from RMEs that were either specific to a particular system [25][26][27][28][29][30][31][32] or chosen for mathematical convenience [19,20] (e.g. to invoke the circular law [5,15,21]), yielding results unlikely to be meaningful for any realistic system of interest [37].
Here we show that the dichotomy resulting from the use of different RMEs can be overcome by constructing RMEs that are appropriate for, and conditioned on, the properties of Jacobian matrices of real systems.

2.3. Monte Carlo estimates of stability probability
We consider autonomous ODE systems of the form

dx/dt = f(x(t); θ),

where x(t) is the vector of state variables at time t, θ is the vector of parameters, and f specifies the dynamics. By definition, an EP, x*_θ, of the system has the property f(x*_θ; θ) = 0. We include the subscript θ in our notation for the EP to emphasise that its location, existence, and stability will generally depend upon the particular values taken by the parameters. We denote by J*_θ the Jacobian matrix of f evaluated at x*_θ. We can induce an ensemble of Jacobian matrices by specifying a distribution for the parameters θ, drawing from it a collection of N parameter vectors, θ^(1), …, θ^(N), and evaluating the corresponding Jacobians, J*_{θ^(1)}, …, J*_{θ^(N)}. For any such RME, we may calculate a Monte Carlo estimate of the probability of stability simply as the proportion of matrices that are stable, i.e. for which the leading eigenvalue has negative real part. That is, we obtain an estimate of the stability probability as

P̂(stable) = (1/N) Σ_{k=1}^{N} 1( Re λ_max(J*_{θ^(k)}) < 0 ),

where 1(X) is the indicator function, which is 1 if X is true and 0 otherwise.
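As a concrete instance of this estimator, consider the EP at the origin of the Lorenz system, which exists for all parameter values and (for σ, β > 0) is stable precisely when ρ < 1. Sampling θ = (σ, ρ, β) from illustrative uniform ranges (ours, not the ranges used in the supplementary information) and counting the fraction of Jacobians whose leading eigenvalue has negative real part recovers, as it should, the probability that ρ < 1:

```python
import cmath
import random

def origin_leading_eig(sigma, rho, beta):
    """Real part of the leading eigenvalue of the Lorenz Jacobian at the
    origin, [[-sigma, sigma, 0], [rho, -1, 0], [0, 0, -beta]]; its spectrum
    is that of the upper-left 2x2 block together with -beta."""
    tr, det = -(sigma + 1.0), sigma * (1.0 - rho)
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    lam = max(((tr + disc) / 2).real, ((tr - disc) / 2).real)
    return max(lam, -beta)

random.seed(42)
N = 20_000
stable = sum(
    origin_leading_eig(random.uniform(1, 20), random.uniform(0, 2), random.uniform(1, 4)) < 0
    for _ in range(N))
p_hat = stable / N
print(p_hat)  # ~0.5: the origin is stable iff rho < 1, and here rho ~ U(0, 2)
```

The estimator requires nothing beyond the ability to sample parameters and evaluate the Jacobian, which is why it extends directly to models whose stable region cannot be characterised analytically.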

2.4. Conditional random matrix ensembles
The estimated stability probability defined in the previous section is the probability of a system being stable, conditional on a given system architecture; as discussed in section 2 and illustrated in section 2.1, such architectures do not arise without a concrete context. However, the conditions under which the circular law is believed to hold lack this connection to reality, at least for mesoscopic systems. To study the effects of this context, which is encapsulated by the statistical properties of, and dependencies among, the entries in the Jacobian matrices J*_{θ^(1)}, …, J*_{θ^(N)}, we consider two further random matrix distributions, constructed by permutation of the entries of our original RME. First, we form a new matrix ensemble, K*, in which the dependency between entries is broken.
Each entry of a K* matrix is obtained by setting K*_{ij} = (J*_{θ^(q)})_{ij}, with q drawn (independently for each (i, j)) uniformly at random from {1, …, N}. In this way, the marginal distribution of the ij-entries across the ensemble of K* matrices is the same as the marginal distribution of ij-entries across the ensemble of J*_θ matrices. Maintaining the marginal distributions ensures that the dependency between entries is the only quantity that we are altering: in particular, the location of zeros in the matrix and the magnitudes of interaction strengths are maintained. We construct a further RME by setting L*_{ij} = (J*_{θ^(q)})_{rs}, with q drawn uniformly at random from {1, …, N}, and r and s (independently) drawn uniformly at random from {1, …, p}, where p is the dimension of the Jacobian. Now, the location of zeros in the matrix is no longer fixed, although the probability of an entry being zero is the same for the L* matrices as for the J*_θ's and K*'s. Moreover, each entry of the L* matrices is i.i.d. We henceforth refer to the J*_θ matrices as the fully conditioned system (FCS) ensemble (most structure); the K* matrices as the independent ensemble (intermediate structure); and the L* matrices as the i.i.d. ensemble (least structure). We illustrate the properties of these three RMEs, and the methods for their construction, in figure 3.
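The two permutation schemes can be sketched as follows (a minimal illustration of the construction; the function names are ours):

```python
import random

def sample_independent(jacobians, rng):
    """K*-type matrix: entry (i, j) is copied from the (i, j) position of a
    Jacobian chosen uniformly at random, independently for each position.
    Per-position marginals are preserved; dependencies between entries break."""
    n = len(jacobians[0])
    return [[rng.choice(jacobians)[i][j] for j in range(n)] for i in range(n)]

def sample_iid(jacobians, rng):
    """L*-type matrix: every entry is drawn from the pooled collection of all
    entries of all Jacobians, so even the zero pattern is no longer fixed."""
    pool = [v for J in jacobians for row in J for v in row]
    n = len(jacobians[0])
    return [[rng.choice(pool) for _ in range(n)] for _ in range(n)]
```

Note that sample_independent preserves a structural zero at position (i, j) exactly, because every matrix in the ensemble is zero there, while sample_iid preserves only the overall frequency of zeros.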
To further our investigation we defined four more RMEs (which are presented in more detail in the supplementary information). The first will be referred to as the independent normal ensemble. It is constructed as follows: for each (i, j), we fit an independent normal distribution to the ij-entries of the sampled Jacobians, and draw each entry of a new matrix, M_ind, from the corresponding fitted distribution. By construction, the mean and standard deviation of the ij-entries across the ensemble of M_ind matrices are the same as the mean and standard deviation of the ij-entries across the ensemble of J*_θ matrices (the FCS ensemble) and across the ensemble of K* matrices (the independent ensemble). A further ensemble is given by the independent Pearson ensemble. As in the independent normal case defined above, this RME is defined by fitting a distribution to the ij-entries of the sampled Jacobians, except that rather than using a normal distribution and capturing just the mean and standard deviation, we also capture the skewness and kurtosis of the ij-entries of the J*_θ matrices. This RME thus shares many of the properties of the marginal distributions of ij-entries across the ensemble of J*_θ matrices, but does not capture the dependencies between them.
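The independent normal construction, for example, can be sketched as follows (again our own minimal implementation, not the paper's code):

```python
import random
import statistics

def fit_independent_normal(jacobians):
    """Per-position fit: (mean, sd) of the ij-entries across the ensemble."""
    n = len(jacobians[0])
    params = []
    for i in range(n):
        row = []
        for j in range(n):
            vals = [J[i][j] for J in jacobians]
            row.append((statistics.mean(vals), statistics.pstdev(vals)))
        params.append(row)
    return params

def sample_independent_normal(params, rng):
    """Each entry drawn independently from its fitted N(mu_ij, sigma_ij)."""
    return [[rng.gauss(mu, sd) for (mu, sd) in row] for row in params]
```

A structural zero survives here too: a position that is zero in every sampled Jacobian is fitted with mean 0 and standard deviation 0, so every sampled entry is exactly 0. What is lost, relative to the FCS ensemble, is any dependence between entries and any non-Gaussian shape of the marginals (which the independent Pearson ensemble partially restores).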
The third additional RME will be referred to as the i.i.d. normal ensemble. This time we do not fit a distribution to the ij-entries of the J*_θ matrices; instead, we fit a normal distribution, using the same technique that we used for the independent normal ensemble defined above, to the ij-entries of the L* matrices (i.e. those from the i.i.d. ensemble).
Finally, we construct an RME that attempts to capture some of the dependencies between the entries of the J*_θ matrices, by fitting a multivariate normal distribution to their entries. We will call this new ensemble the multivariate normal ensemble.
These new ensembles allow us to control which aspects of the structure of the FCS give it its stability properties. For instance, comparing the independent normal ensemble, the independent Pearson ensemble, and the independent ensemble shows the impact of the different moments of the marginal distributions. The multivariate normal and FCS ensembles can be compared for the same purpose where dependencies are concerned. More detail about the different RMEs is provided in the supplementary information.

3. Results
3.1. RME choice determines stability assessment
Stability, as stressed above and in the literature, is an issue in a wide variety of domains, and we therefore consider a set of systems that cover different qualitative behaviours of dynamical systems. The four ODE models we consider have in common that their EPs and Jacobians can be identified analytically, which makes analysis straightforward; they are: (i) the Lorenz system [38]; (ii) a model of the cell cycle due to Tyson [39]; (iii) a model of viral dynamics due to Nowak and Bangham [40]; and (iv) an SEIR (susceptible-exposed-infective-recovered) population dynamics model [41]. For brevity, we will refer to these four example models as: (i) 'Lorenz'; (ii) 'Tyson'; (iii) 'N & B'; and (iv) 'SEIR'. In each case, we present results for physically or biologically feasible EPs and generate 100 000 matrices from our RMEs in order to obtain Monte Carlo estimates of stability probabilities. Full details of these models and their corresponding RMEs are provided in the supplementary information. Figure 4(A) shows the eigenspectra for the FCS, independent, and i.i.d. RME regimes. While the i.i.d. eigenspectra are broadly circular, we observe diverse and decidedly non-circular shapes for the other two cases, highlighting the limitations of previous analyses based upon the circular law. Figure 4(B) shows the density of eigenvalues in the complex plane for the different models and RMEs. The eigenspectra are typically much less dispersed for the FCS ensemble than for the other two. As shown in figure 4(C), this also leads to systematic differences in the real parts of the leading eigenvalues, which determine stability. Table 1 shows that, whichever model is considered, the stability probability of the i.i.d. ensemble is always very different from the stability probability of the FCS.
The RMEs that include more of the structure of the system in their construction yield stability probabilities closer to those of the FCS. In most cases, the independent Pearson ensemble had a stability probability closer to that of the FCS than the independent normal ensemble, showing that capturing more moments when building the RME improves the estimate of the stability of the system. In summary, Monte Carlo estimates of the stability probabilities decrease as we decrease the amount of realism captured by the RME. Thus, failing to condition on the real-world heterogeneity and dependency present in the Jacobian can result in an unnecessarily pessimistic assessment of stability, even for these small systems. Considering RMEs in which we tightly control the mean, standard deviation, skewness, and kurtosis of the marginal distributions of Jacobian entries demonstrates that all of these properties have an impact on stability.

3.2. Large dynamical systems can be stable
To illustrate the effects of inadequately capturing model structure and parameter dependencies on the stability probabilities of larger systems, we consider extensions to the SEIR model (figure 5(A)). We allow for multiple subpopulations of exposed (E) individuals (in the supplementary information we also investigate extensions with heterogeneous infective subpopulations), enabling us to control system size. We again consider the three RMEs described above (see supplementary information for full details). As we increase the size of the system, the probability of stability remains 1 in the FCS case, but rapidly diminishes in the i.i.d. case (figure 5(B)). The independent case is intermediate, indicating that not only can the dependence between matrix entries be important, but also their heterogeneity. Heterogeneity changes the location of the centre of the matrix p.d.f., h, and also how it stretches in different directions, which modifies the proportion of probability mass falling in the stable region, Λ_S ⊂ M(n). Figure 5(C) shows how summaries of the distributions of leading eigenvalues change as we increase the number of exposed populations, with figure 5(D) providing corresponding density estimates for a selection of these numbers. In the i.i.d. case, the distributions and median values shift away from stable negative values toward unstable positive values. This is in stark contrast to the FCS case where, regardless of the number of exposed populations, the distribution of leading eigenvalues only has support on the negative real line (and hence we always have stability probability 1). Moreover, as we increase the number of exposed populations, the median value of the leading eigenvalue tends to become more negative. The independent case is again intermediate, with the median value staying relatively constant. Figures S4 and S5 in the supplementary material, where we consider the effect of the i.i.d. normal, the independent normal, the independent Pearson, and the multivariate normal ensembles on these models, provide further evidence for these observations. As we would expect, the more of the underlying system's structure we capture using our chosen RME, the closer we get to the stability probability estimated using the FCS ensemble. The i.i.d. normal RME, which does not include any more structure than the i.i.d. ensemble, leads to similar stability probabilities to those obtained using that RME. The independent normal ensemble, which describes the heterogeneity of means and variances found in the real system, yields stability probabilities closer to the FCS ensemble. The independent Pearson ensemble, which also takes into account the skewness and kurtosis of each entry, gets closer still, and is very similar to the independent case. Finally, the multivariate normal RME (which allows us to model, in a simplistic fashion, some of the dependencies between the entries of the Jacobian) results in stability probabilities that are always closer to those of the FCS ensemble than the ones obtained using the independent normal RME. However, the stability probabilities obtained using the multivariate normal RME are neither consistently closer to, nor consistently further from, the FCS stability probabilities than those obtained using the independent Pearson RME. Thus, accounting for dependency between the Jacobian's entries may, depending on the problem at hand, be more or less important than accounting for higher-order properties of the marginal distributions of the entries.

4. Discussion
The stability of real-world systems, which is nearly ubiquitously observed [1][2][3], might seem perplexing in light of classical results in random matrix theory. By considering how random matrices can be made to reflect the properties of the Jacobian matrices of real dynamical systems, it becomes possible to resolve and reconcile apparent contradictions in the literature [5,30,31]. In agreement with previous authors, our results demonstrate that stability probability can be affected by the mean and standard deviation of the entries in the Jacobian, as well as the dependencies between them [30,31], and further show that properties such as the skewness and kurtosis of the entries can also have an impact. This is unsurprising: as is clear from equation (2), the stability probability is determined by how the matrix density, h, distributes its probability mass relative to the stable region, and any property that reshapes h can change this. Reported stability probabilities should therefore always explicitly acknowledge that they are conditioned on a particular choice of RME, which has to be carefully justified.
While May's mathematical study clearly shows that in some circumstances an increase in complexity can lead to instability [5], Haydon's study highlights cases where complexity, in the shape of strong and numerous interconnections, is necessary to achieve higher stability [19]. Other examples where complex systems have been shown to be stable can be found in the literature; in particular, Kokkoris et al show that different variances of the interaction strengths allow for different levels of complexity while keeping the same stability [30]. However, none of these results can be generalised lightly. They in fact show that different systems are affected differently by changes in complexity, and that no general prediction can be made.
Here, the FCS ensemble conditions on the model structure, so that the RME is defined through the distribution of model parameters. In our case, these distributions have been chosen by selecting plausible or interesting ranges for the parameters, and taking uniform distributions within these ranges (in the supplementary information we consider alternative possibilities). In real applications, a natural choice for the distribution of model parameters would be provided in the Bayesian formalism by the posterior parameter distribution (or, if no data are available, by the prior). From this, we may obtain the posterior (or prior) predictive distributions [42] of the leading eigenvalues, from which we may derive the probability of stability. In this way, a truly realistic assessment of stability may be obtained for the system, in which we have conditioned on both our current understanding of the system architecture (encapsulated in our mathematical model) and our current uncertainty in the model's parameters.
Through our analyses, we have demonstrated that identifying any single property of the RME as being the general determinant of stability is misleading, except in some cases when the system has been very strictly defined [20,27,30,31]. Stability probability is determined by the topology of the stable region, and how much probability mass is deposited within that region by the RME. This cannot be summarily deduced from any single property of the RME. At this stage it seems that system stability is system specific and that little can be gleaned from general approaches that will at best be uninformative if not entirely misleading. In particular, we cannot assess the probability of a system being stable based only on its size, diversity or complexity. It is especially important to keep this in mind as the stability of more complicated systems is considered (see e.g. [7,43]).
This does not rule out the possibility that there are sets of rules or principles that could greatly shift the balance in favour of stability. Negative feedback, for example, is likely to lead to more stable behaviour (even for stochastic systems). It is in principle possible to condition on such local structures (in terms of correlated entries in the Jacobian) that may confer or contribute to overall stability of a system [44]. To apply such knowledge to real systems, however, would require a level of certainty about the underlying mechanisms that we currently lack for all but the most basic examples. Nevertheless, even in the presence of uncertainty about system structures, local negative feedback between species, for example, would tend to favour stability, whereas positive feedback (or merely the lack of negative feedback) would typically result in amplification of initially small perturbations to a system's behaviour [45].
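The stabilising effect of negative feedback has a simple linear-algebra counterpart: adding a self-regulation term −d to every state variable changes the Jacobian from J to J − dI, whose eigenvalues are exactly those of J shifted left by d. A small sketch for the 2 × 2 case (illustrative only; the matrix is made up):

```python
import cmath

def leading_real_part_2x2(a, b, c, d):
    """Largest real part among the eigenvalues of [[a, b], [c, d]]."""
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return max(((tr + disc) / 2).real, ((tr - disc) / 2).real)

J = [[0.5, 2.0], [1.0, 0.2]]   # unstable: positive leading eigenvalue
shift = 3.0                    # uniform negative self-feedback, J -> J - shift*I
lam0 = leading_real_part_2x2(J[0][0], J[0][1], J[1][0], J[1][1])
lam1 = leading_real_part_2x2(J[0][0] - shift, J[0][1], J[1][0], J[1][1] - shift)
print(lam0, lam1)  # lam1 is approximately lam0 - shift, now in the stable half-plane
```

Heterogeneous or local negative feedback does not shift the spectrum so cleanly, which is precisely why conditioning on the actual structure, rather than appealing to a general rule, is needed.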

Appendix. Methods
Two main methods were used: the first used an analytical approach, while the second used a numerical approach. The first method was applied to models 1 to 4, and to models 5 and 6 for n = 1-6; in these cases the number of samples used was 100 000. The second method was applied to models 5 and 6 for n = 1-10 and n = 20, 30, …, 100; this time, for computational reasons, the sample size was 1000.
A.1. The analytical method
For each of the models, the EPs were found using Matlab's analytic equation solver, solve. The Jacobian was also described analytically using Matlab's jacobian function.
The different parameters were sampled from a uniform distribution using Matlab's rand function. The choice of the range of the parameters will be described in detail below.
The probability of a system being stable is then estimated as the number of stable systems divided by the number of samples.
In order to evaluate stability under the independent and i.i.d. conditions we add a step between steps 8 and 9: we process the Jacobian matrices by permuting their entries as described in section 2.4.

A.3. Parameter ranges
Two criteria were used to choose the ranges of the parameters: first, they had to be realistic; second, each range had to be small enough that a thorough sampling of the resulting parameter space remained computationally tractable. In the cases where the ranges were fixed arbitrarily, we verified that choosing different ranges did not affect the qualitative results.
A.3.1. Model 1: the Lorenz system. ρ is taken to be greater than or equal to 1 in order to ensure that the system has more than just the origin as an equilibrium point, which makes our study more interesting.
A.3.2. Model 2: a model of the cell division cycle. All the parameters were taken to be uniformly distributed between 0 and 1.
A.3.3. Model 3: the Nowak and Bangham model. All the parameters were taken to be uniformly distributed between 0 and 1.