Symmetric and Asymmetric Tendencies resulting from Eigenvalue bounds of the Community Matrix

A commonly used approach to study stability in a complex system is by analyzing the Jacobian matrix at an equilibrium point of a dynamical system. The equilibrium point is stable if all eigenvalues have negative real parts. Here, by obtaining eigenvalue bounds of the Jacobian, we show that stable complex systems will favor cooperative relationships that are asymmetrical (non-reciprocative) and competitive relationships that are symmetrical (reciprocative). Additionally, we define a measure called the interdependence diversity that quantifies how distributed the dependencies are between the dynamical variables in the system. We find that increasing interdependence diversity has a destabilizing effect on the equilibrium point, and the effect is greater for competitive relationships than for cooperative relationships. These predictions are consistent with empirical observations in ecology. More importantly, our findings suggest stabilization algorithms that can apply very generally to a variety of complex systems.

Complex systems may undergo transitions between alternate stable states of contrasting behavior. Such a transition is called a critical transition or a regime shift in the literature [1]. Critical transitions are highly non-linear phenomena in that a small change in a controlling parameter such that a critical point is crossed can unexpectedly provoke a huge response (critical transition). Further away from the critical point, such a small change in the controlling parameter would only result in a comparable response without any critical transition. This non-linear response, along with the fact that critical transitions are common in nature [1][2][3][4][5][6], makes the study of critical transitions an important one. Critical transitions can happen as a result of instability in the stable state that the system was residing in.
In order to determine the stability of an equilibrium point, the simplest kind of stable state, a commonly used approach in non-linear dynamics is to linearize the dynamical equations describing the system about the equilibrium point. One obtains from this linearization the n×n Jacobian matrix B evaluated at the equilibrium point, with real matrix elements {b ij : i, j = 1, . . . , n} for a system with dynamical variables x = (x 1 , x 2 , . . . , x n ). The matrix B is also known as the community matrix. The equilibrium point is stable if all real parts of the eigenvalues of B are negative and unstable otherwise. Henceforth in this paper, we may refer to B being stable or unstable when we actually mean the equilibrium point associated with B being stable or unstable respectively.
The matrix element b ij describes the dependence of dynamical variable x i on dynamical variable x j , where i = j. Conversely, b ij describes the dependence of x j on x i . We may also refer to b ij as an interaction and its magnitude as its interaction strength. Here, we define the product b ij b ji to be the relationship between x i and x j . The relationship between x i and x j is cooperative if b ij b ji > 0 and competitive if b ij b ji < 0. Cooperative and competitive relationships defined here are also known as mutual and trophic relationships respectively in the ecology literature. A relationship is symmetrical when b ij and b ji are of comparable magnitudes and is asymmetrical otherwise. The main result of this paper involves using eigenvalue bounds to show that stability in B favors mutualistic relationships that are asymmetrical and trophic relationships that are symmetrical. The analysis presented here stems from a rather old research question: how do the eigenvalues of B depend on its matrix elements?
Unfortunately, there is no exact answer to this question. An approach has been to use Random Matrix Theory (RMT), originally introduced by Wigner to study spectral properties of atomic nuclei [7]. RMT has since found applications in a wide variety of disciplines including number theory [8] and neuroscience [9]. In ecology, RMT was used by Robert May to study the stability of a large ecological network at an equilibrium point [10]. In Mays seminal work, B is a random matrix, with off-diagonal matrix elements being independent and identical random variables of mean zero and variance σ 2 . The diagonal elements, set at -1, represent characteristic return rates for the populations of species when disturbed from equilibrium. May claimed that for large n, B is unstable when σ √ n > 1. The main criticism with Mays work is that real-world ecosystems are structured unlike the random matrix studied by May [11][12][13]. Allesina and Tang, relying on recent advances in RMT from the mathematics literature [14], recently confirmed Mays claim and further analyzed random matrices with various structures [15], alleviating some of the criticisms associated with Mays work. Research in RMT has hinted that high correlation between random variables b ij and b ji in mutualistic relationships has a destabilizing effect whereas low correlation in trophic relationships has a stabilizing effect on B [16,17]. This is very similar to the main result of this paper which uses an alternate approach with eigenvalue bounds. The significance of the work presented here is the generality of the result. The eigenvalue bounds are not contingent on any assumptions on B (besides the matrix elements being real) whereas related theorems and conjectures in RMT typically assume at the least that n is large and that matrix elements or pairs of matrix elements are independently and identically distributed. The drawback of our method is that we cannot obtain precise conditions for stability or instability beyond the observation that B will eventually become unstable if certain quantities become large enough.
Qualitative validation of the modeling in this paper comes from empirical observations in the ecology literature. It has been known for some time that mutualistic ecological networks like plant-pollinator networks consist of highly asymmetric interactions between plant and pollinator [11,18,19]. For example, the manduvi tree relies almost exclusively on the toco toucan for seed dispersal, but the toco toucan is not limited to the manduvi trees fruits in its diet [20]. Our model also allows us to define a measure called the interdependence diversity which quantifies how distributed the dependencies are between dynamical variables. Interdependence diversity may be used as a proxy for connectance, a measure that is well known in the ecology community. The connectance is the proportion of nonzero dependencies in B. Our calculation shows that increasing interdependence diversity has a destabilizing effect that is more pronounced for trophic relationships than mutualistic relationships. This prediction can also be seen in empirical data. In a meta-analysis of real-world pollination (mutualistic) and herbivory (trophic) networks controlling for n, Thbault and Fontaine found trophic networks to have a lower connectance than mutualistic networks [13].

Mathematical Formulation and Eigenvalue Bounds
We start with a dynamical system with n dynamical variables described by n arbitrary non-linear differential equations i.e.ẋ i = f i (x), where i = 1, . . . , n and x = (x 1 , . . . , x n ) is a vector of dynamical variables. x * is an equilibrium point if for every i, f i (x * ) = 0. The local stability of x * may be studied by linearizing the dynamical equations about x * [21]. The linearization furnishes B, the Jacobian matrix evaluated at x * . The matrix element b ij , which we described as the dependence of x i on x j , is the gradient of f i (x) along x j at x * i.e. ∂f i /∂x j (x * ). The equilibrium point is stable when any perturbation of x from x * decays with time. Conversely, the equilibrium point is unstable when any perturbation of from grows with time. Stability is determined by the eigenvalues of B. The equilibrium point is stable if the real parts of all eigenvalues are negative and is unstable otherwise. Equivalently, the equilibrium point is stable if the largest real part of all eigenvalues, which we shall refer to as the maximum real eigenvalue, is negative and unstable otherwise. Eigenvalues are the exponential decay rates of small perturbations from the equilibrium point. Thus, eigenvalues that are more negative indicate greater stability along their respective eigenvectors. Solving for the eigenvalues is equivalent to finding the roots of the characteristic polynomial det(B − λI), where λ is an eigenvalue of B. The eigenvalues depend on the matrix elements in a nontrivial fashion in part because there is no general algebraic expression for the roots of polynomials of the 5th degree or higher. This is the Abel-Ruffini theorem and is a well-known result from Galois theory. While others have resorted to RMT for this problem, we use an alternate approach with eigenvalue bounds to glean information about the eigenvalues dependence on the matrix elements.
Given the multiset of eigenvalues of B, {λ i : i = 1, . . . , n}, an upper and lower bound for the maximum real eigenvalue of B are respectively, Here,λ is the mean while σ λ is the standard deviation of the real parts of all eigenvalues. The upper bound is more well-known and was probably first discovered by Laguerre but is more commonly known as Samuelsons inequality [22,23]. The lower bound is due to Brunk [24]. The bounds may be given by an expression in terms of {b ij : i, j = 1, . . . , n} (Methods and Supplementary Information), Here, χ diag = i b ii is the diagonal sum, G diag is a function of the diagonal elements {b ii : i = 1, . . . , n} (Supplementary Information) and 2 is a non-negative number that is positive when the imaginary components are non-zero and zero otherwise. n is kept constant throughout our analysis. The mean of the eigenvalues is controlled by the diagonal elements i.e.λ = χ diag /n. Hence, the eigenvalues dependence on the diagonal elements is more straightforward and is generally of less interest than the off-diagonal elements. It follows from the bounds that λ + < 0 is a sufficient condition for stability while λ − > 0 is a sufficient condition for instability.
From Equation 3, we may then draw some conclusions on two cases: (i) the eigenvalues are real numbers, i.e. h = 0 (ii) the eigenvalues are complex numbers, i.e. h ≥ 0. For the first case, it is always possible to achieve stability or instability by decreasing or increasing χ off respectively. The first case shall not be analyzed further because of the strong assumption that all eigenvalues are real. Instead, we consider the second case, which is more general. For the second case, the upper bound becomes less useful but not the lower bound because h ≥ 0; we can still always increase χ off enough such that B becomes unstable. Therefore, while it is still necessary to keep χ off small enough for stability, keeping χ off small alone does not guarantee stability because of h. For this reason, other contributing factors would still need to be considered in order to form a complete picture of how stability arises in B. For example, ecologists have been obsessed with the nestedness, a persistent structural property observed in mutualistic networks [11,[25][26][27] In a nested architecture of a bipartite network, a more specialist species (defined as having fewer mutualistic interactions) would only interact with a proper subset of mutualistic partners of the more generalist species (defined as having more mutualistic interactions). However, there remains some controversy over how important nestedness is to the stability of mutualistic ecological networks [25]. Instead of delving into the details of specific structural properties, we will focus our efforts on minimizing χ off instead. χ off is the sum of all relationships in B. A natural way to minimize χ off is to make both interaction strengths in mutualistic relationships very weak. However, mutualistic relationships are pervasive in nature. Therefore, we need to constrain the interaction strengths in B so that minimization of χ off will not render both interaction strengths in B negligible. Then, minimization of χ off will require mutualistic relationships to be asymmetric in order to minimize each summand, b ij b ji i.e. one large and one small interaction strength. In the section on interdependence diversity and symmetric correlation, we will constrain the interaction strengths in B from an ecological standpoint and demonstrate that minimization of χ off will require mutualistic relationships to be asymmetrical and trophic relationships to be symmetrical.

Pruning Random Matrices for Stability
The results presented thus far suggests that minimization of χ off might provide an efficient route to stabilize an equilibrium point. We employ a simple algorithm on a well-known example, the random matrix studied by May10. For this example, consider the situation where M is a random matrix and its diagonal elements are set at −d while the off-diagonal elements are independently and identically distributed random variables of mean zero and variance σ 2 . According to RMT, for large n, the eigenvalues of M are contained in a circle of radius σ √ n centered at (−d, 0) on the complex plane [15]. For this example, we use n = 20, d = 2, a standard normal distribution for the off-diagonal elements and a modification factor g that we shall introduce in the description of the algorithm. The  (6) if the maximum real eigenvalue of after the modification is larger than before the modification, revert to step (2) using M before the modification; if the maximum real eigenvalue of M after the modification is smaller than before the modification instead, revert to step (2) using M after the modification. This counts as one iteration.
This algorithm employs a χ off -minimizing strategy due to step (4). We compare this algorithm using the χ off -minimizing strategy against the same algorithm using a random strategy and a variance-minimizing strategy. In the random strategy, step (4) is replaced by the following step instead: (4) b ij is randomly chosen to be multiplied or divided by g with probability . In the variance-minimizing strategy, step (4) is replaced by the following step instead: (4) b ij is divided by g. We compare the three strategies using 50,000 random matrices over 2,000 iterations. The results are shown in Figure 1. The χ off -minimizing strategy clearly outperforms the other two strategies. Of course, if we were to accept every modification without checking if it reduces the maximum real eigenvalue at every iteration, then the variance-minimizing strategy will eventually reduce all eigenvalues to −d. However, there are two reasons why such an algorithm might be undesirable: (i) the maximum real eigenvalue may at times increase with iteration number, and (ii) the eventual interaction strengths are small unlike the original algorithm with the χ off -minimizing strategy which allows for larger eventual interaction strengths. Sensitivity analysis of the parameter g reveals that the χ off -minimizing strategy still outperforms the other two strategies for the various values of g tested (Supplementary Information).

Interdependence Diversity and the Symmetric Correlation
Clearly, one cannot contemplate stabilizing B by rendering the interaction strengths in B negligible since interactions are ubiquitous in nature. Therefore, before further analysis, interaction strengths in B have to be constrained in some way. To do this, we first consider two sets of similar variables, y = {y k : k = 1, . . . , m} ⊂ x and z = {z l : l = 1, . . . , m} ⊂ x, where x is the set of all variables and y ∩ z = ∅. These two sets of variables are so defined to delineate interactions of a particular type between variables from the two sets. For example, if the interaction types are consumption and pollination, then the variables in y could represent populations of pollinators while the variables in z will represent populations of plants. We now formulate equations of constraints that allow variables in y to depend on various weighted combinations of the variables in z and vice versa. For notational convenience, let us denote Y k (x) to be Y k (x) =ẏ k and Z l (x) to be Z l (x) =ż l for all k and l. Then, we may find in B the matrix elements ∂Y k /∂z l (x * ) = d kl α k , which is the dependence of species y k on species z l , and ∂Z l /∂y k (x * ) = e lk β l , which is the dependence of species z l on species y k , for all k and l. Here, d kl and e lk are weights such that k d kl = 1, l d kl = 1, k e lk = 1, l e lk = 1, and 0 < d kl , e lk < 1. α k and β l are real numbers and because of the bounded weights, their absolute values are the maximum interaction strengths possible for the respective interactions (e.g. consumption and pollination) and species (y k and z l ) they pertain to. Ecologically, we do not expect a species population to be affected by the composition of its interactions of a particular type (e.g. consumption or pollination) with various other species. Furthermore, ecological interactions like consumption and pollination are expected to be additive. This forms the basis of our assumptions that the strengths of interactions belonging to a particular type (e.g. consumption or pollination) for a particular species (y k and z l ) are bounded by the same value (α k and β l ) and that the weights are constrained to sum to one ( l d kl = 1 and k e lk = 1 respectively). Additionally, we might also require that the dependencies on any species (y k or z l ) be constrained ( l e lk = 1 and k d kl = 1 respectively) although these additional constraints do not affect the conclusions that are about to follow (Theorem S1). Finally, differences in α k for all k and differences in β l for all l lead to biases in weight distribution amongst the variables in y and z when minimizing χ off . Since our goal is to evaluate the effect of the dispersion in the weights on the off-diagonal sum, we may simplify the problem by assuming that α k = α for all k and β l = β for all l.
The relationship between any variable y and any variable in z is mutualistic if αβ > 0 and trophic if αβ < 0. If we arrange the weights d kl into an m × m matrix D such that d kl is a matrix element of D, then αD is a submatrix of B. Similarly, e lk is a matrix element of E and βE is a submatrix of B. This convenience allows us to define a quantity C that we shall call the symmetric correlation, The symmetric correlation is a measure of correlation between the matrix elements of D and the corresponding matrix elements of E T . Since the relationship between y k and z l is d kl e lk αβ, the symmetric correlation can also be used as a relative measure of symmetry for relationships between variables in y and z when the dispersion of weights in D and E is fixed (i.e. when the interdependence diversity, which we will define shortly, is fixed). Given the weight constraints, C is bounded 0 < C < 1 (Theorem S2). We find that χ off contains the summand mαβC i.e. χ off = 2mαβC + . . . , and that the weight elements d kl and e lk for all k and l are contained exclusively in the summand 2mαβC of χ off . Thus, minimizing χ off for the relationship αβ requires minimizing C for mutualistic relationships and maximizing C for trophic relationships i.e. mutualistic relationships will be asymmetric whereas trophic relationships will be symmetric. Next, we define the interdependence diversity, a measure of the diversity of dependencies among the y and z variables. S is simply the sum of all the squared weight elements. Furthermore, due to the weight constraints, S is bounded −2m < S ≤ −2. When S = −2m at minimum interdependence diversity, then all weights are either zeros or ones. When S = −2 at maximum interdependence diversity, then all weights are equal to 1/m. The interdependence diversity defined here is similar to the Herfindahl index in economics [28] or the Simpson index in ecology [29]. We denote max(C)| S and min(C)| S to be respectively the maximum and minimum C under any variation of weights and under a fixed S (without violating the weight constraints). The relationships of max(C)| S and min(C)| S with S are shown using numerical calculations in Figure 2. Analytical calculations can be found in the Supplementary Information. Minimizing χ off means that mutualistic relationships will reside on the min(C)| S curve while trophic relationships will reside on the max(C)| S curve. Both max(C)| S and min(C)| S are monotonically decreasing and increasing functions of S respectively ( Figure  2). Hence, the capacity of B to minimize χ off decreases with increasing interdependence diversity for both mutualistic and trophic relationships. Additionally, because max(C)| S is more adversely affected than min(C)| S with increasing interdependence diversity, this effect is more pronounced in trophic relationships than mutualistic relationships. Essentially, trophic relationships are more affected than mutualistic relationships with increasing interdependence diversity because there exists many more possibilities in the network to minimize C. Only when the network is fully connected with equally weighted onedirectional links does min(C)| S start increasing with S (Proposition S2).

Discussion
In this work, we have derived eigenvalue bounds for the maximum real eigenvalue of B . The generality of this result and subsequent calculations allows us to consider different types of interactions in concert, something that was limited in previous studies with RMT due to assumptions on matrix elements being independently and identically distributed. Additionally, we show that two observations, increasing interdependence diversity causing destabilization and this destabilization being more pronounced for trophic than mutualistic relationships, can both be explained as a result of B losing its capacity to accommodate symmetric and asymmetric relationships. Empirical validation of our calculations demonstrates our approach to be promising for further investigations of stability in B.
Our results highlight the importance of asymmetry and symmetry in mutualistic and trophic relationships respectively to the stability of a complex system. Identifying and understanding the contributing factors to stability can be used to help design algorithms to stabilize real-world systems on the verge of critical transitions. For example, the stabilization algorithm described in this paper could be implemented in real-world systems by using critical slowing down signals to measure the change in stability at every iteration (step (5) of the algorithm). Critical slowing down signals are statistical signals that can be used to detect if a stable state is becoming more unstable. These signals have been detected in a wide variety of real-world systems [1]. They are based on the premise of a slower return rate to the stable state after a perturbation as the stable state becomes more unstable [30]. While there have been ample studies on the detection of critical slowing down signals, more research needs to be conducted on the stabilization of potentially unstable stable states.
Stabilization is one way to deal with critical transitions. A recent attempt at this problem involves smoothening the non-linearity of a critical transition [31]. Network properties not covered in this work can also be very important in dealing with instability. For a formerly stable equilibrium point, initial instability occurs when the maximum real eigenvalue goes above zero. The eigenvector(s) of the maximum real eigenvalue determine the initial directions of instability and which variables will be initially affected by this instability. As the system transitions away from the previously stable equilibrium point, more and more variables might be affected depending on their dependence on the initially and subsequently affected variables in what is known as a cascade of failures. Whether a not such an initial instability will eventually lead to system-wide instability depends on a multitude of factors including the structure of the network connecting these variables and how the system responds to this initial instability. For example, in a load bearing network with a heterogeneous degree distribution, the failure of a single node with a large number of dependencies can cause a large cascade of failures [32]. The effect of initial instability or failure on the whole system is a topic of great interest in network science [32][33][34]. While there remains a host of factors that ultimately determine stability in a complex system, the generality of our results suggests that asymmetry in mutualistic relationships and symmetry in trophic relationships should be universally observed and not restricted to ecology.

Methods
Derivation of eigenvalue bounds. The polynomial equation is det(B − λI) = λ n + c 1 λ n−1 + c 2 λ n−2 + . . . . We may express both bounds in terms of c 1 , c 2 and h using Vites formulas and the complex conjugate root theorem. The relation between the matrix elements and the coefficients c 1 and c 2 can be found by expanding the Leibniz formula for matrix determinants. This gives us the bounds in terms of the matrix elements of B. A more detailed derivation may be found in the Supplementary Information.
Obtaining the numerical results of Figure 2. To obtain max(C)| S , we (1) construct 5 × 5 matrices D max and E max at max(C)| S when S is at the minimum of −2m; D max and E max are initial starting points for a nonlinear constrained optimization (maximization) algorithm implemented in MATLAB (fmincon function with sqp algorithm where the constraints for the optimization problem are the weight constraints 0 < d kl , e lk < 1, k d kl = 1, l d kl = 1, k e lk = 1, and l e lk = 1, and the interdependence diversity constraint S = k,l d kl 2 + e lk 2 = −2m, while the objective function is C), (2) carry out the optimization for the starting point and constraints, and (3) use the solution as the new weight matrices for the starting point of the next optimization where the interdependence diversity is fixed at a positive increment ε = 0.001 from the previous optimization. Steps (2) and (3) are repeated until the maximum interdependence diversity is reached at −2.
To obtain min(C)| S , we use the same steps, replacing the initial starting point with D min and E min at min(C)| S when S = −2m and using the same optimization algorithm but with minimization instead.

Supplementary Information Derivation of eigenvalue bounds
Given the multiset of eigenvalues {λ i : i = 1, . . . , n} of the matrix B, the eigenvalue bounds for the maximum real part of all eigenvalues are , whereλ is the mean while σ λ is the standard deviation of the real parts of all eigenvalues.
Here,λ = i Re(λ i )/n and If we denote the polynomial equation as det(λI − B) = λ n + c 1 λ n−1 + c 2 λ n−2 + . . . , then we find using Viète's formulas and the complex conjugate root theorem that Expanding the Leibniz formula for determinants gives us where

Sensitivity analysis of the g parameter in the stabilization algorithm
The parameter −d translates the eigenvalues and the parameter σ scales the eigenvalues. Hence these two parameters are unimportant for the sensitivity analysis. The sensitivity analysis is conducted for the g parameter instead, where g > 1. The algorithm is performed on 1,000 random matrices up to an iteration length of 2,000 for various values of g from 1.1 to 10. The parameters used are n = 20, d = 2, and random variables drawn from a standard normal distribution. Results of the sensitivity analysis are shown in Figure 3 and do not indicate that the χ off -minimizing strategy is worse performing than the other two strategies for any value of g tested. Figure 3: The average of the maximum real eigenvalues belonging to 1,000 random matrices at the end of 2,000 iterations is plotted against g for the three different strategies described in the main text.

Proofs
Analytical calculations of the C vs S relationship Definition S1 C is defined as where k and l are indices such that k, l ∈ {1, . . . m} and m is a positive integer i.e. m ∈ N. The weights w kl 1 and w lk 2 for all k and l satisfy the weight constraints k w kl 1 = 1, l w kl 1 = 1, l w lk 2 = 1, k w lk 2 = 1 and 0 ≤ w kl 1 , w lk 2 ≤ 1. Additionally, w kl 1 is a matrix element of W 1 with row index k and column index l. Similarly, w lk 2 is a matrix element of W 2 with row index l and column index k.
which is the negative sum of all the squared matrix elements of W 1 and W 2 of which W 1 and W 2 are elements respectively.
S is a measure of diversity between the weight elements. A well known property of such a definition is that S is maximized at S = −2 when all weight elements are equal at 1/m and minimized at S = −2m when all weight elements are equal to one or zero.
Proposition S1 Let min(C) and max(C) denote respectively the minimum and maximum value of C under a variation of the weight elements fulfilling the weight constraints.
Then min(C) = 0. Let max(C)| S and min(C)| S denote respectively the minimum and maximum value of C under a variation of the weight elements fulfilling the weight constraints and under fixed S. Then, min(C)| S=−2m = 0.
Since each weight is non-negative, each summand of C must also be non-negative. Hence, C is non-negative. Additionally, for C = 0, each summand of C must be equal to zero. Such a situation is possible when every matrix element is either zero or one at S = −2m. Hence, min(C)| S=−2m = 0.
Proof. Let {W 1 , W 2 } represent a matrix pair such that C(W 1 , W 2 ) = 0. In order for min(C) = 0, it is necessary that all summands of C are equal to zero. Hence, the total number of zeros from both matrices should be at least m 2 . Let max(S 0 ) be the maximum S such that C = 0. Also, let {W 1 , W 2 } be a matrix pair such that C(W 1 , W 2 ) = 0 and S(W 1 , W 2 ) = max(S 0 ).
(a) When m is even, suppose that all weight constraints, k w kl 1 = 1, l w kl 1 = 1, l w lk 2 = 1 and k w lk 2 = 1 for all k and l, are replaced by a less restrictive constraint k,l w kl 1 + w lk 2 = 2m. Then for C = 0, S is maximized if each summand of C is the multiplication of zero and 2/m 2 . However, each summand of C can still be a multiplication of zero and 2/m 2 if we were to use the original constraints instead. This is because it is possible to arrange m/2 zeroes in each row and column of each matrix without violating any weight constraints. In this case, max(S 0 ) = −4. By Proposition S1, the result stated is obtained.
(b) When m is odd, the non-zero elements of each row in a matrix must still be the same value within each row to maximize S. However, because m is odd, the nonzero elements cannot be 2/m 2 . Also, to maximize S, there must not be more than a total of m 2 zeroes since any summand that is a multiplication of two zeros can still have S increased. This is done by increasing the number of non-zero elements of either row possessing any of the two zeros. Since there is exactly a total of m 2 zeros in both matrices, then there must be m rows containing m + 1 zeros and m rows containing m − 1 zeros because any row that has more than m + 1 zeros or less than m − 1 zeros implies that S is not maximized. Lastly, it is possible to arrange m rows of m + 1 zeros in one matrix and m rows of m − 1 zeros in the other matrix without violating any weight constraints. Therefore, max(S 0 ) = −4m 2 /(m 2 − 1). By Proposition S1, the result stated is obtained.
Theorem S1 max(C)| S is When m is even, min(C)| S is When m is odd, min(C)| S is The extrema may be found by using the method of Lagrange multipliers. In this case, the function to maximize/minimize is C = k l w kl 1 w lk 2 , where C = mC. The constraints are ∀k : l w kl 1 = 1, ∀l : k w lk 2 = 1, and S = − k l (w kl The constraints ∀l : k w kl 1 and ∀k : l w lk 2 are not used, but we will show that the the solutions obtained from the simplified Lagrange multiplier problem can satisfy these two constraints that were left out. At the extrema, the gradient of C along a variable w kl 1 is parallel to the gradient of the constraints along w kl Similarly, the gradient of C along a variable w lk 2 is parallel to the gradient of the constraints along w lk Here, ξ l , ζ k , and ρ are Lagrange multipliers. There are thus 2m 2 equations corresponding to the 2m 2 variables. Together with the constraints, there are 2m 2 + 2m + 1 equations corresponding to 2m 2 variables and 2m + 1 Lagrange multipliers. We combine Equations 18 and 19 to obtain the following expressions for w kl 1 and w lk These expressions give us w kl 1 and w lk 2 provided ρ = ±1/2. Applying the constraint l w kl 1 = 1 to Equation 20 results in Since this equation must hold for any k, this implies that ∀k : ζ k = ζ. Similarly, applying the constraint k w lk 2 = 1 to Equation 21, we find that ∀l : ξ l = ξ. By Equations 20 and 21, the weight elements in matrices W 1 and W 2 must be the same. However, this result is only valid when S = −2. When S = −2, then this result cannot hold. Hence when S = −2, then ρ = ±1/2.
When ρ = −1/2, then summing all elements in W 1 and W 2 each using Equations 18 and 19 implies that ∀k, l : ζ k = ξ l = 0 since the elements in each matrix sum to m. Therefore, ∀k, l : w kl 1 = w lk 2 and S = −2C . This solution is the maximum as it does not preclude the possibility that the weights are non-negative within the domain of S. Additionally, since W 1 = W 2 T , then all weights in each column of W 1 and W 2 must sum to one.
When ρ = 1/2, we see that ∀k, l : ζ k = ξ l = ζ from Equation 20. Summing all elements in W 1 and W 2 gives us ζ = 2/m, ∀k, l : w kl 1 = 2/m − w lk 2 and S = 2C − 4. The solution S = 2C − 4 is the minimum within the domain of −4 ≤ S ≤ −2 if m is even because the minimum does not preclude the possibility that the weights are non-negative within the domain of −4 ≤ S ≤ −2. Additionally, since W 1 = (2/m)J − W 2 T , where J is an m × m matrix of ones, then each column of W 1 and W 2 must sum to one.
If w lk 2 = 0 and w kl 1 is variable, then by Equation 18, Similarly, if w kl 1 = 0 and w lk 2 is variable, then If both w kl 1 and w lk 2 are variable, then w kl 1 and w lk 2 are given by Equations 20 and 21 respectively provided ρ = ±1/2.
When ρ = −1/2, then ζ k = −ξ l if w kl 1 and w lk 2 are variable by Equation 20. Given Equations 23 and 24, this implies that ∀k, l : ζ k = ξ l = 0. Hence if w kl 1 and w lk 2 are both variable, then from Equation 20, w kl 1 = w lk 2 = 1 since the weights in each row sum to one. Therefore, we are not interested in this solution.
With ρ = 1/2, then ζ k = ξ l if w kl 1 and w lk 2 are variable by Equation 20. Then, summing over all k for w kl 1 is equal to summing over all l for w lk 2 which gives us w kl 1 = w lk 2 = ζ k /2. Since the sum over all k for w kl 1 is equal to one, then ζ k = 2/m. Therefore, we are also not interested in this solution.
When ρ = ±1/2, we may apply the weight constraints k w kl 1 and l w lk 2 on Equations 20, 21, 23 and 24 to find that where k and l are such that w kl 1 and w lk 2 are variable. If the term in the brackets is not 0, then ζ k = ξ l . We may then apply the weight constraint k w kl 1 on Equations 20 and 23 again to obtain ζ k m − 1 2 + 1 1 + 2ρ = 1.
We can conduct the same analysis for Equations 21 and 24 to find ξ l . Hence, it follows that ζ k = ξ l = ζ, ζ = 4ρ(1 + 2ρ) 2ρ + m − 1 + 2ρm , and Combining Equations 27, 28 and 29 results in two solutions for S, and for dS/dC respectively, We are not interested in Equation 30 and 32 because dS/dC is negative. When S = −2(2m − 1)/m, then C = 1/m at the minimum and dS/dC = 2. As S is decreased further, dS/dC > 2 by Equation 33. For this solution, the weight elements of each column in W 1 and W 2 also sum to one. We are now left with the case when the term in the brackets in Equation 25 is 0. When this term is 0, then ρ = m − 1 2(m + 1) .
For any k where w kl 1 and w lk 2 are both variable, let l = r(k). Therefore, r(k) is a bijective function. Then C , S and the weight constraint are respectively, and Combining Equations 35, 36 and 37, we obtain for S and dS/dC , and dS dC = 4 m − 1 + 2. (39)