Microbial communities as dynamical systems

Nowadays, microbial communities are frequently monitored over long periods of time and the interactions between their members are explored in vitro . This development has opened the way to apply mathematical models to characterize community structure and dynamics, to predict responses to perturbations and to explore general dynamical properties such as stability, alternative stable states and periodicity. Here, we highlight the role of dynamical systems theory in the exploration of microbial communities, with a special emphasis on the generalized Lotka–Volterra (gLV) equations. In particular, we discuss applications, assumptions and limitations of the gLV model, mention modiﬁcations to address these limitations and review stochastic extensions. The development of dynamical models, together with the generation of time series data, can improve the design and control of microbial communities.


Introduction
Microbial communities are not static over time; abundances of members fluctuate from one measured time point to the next, sometimes drastically so. Longitudinal studies of host-associated and environmental microbiota have revealed several cases of complex dynamics, including periodicities, chaos and alternative stable states [1,2 ], as reviewed in [3]. Moreover, thanks to advances in sequencing techniques, the number, length, and resolution of microbial community time series are all increasing rapidly; time series may cover a year or more, with monthly, weekly or even daily sampling intervals [1, 4,5]. However, while we are gaining an ever more detailed picture of the composition and dynamics of many microbial communities, we still understand little of the rules that govern how these communities change over time.
Dynamical systems theory is a well-developed branch of mathematics that describes the change of complex systems such as microbial communities over time, and which is now increasingly applied to sequencing data [6][7][8]. In brief, the time development of a dynamical system, here the species composition of the community, can be described by a set of ordinary differential equations (ODEs) that encode the rules according to which the system changes. In some cases, prior biological knowledge of the system is sufficient to formulate these rules, while in others they can be derived from time series data. Dynamical systems theory highlights the conditions for the emergence of complex behavior and provides rigorous definitions of stability (see Box 1). The development and analyses of dynamical models thus allow microbial ecology to go beyond simple descriptions of community composition and statistical correlations, towards a better understanding of community dynamics. connected by arrows correspond to the trajectory of the system through the phase space (or community space, for simplicity). The community's movement through community space can be analyzed to test for instance whether the community moves randomly or tends towards a certain direction, or whether community composition changes more strongly during perturbation periods, implying larger jumps through community space. Figure 1 illustrates trajectories for ocean and gut microbial communities, which provide examples of periodic behavior (Figure 1a) as well as the tendency to remain in the same region of the community space, suggesting the presence of a stable state (Figure 1b). While the gut community in Figure 1b returns to its stable state after a perturbation, thereby demonstrating resilience, a gut community from another person appears to switch to a second state upon perturbation (Figure 1c). While the phase space plot visualizes attractors such as stable states, alternative plots such as recurrence plots and periodograms serve to explore other aspects of the behavior of dynamical systems, such as periodicity.
Recently, the Anna Karenina principle (AKP) of dysbiotic communities has been put forward [15,16 ], which states that dysbiotic communities tend to exhibit greater intersubject variability than the 'reference' community in healthy hosts. The observation that perturbation periods contain larger jumps than are present before the perturbation ( Figure 1d) suggests a dynamical formulation of the AKP, where perturbed communities tend to vary more strongly over time than healthy communities. Although experimental evidence supports this dynamic version of the AKP [17 ], it remains to be tested systematically.
What can we learn from community models?
Mathematical approaches provide the means for systematic quantitative characterization of observed patterns and their underlying mechanisms, and have a long-standing history in ecology (e.g. [18,19]). While a number of models have been proposed in microbial community ecology (reviewed e.g. in [20]), we will discuss in particular the generalized Lotka-Volterra (gLV) model, since it has become one of the most popular microbial community models to date (e.g. [6,21]).
The gLV model is a classical ordinary differential equations (ODEs) model that characterizes the dynamics of a multi-species system. It describes the change over time of a population of N species as a function of their intrinsic growth rates and the interactions between species (see the supplement). Interactions can be unidirectional (species i affects species j, but not the other way round; e.g. commensalism) or reciprocal (species i affects species j and vice versa; e.g. competition and parasitism). Together, these interactions encode the community network. Thus, the gLV model can capture a number of commonly encountered network structures, including food chains, modularity, scale-freeness and small-world networks (e.g. in [22,23]).
The utility of the gLV model for studying microbial communities is twofold: it offers a convenient tool to interpret existing empirical data, and provides a framework to make broader predictions about the factors that govern microbial communities' stability and dynamics. An increasing number of methods have been developed with which to fit gLV models to large-scale longitudinal (and in certain cases, cross-sectional) data [7,8,24,25]. In these studies, observed data are used to assign values to the intrinsic growth rates and interspecies interactions associated with each member of a microbial community. Thus, one can determine not only how each species' abundance changes over time, but how this change is influenced by each of the other members of the community. Through learning these parameters, researchers have been able to identify members of microbial communities that play important roles both for broad scale community properties (e.g. keystone species that influence many other community members [7]) and for Box 1 What does 'stability' mean?
Many theoretical works in ecology focus on 'stability'. However, the definition of this property varies from author to author. It is therefore important to distinguish them: Linear asymptotic stability: In the theory of non-linear dynamical systems, stability is determined by the behaviour of the system in response to a small, punctual perturbation in the variables (i.e. a change in abundance of some species in the present context) (see e.g. [31,65]). If, following the perturbation, the composition of a microbial community returns to its initial (steady) state, this state will be stable. In contrast, if the perturbation amplifies (meaning that the system diverges from its initial state), this state is unstable. In addition, stability is a local property: it does not imply anything about the long-term behaviour of the system and may not be valid for large perturbations. Persistence/permanence: These definitions of stability pertain to the long-term behaviour of a community. Both imply that a community will always maintain the species it started with, regardless of the size of the perturbation and even if it does not return to its original state [66], however, permanence is the stricter definition requiring that the boundary of the state space is a repeller: meaning if ever any species density approaches too close to zero, it will again begin to grow-and thus formally no species can ever go extinct [67]. Temporal stability and robustness: Stability of an ecological system is sometimes assessed by the level of variability displayed by the community over time [68]. This variability may be attributed to stochasticity. A related definition refers to how much the composition of a system depends on small environmental changes, which are typically taken into account in the model parameters. If the community tends to remain constant over time or across parameter changes, it will be considered more stable. On the contrary, if the abundance of some species is sensitive to parameter changes, the system is considered less stable. This concept of stability is a measure of robustness -or resistance -to noise and to parameter values, and can be quantified by sensitivity analyses (parameters) or stochastic simulations (noise). Structural stability: A system is more structurally stable than another if its dynamical behaviour (e.g. the coexistence between several species) is maintained over a larger range of parameter values [69,70].
Fitting gLV models to large scale community data is particularly useful when studying diverse natural communities where it is challenging to manipulate the communities themselves or to culture individual members -as is often the case for host-associated microbiotas. However, such approaches typically have strict sampling requirements. They assume that measurements have been    shows the trajectory of the gut community of individual A before and after perturbation by traveling and diarrhea [5]. The perturbation periods are marked in red, labels refer to days since the start of the measurements. The community stays within a narrow region of community space until pushed away by the first perturbation. After the perturbations, the community returns to this region (stable state).

PCoA English Channel
In panel (c), the trajectory of the gut community of individual B is shown [5], with a Salmonella infection. Highlighted in red and labels referring to days since the start of the measurements. The perturbation in this case appears to push the community towards a second state. Panel (d) depicts a box plot of the jump lengths for the gut community of individual A. Although the average jump length during the perturbation periods is not significantly larger, the perturbed jump length distribution has a higher standard deviation, since it contains the longest jumps. All data were rarefied (to 4623 reads for the English Channel and 10 000 reads for the stool data). Stool data were in addition interpolated with the stinepack R package (function stineman) to equalize sampling intervals. Small negative values resulting from interpolation were set to zero. Because of a long sampling gap, the last sample for the stool data set B was omitted.
taken regularly and are sufficiently frequent to resolve the dynamics [26]. For the parameterization of the discrete gLV, Fisher and Mehta recommend that the sample number should be at least equal to the square of the taxon number [7]. Moreover, in practice, only the most abundant taxa are selected for dynamical modeling, potentially missing the role of important but low abundance community members [7,8]. Alternatively, Cao and colleagues suggest to parameterize the gLV from several short time series collected for different initial conditions [27]. However, these approaches will fail to capture any features driving community change that are not observed, for example, environmental change.
An alternative approach can be applied when studying simpler, easily cultured communities. Here each interspecies interaction is parameterized individually, for example through culturing each community member in isolation and in pairwise co-cultures. Once the gLV parameters have been determined, these can be used to simulate microbial dynamics in silico -for example to predict how more complex communities might behave. Such approaches have been used in order to predict community stability and invasibility [28 ], and to study the role of higher order interactions within microbial communities [29,30].
The above examples outline methods by which gLVs can be used to inform studies of specific empirically observed microbial communities. However, gLV models can also generate broader predictions concerning general microbial community dynamics. In this approach a community property of interest is defined mathematically, as is the correspondence between changes to the biology of a microbial community and changes to the parameters of the gLV model. By systematically screening the parameter values of the gLV model one can then determine how each of the features of a microbiota individually influences the community property of interest, such as stability (see Box 1 for a few definitions of stability and Figure 2a for their illustration). Theoretical approaches and numerical simulations can then be used to study for example the impact of the species number and the interaction density (connectivity) on the survival rate (Wigner-May theorem, Figure 2b), or how interspecies cooperation affects microbiome stability [31,32] ( Figure 2c). Moreover, such theoretical insights are not solely limited to properties of the communities themselves; systematically varying and simulating gLVs has also been used as a method to assess the validity of different methodological approaches for studying complex microbial communities [13,21,33,34].

Assumptions and limitations of the generalized Lotka-Volterra model
While the gLV model offers a range of applications for studying microbial communities, it also has a number of key limitations. First, it can only describe pair-wise interactions and thus fails to capture modulating effects that a third species may have on an interacting pair, for example, when a cross-feeding relationship between two species is weakened by the production of the exchanged metabolite by a third species. Second, it does not take immigration from surrounding communities or environmental effects such as variable temperature or spatial structure into account. Third, it assumes that populations are homogeneous. Fourth, interaction strengths are supposed not to change over time. Fifth, interaction strengths are assumed to be additive, which implies that the gLV model does not differentiate well between an interaction that is crucial for survival of a species and one that merely boosts its growth. Finally, interactions are assumed to be bilinear, that is, the growth rate of a species will change proportionally to the abundance of its interaction partner. However, in the real world, the growth rates often saturate. For instance, when the concentration of a substrate produced by a cross-feeding partner is very high, the enzyme activity rather than the substrate concentration limits the rates of biochemical reactions. At that point, a higher abundance (implying a higher substrate concentration) will no longer benefit the interaction partner.
These limitations likely explain why not all experimental time series are well described by the gLV model. For instance, some time series show greater variability or even a shift to another community state after a short (transient) perturbation (e.g. [35]). However, when using the gLV to simulate the response of a large community at steady state to a transient perturbation that does not kill any of the species, the community typically returns monotonously (i.e. without fluctuations) to its initial state (Box 1 and Figure 3a).
A number of extensions have been proposed to overcome these limitations. Momeni and colleagues proposed adaptations of the gLV equations to better capture crossfeeding [36 ]. More precisely, they suggest modifications to make the gLV model consistent with mechanistic models that explicitly describe the interaction mechanism through exchanges of chemical compounds. Haegeman and Loreau added an immigration term to the gLV [37], whereas Stein and colleagues included a term to model the response to an antibiotic perturbation [8]. Dam and colleagues take the impact of the environment on lake communities into account by multiplying growth rates with environmental variables [38 ]. In contrast to gLV-based models, individual-based modeling (IBM) describes communities at the level of individuals instead of populations, which allows to relax the assumption of homogeneous populations and to take spatial structure into account (e.g. [31,39]).
Although sets of parameter values leading to complex dynamics such as chaos (characterized by an apparent erratic dynamics and a sensitivity on initial conditions) can be found in small-scale systems (e.g. [40,41]), those behaviors occur very rarely in arbitrarily parameterized large-scale systems. Moreover, the presence of multi-stability (the existence of more than one stable state where all species coexist in the same conditions) was never reported in the standard gLV model. However, when relaxing the linearity assumption by introducing non-linear growth functions, the gLV can simulate higher variability after a perturbation (Figure 3b). Another variant of the gLV model with multiplicative interaction terms can be parameterized to exhibit multistability and perturbation-induced switches between alternative stable states (Figure 3c) [42].
While the non-linear variants of gLV capture complex behavior better than the classical gLV, they also come with more parameters and are therefore harder to fit to the data and to analyse. For instance, to model the interaction mechanism on the molecular level, knowledge about exchanged metabolites and their kinetic parameters is required, which    Concepts of stability and the impact of community properties on stability. Stability is often considered to be critical for microbial community function, particularly in host-associated microbiotas [68]. (a) To understand the factors that drive community stability one must first define 'stability' of an ecosystem. We review different definitions of stability in Box 1, and illustrate them here. In dynamical system theory a commonly considered metric is linear asymptotic stability, which asks whether a community will return to its original steady state following a small perturbation. Other definitions such as permanence or persistence require that a community will always maintain the species it started with. Temporal stability quantifies variability across time. Finally, a community is more structurally stable than another if its species coexist for a greater range of parameter values. (b) The species number and the connectivity (density) of the interactions affect the empirical persistence of the community, calculated here as the fraction of the initial species in a community that survive throughout a gLV simulation. The random interaction strengths are drawn from a Normal distribution of mean 0 and standard deviation 0.2. The survival rate is greater when there are fewer species or less densely connected species. The black line indicates the theoretical transition to instable communities [18,65]. The color code reflects the proportion of surviving species. (c) The ratio of cooperative versus competitive interactions affects the (linear asymptotic) stability of the non-trivial steady state (associated to the coexistence of all species). Here we gradually increase the number of cooperative interactions within an otherwise competitive host-associated community. Stability, here quantified as Àl (where l is the leading eigenvalue of the matrix representing all pair-wise interactions), decreases with increasing cooperation [31].
can be derived from metabolite concentrations followed over time. In general, when selecting a model, one must strike a balance between realism and the ability to systematically and comprehensively analyse the system of interest.
The gLV and its variants illustrate how rich and complex dynamical behavior may emerge from simple pairwise interactions between community members. If species interactions are not relevant for community dynamics in a particular community, the neutral model [43,44] may be considered. The neutral model assumes that the species are ecologically equivalent in terms of fitness, and the observed fluctuations in species abundances can be explained by purely stochastic variation. In the neutral model, interactions with other species are not needed to characterize changes in species abundance, and the dynamics of each species is governed essentially by stochastic variation proportional to species abundance and exchanges with a meta-population. The neutral model can be used as a null model for testing alternative models of community variation or as an approximation for community dynamics; it can thus serve to study the relative importance of various potential ecological mechanisms in driving community variation [45]. Most of the available implementations for testing the neutral hypothesis have been designed for cross-sectional observations (e.g. [46]), but a novel covariance-based method for longitudinal time series was proposed recently [47].
Whereas species variation in the neutral model is stochastic, the gLV model is deterministic, that is, the future community state is entirely determined by the preceding community state. We will next discuss ways to account for stochastic behavior.  [42]. In this simulation, the perturbation induces a switch to another community state that remains after the end of the perturbation. (d) Principle of multi-stability: A perturbation that increases a control parameter such as a growth rate beyond a tipping point (black circle) induces a community switch. The original community state can only be restored when the control parameter is lowered below another tipping point. If the control parameter is only lowered to its original value, the alternative state is preserved (hysteresis). Details about the simulations are given in the supplement.

Accounting for uncertainty
Uncontrolled biological and technical variation is inherently present in experimental community data. This can pose considerable challenges for modeling. Often only a subset of the relevant variables can be directly observed, and a limited sample size provides insufficient information of the underlying generative processes. Stochastic factors and measurement errors add uncertainty to modeling and inference. Explicitly modeling the uncertainties can lead to more accurate inference and predictions when the model assumptions, such as symmetry or overdispersion, are valid. For instance, if symmetric noise models are used for skewed data, parameter estimates can be biased [48,49]. Furthermore, certain models, such as the Gamma-Poisson (negative binomial) or Dirichlet-Multinomial [15], can perform model averaging, which can also improve robustness and predictive performance [50]. Hence, although modeling of uncertainties is often seen as a tool for obtaining more accurate confidence intervals, it can be critical for unbiased modeling and predictions.
Stochastic differential equation (SDE) models constitute a class of differential equation-based models that integrate stochastic processes into the deterministic description. Stochastic processes provide the means to characterize richer stochastic variation, where the values are not independent but exhibit a specific dependency structure over time. In SDE models, one or more of the terms are stochastic processes. The Wiener process is a common example, which provides a model for Brownian motion [51,52]. The values of a Wiener process are not independent, but the difference between two time points is normally distributed with a variance that increases linearly with time. Other examples of stochastic processes include the mean-reverting Ornstein-Uhlenbeck Process, which tends to return towards a mean value [53] and the self-exciting Hawkes Process, where the occurrence of an event will temporarily increase the chance of further events [54]. Stochastic processes provide tools to characterize structured variation in complex ecological systems where some of the underlying mechanisms may be unknown, and can hence provide useful extensions to purely mechanistic models. Their application has a long tradition in physics and finance, and provides new opportunities for ecological modeling.
Recent applications include models for microbial extinction events [52] and migration [55]. The study by Schroeder et al. [55] first employs a discrete model that combines selection and migration to describe the growth of small populations in drinking water pipes, and then shows how a continuous stochastic process model naturally follows at the limit of large sample sizes.
Further uncertainties arise from limited sample numbers and measurement errors. The Bayesian probabilistic framework provides tools to incorporate prior information in the models and address uncertainties in modeling and inference [56], and it has been applied for instance in the modeling of influenza transmission [57]. A Gaussian Process regression approach was recently proposed for non-parametric modeling of microbial growth curves under environmental perturbations [58 ]. Interestingly, this approach allows to quantify the differential effects of environmental perturbations on microbial growth across a large compendium of genotypes in archaea and yeast and to identify transcriptional regulators of microorganism growth under standard and stress conditions. The application of SDEs and Bayesian inference in microbial ecology remains limited, partially due to the increased efforts that may be required to construct, apply, and interpret such models. Nevertheless, new practical tools for Bayesian inference in Lotka-Volterra and other standard ODE models have recently become available [59]. This provides promising opportunities not only for modeling the uncertainties, but also for incorporating prior information and joint modeling of multiple information sources based on hierarchical latent variable models [60]. Such extensions will, however, come with additional challenges in model formulation and parameterization, and increased computational cost.

Conclusions/outlook
Deterministic and stochastic dynamical models provide rigorous quantitative frameworks to characterize the rich and complex patterns of variation observed in microbial communities. However, a number of challenges remain to be solved to improve microbial community models. For small-scale communities, combining models of species interactions and dynamics with information about their metabolism can improve the prediction accuracy (e.g. [61,62]). Likewise, incorporating information on spatial structure and individual variation can improve the models, as these are important factors shaping community dynamics (e.g. [39,63,64]). However, it is not easy to scale up these models to describe larger communities. Furthermore, microbial communities are not isolated systems; immigration from surrounding communities as well as the environment are important factors with potentially drastic influence on the model performance. Finally, the community members themselves, and their mutual interactions, may evolve over time and alter the environment they inhabit. Hence, a variety of challenges remain to be solved to arrive at more realistic, predictive, and practically applicable community models.

36.
Momeni B, Xie L, Shou W: Lotka-Volterra pairwise modeling fails to capture diverse pairwise microbial interactions. eLIFE 2017, 6:e25051. Momeni and colleagues systematically compare the gLV models of pairwise interactions with corresponding mechanistic models, where compounds mediating the interaction are taken into account explicitly. They discuss in-depth the conditions in which the gLV models fail to describe the dynamics of pair-wise interactions correctly.

38.
Dam P, Fonseca LL, Konstantinidis KT, Voit EO: Dynamic models of the complex microbial metapopulation of lake mendota. npj Syst Biol Appl 2016, 2:16007. Dam and colleagues explore the dynamics of lake communities with a modified gLV model that takes environmental conditions into account. Their model, parameterized directly from the time series, captures well the observed dynamics of taxa grouped together according to their peak abundance time. While most interactions between taxon groups are negative, the effect of environmental conditions was found to be mostly positive. The authors also explore how the inferred interaction networks change across seasons.