Trends in Ecology & Evolution
Review
So Many Variables: Joint Modeling in Community Ecology
Section snippets
A New Phase for Community Modeling in Ecology
Many of the questions posed in ecology require the consideration of abundance (see Glossary; this includes presence/absence) collected simultaneously across multiple taxonomic groups, for example species. The abundances in different taxa typically form the response variables in a multivariate analysis and are analyzed for several different goals. Recent examples include: to study the impact of experimental removal of invasive crayfish on macroinvertebrate communities [1], to find taxa that can act
Joint Models for Abundance
The methods described in this paper are all extensions of the generalized linear model (GLM) [19], widely used to model abundance (e.g., [20–22]). A joint model necessarily requires the inclusion of random effects, hence some form of mixed model [23], to capture correlation in abundance across taxa. There are several ways to proceed, and a key issue to consider is the level of complexity in the model. A balance needs to be found between using a sufficiently simple model that its parameters
Modeling Residual Correlation Between Taxa
An important application of joint models is in estimating the correlation between taxa that arises for reasons not attributable to the measured predictors included in the model. Such correlation could be due to biotic interactions such as competition and facilitation, although the exact type of biotic interaction cannot be inferred from co-occurrence [4,54]. It could also be due to joint response to unmeasured predictors, or to other forms of misspecification of the mean model [26].
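As a minimal sketch of how residual correlation between taxa can be represented, suppose each taxon has a vector of loadings on a small number of latent variables, plus independent unit-variance noise. Under that assumption the implied residual covariance is the loadings matrix times its transpose, plus the identity, and rescaling gives a residual correlation matrix. The function name, the unit-variance noise term, and the loading values below are illustrative assumptions and are not taken from the article.

```python
import numpy as np


def residual_correlation(loadings):
    """Residual correlation between taxa implied by latent-variable loadings.

    loadings: array of shape (n_taxa, n_latent), one row per taxon.
    Assumes independent unit-variance noise for each taxon (illustrative choice).
    """
    loadings = np.asarray(loadings, dtype=float)
    # Covariance induced by the shared latent variables plus taxon-specific noise.
    cov = loadings @ loadings.T + np.eye(loadings.shape[0])
    # Rescale covariance to correlation.
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)


# Hypothetical loadings for three taxa on two latent variables.
example_loadings = np.array([
    [1.0, 0.0],
    [0.8, 0.3],
    [-0.9, 0.2],
])
corr = residual_correlation(example_loadings)
print(np.round(corr, 2))
```

In this toy setup, taxa whose loadings point in similar directions end up positively correlated and taxa with opposing loadings end up negatively correlated. As the text notes, such a correlation alone does not identify the mechanism behind it.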
If the number
Model-Based Ordination
By treating latent variables as ordination axes, an LVM (commonly with two latent variables) can be understood as a model-based approach to unconstrained ordination [32,33]. A model-based approach to ordination offers several advantages over traditional ordination methods. For example, models can be used to account for important (and otherwise spurious) data properties such as the mean–variance relationship [56]. Model selection and residual analysis tools can be used to verify key aspects of a
Multivariate Inferences about Predictors
Joint models, whether GLMMs or LVMs, can be used to make multivariate inferences about the effect of the predictor variables x_i, while accounting for any residual correlation between taxa. Accounting for correlation between taxa, and doing so in a flexible way, is important to ensure that inferences made jointly across multiple taxa are statistically valid. Two examples of this are when studying how well species traits explain interspecific variation in environmental response (Box 2) and when
Accounting for Missing Predictors
While diagnostic tools can be used to check assumptions, one can never be sure that all assumptions in the mean model are correct, and some violations remain hard to detect. One or more important predictors could be missing from the study, or perhaps the form in which the measured predictors enter the model is incorrect (e.g., assuming a quadratic response when the true response is more complex). The statistical term for such failures is ‘misspecification’ of the mean model [66]. Fortunately,
Improving Predictions
When predicting abundance across a set of correlated taxa, joint models could improve predictive performance even if the model were correctly specified.
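To illustrate how correlation between taxa can carry predictive information, the sketch below uses the standard conditional-mean formula for a multivariate normal distribution. It should be read as a simplified stand-in on a Gaussian scale, not as the prediction method of any particular joint model. The function name and the numbers are hypothetical.

```python
import numpy as np


def conditional_mean(mu, cov, observed_idx, observed_values, target_idx):
    """E[y_target | y_observed] for a multivariate normal with mean mu and covariance cov."""
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    # Covariance between target and observed taxa, and among observed taxa.
    s_to = cov[np.ix_(target_idx, observed_idx)]
    s_oo = cov[np.ix_(observed_idx, observed_idx)]
    # How far the observed taxa sit from their marginal means.
    resid = np.asarray(observed_values, dtype=float) - mu[observed_idx]
    return mu[target_idx] + s_to @ np.linalg.solve(s_oo, resid)


# Two hypothetical taxa with correlation 0.6 on a Gaussian scale.
mu = [2.0, 1.0]
cov = [[1.0, 0.6],
       [0.6, 1.0]]

# Taxon 1 is observed one unit above its mean; predict taxon 0.
pred = conditional_mean(mu, cov, observed_idx=[1], observed_values=[2.0], target_idx=[0])
print(pred)
```

With zero correlation the conditional prediction would simply equal the marginal mean of the target taxon, so any gain here comes entirely from the off-diagonal term of the covariance matrix.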
Joint models have a particular advantage for in-sample prediction because they can make use of correlations across taxa, which contain information useful for predicting abundance of one taxon from others. For example, when using an LVM, if predictions are made on the same samples that were used to fit the joint model, then one can condition on
Concluding Remarks
Joint models are flexible tools with exciting potential for application in ecology, especially community ecology, where the number of taxa is rarely small compared to the number of samples. In such instances a latent variable approach can be used for a range of purposes, as discussed here, although this list is by no means exhaustive.
Both multivariate GLMMs and LVMs can be understood as special types of mixed effects models designed for multivariate data. Hence they can be used for much the
Acknowledgments
D.I.W. was supported by an Australian Research Council Future Fellowship (FT120100501) and an Australian Academy of Science travel grant. B.O’H. was supported by a LOEWE (Landes-Offensive zur Entwicklung Wissenschaftlich-ökonomischer Exzellenz) initiative of the Hessian Ministry for Science and the Arts. O.O. and S.T. were supported by Academy of Finland grants 250444 and 251965, respectively. F.K.C.H. was supported by Australian Research Council discovery project grant DP140101259. We thank the
Glossary
- Abundance: the extent to which a type of organism is present in a sample unit, measured either as a count, biomass, % cover, a factor with ordered levels, or presence/absence.
- Continuous variable: a variable that can take any value within some interval (cf. discrete variable). Abundance is rarely continuous, complicating the modeling process.
- Discrete variable: a variable that can take one of a countable number of distinct values. Abundance is often discrete, for example counts could be 0, 1, 2, 3,...
References (92)
DNA barcoding for ecologists
Trends Ecol. Evol.
(2009)
From barcoding single individuals to metabarcoding biological communities: towards an integrative approach to the study of global biodiversity
Trends Ecol. Evol.
(2014)
A theory of gradient analysis
Adv. Ecol. Res.
(1988)
Generalized linear mixed models: a practical guide for ecology and evolution
Trends Ecol. Evol.
(2009)
Inferring biotic interactions from proxies
Trends Ecol. Evol.
(2015)
Mvabund – an R package for model-based analysis of multivariate abundance data
Methods Ecol. Evol.
(2012)
Model based grouping of species across environmental gradients
Ecol. Model.
(2011)
Rebuilding community ecology from functional traits
Trends Ecol. Evol.
(2006)
Intensive removal of signal crayfish (Pacifastacus leniusculus) from rivers increases numbers and taxon richness of macroinvertebrate species
Ecol. Evol.
(2014)
Selective-logging and oil palm: multitaxon impacts, biodiversity indicators, and trade-offs for conservation planning
Ecol. Appl.
(2014)
A continental-scale analysis of feral cat diet in Australia
J. Biogeogr.
Microbial interactions: from networks to models
Nat. Rev. Microbiol.
Numerical Ecology
Penalized normal likelihood and ridge regularization of correlation and covariance matrices
J. Am. Stat. Assoc.
Model-based thinking for community ecology
Plant Ecol.
Multivariate Analysis in Community Ecology
Relating behavior to habitat: solutions to the fourth-corner problem
Ecology
A new method for non-parametric multivariate analysis of variance
Aust. Ecol.
Spatial modelling of biodiversity at the community level
J. Appl. Ecol.
Predicting species distributions from museum and herbarium records using multiresponse models fitted with multivariate adaptive regression splines
Divers. Distrib.
SESAM: a new framework integrating macroecological and species distribution models for predicting spatio-temporal patterns of species assemblages
J. Biogeogr.
Mapping beta diversity from space: Sparse generalised dissimilarity modelling (SGDM) for analysing high-dimensional data
Methods Ecol. Evol.
Accounting for uncertainty in ecological analysis: the strengths and limitations of hierarchical statistical modeling
Ecol. Appl.
An Introduction to Generalized Linear Models
Do not log-transform count data
Methods Ecol. Evol.
The arcsine is asinine: the analysis of proportions in ecology
Ecology
Ecotoxicology is not normal
Environ. Sci. Pollut. Res.
Selecting traits that explain species–environment relationships: a generalized linear mixed model approach
J. Vegetation Sci.
Modeling species co-occurrence by multivariate logistic regression generates new hypotheses on fungal interactions
Ecology
Towards novel approaches to modelling biotic interactions in multispecies assemblages at large spatial extents
J. Biogeogr.
Understanding co-occurrence by modelling species simultaneously with a Joint Species Distribution Model (JSDM)
Methods Ecol. Evol.
More than the sum of the parts: forest climate response from Joint Species Distribution Models
Ecol. Appl.
Generalized Latent Variable Modeling: Multilevel, Longitudinal, and Structural Equation Models
Latent Variable Models and Factor Analysis: A Unified Approach
Computational analysis of microarray data
Nat. Rev. Genet.
Random-effects ordination: describing and predicting multivariate correlations and co-occurrences
Ecol. Monogr.
Model-based approaches to unconstrained ordination
Methods Ecol. Evol.
Objective methods for the classification of vegetation. III. An essay in the use of factor analysis
Aust. J. Bot.
Data Analysis in Community and Landscape Ecology
Fine-scale hydrological niche differentiation through the lens of multi-species co-occurrence models
J. Ecol.
Structural Equation Modeling and Natural Systems
Relative roles of ecological and energetic constraints, diversification rates and region history on global species richness gradients
Ecol. Lett.
Sparse Bayesian infinite factor models
Biometrika
Longitudinal data analysis using generalized linear models
Biometrika
Regularized sandwich estimators for analysis of high-dimensional data using generalized estimating equations
Biometrics
Generalized Estimating Equations (GEE)