Abstract
A statistical challenge in community ecology is to identify segregated and aggregated pairs of species from a binary presence–absence matrix, which often contains hundreds or thousands of such potential pairs. A similar challenge is found in genomics and proteomics, where the expression of thousands of genes in microarrays must be statistically analyzed. Here we adapt the empirical Bayes method to identify statistically significant species pairs in a binary presence–absence matrix. We evaluated the performance of a simple confidence interval, a sequential Bonferroni test, and two tests based on the mean and the confidence interval of an empirical Bayes method. Observed patterns were compared to patterns generated from null model randomizations that preserved matrix row and column totals. We evaluated these four methods with random matrices and also with random matrices that had been seeded with an additional segregated or aggregated species pair. The Bayes methods and Bonferroni corrections reduced the frequency of false-positive tests (type I error) in random matrices, but did not always correctly identify the non-random pair in a seeded matrix (type II error). All of the methods were vulnerable to identifying spurious secondary associations in the seeded matrices. When applied to a set of 272 published presence–absence matrices, even the most conservative tests indicated a fourfold increase in the frequency of perfectly segregated “checkerboard” species pairs compared to the null expectation, and a greater predominance of segregated versus aggregated species pairs. The tests did not reveal a large number of significant species pairs in the Vanuatu bird matrix, but in the much smaller Galapagos bird matrix they correctly identified a concentration of segregated species pairs in the genus Geospiza. 
The Bayesian methods provide for increased selectivity in identifying non-random species pairs, but the analyses will be most powerful if investigators can use a priori biological criteria to identify potential sets of interacting species.
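As an illustration of the null model described above, the following minimal Python sketch randomizes a presence–absence matrix while preserving row and column totals, then counts perfectly segregated pairs. It is not the authors' implementation and does not include the empirical Bayes step. The function names, the swap-based randomization, the convention that rows are species and columns are sites, and the definition of a checkerboard pair as two species that each occur somewhere but never share a site are all assumptions made for this example.

```python
import random


def swap_randomize(matrix, n_swaps, rng):
    """Return a shuffled copy of a 0/1 matrix (rows = species, columns = sites).

    Each accepted swap flips a 2x2 submatrix [[1, 0], [0, 1]] <-> [[0, 1], [1, 0]],
    which leaves every row total and every column total unchanged.
    """
    m = [row[:] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    for _ in range(n_swaps):
        r1, r2 = rng.sample(range(n_rows), 2)
        c1, c2 = rng.sample(range(n_cols), 2)
        a, b = m[r1][c1], m[r1][c2]
        c, d = m[r2][c1], m[r2][c2]
        # Only a diagonal ("checkerboard") submatrix can be flipped safely.
        if a == d and b == c and a != b:
            m[r1][c1], m[r1][c2] = b, a
            m[r2][c1], m[r2][c2] = a, b
    return m


def count_checkerboard_pairs(matrix):
    """Count species pairs that each occur at least once but never co-occur."""
    count = 0
    for i in range(len(matrix)):
        for j in range(i + 1, len(matrix)):
            both_present = sum(matrix[i]) > 0 and sum(matrix[j]) > 0
            co_occur = any(x and y for x, y in zip(matrix[i], matrix[j]))
            if both_present and not co_occur:
                count += 1
    return count


def null_mean_checkerboards(matrix, n_null=100, n_swaps=1000, seed=0):
    """Mean checkerboard-pair count over n_null randomized matrices."""
    rng = random.Random(seed)
    counts = [
        count_checkerboard_pairs(swap_randomize(matrix, n_swaps, rng))
        for _ in range(n_null)
    ]
    return sum(counts) / len(counts)


if __name__ == "__main__":
    observed = [
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [1, 1, 0, 0],
    ]
    print("observed checkerboard pairs:", count_checkerboard_pairs(observed))
    print("null expectation:", null_mean_checkerboards(observed))
```

Comparing the observed count with the null mean mirrors, in a very reduced form, the comparison of observed and expected checkerboard frequencies reported in the abstract. The number of swaps and null matrices here are arbitrary illustrative values.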
References
Abbott I, Black R (1980) Changes in species composition of floras on islets near Perth, Western Australia. J Biogeogr 7:399–410
Atmar W, Patterson BD (1995) The nestedness temperature calculator: a visual basic program, including 294 presence absence matrices. AICS Research Incorporated and The Field Museum. http://www.aics-research.com/nestedness/tempcalc.html
Bacallado JJ (1976) Notas sobre la distribucion y evolucion de la avifauna Canaria. In: Kunkel G (ed) Biogeography and ecology in the Canary Islands. Junk, The Hague, pp 13–431
Beard JS (1948) The natural vegetation of the Windward and Leeward Islands. Oxford For Mem 21:1–192
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 57:289–300
Benjamini Y, Yekutieli D (2001) The control of the false discovery rate in multiple testing under dependency. Ann Stat 29:1165–1188
Brown JH, Kurzius MA (1987) Composition of desert rodent faunas: combinations of coexisting species. Ann Zool Fenn 24:227–237
Burnham KP, Anderson DR (2002) Model selection and inference: a practical information-theoretic approach. Springer, New York
Burns KC (2007) Patterns in the assembly of an island plant community. J Biogeogr 34:760–768
Cameron RAD (1992) Land snail faunas of the Napier and Oscar Ranges, Western Australia; diversity, distribution and speciation. Biol J Linn Soc 45:271–286
Colwell RK, Winkler DW (1984) A null model for null models in biogeography. In: Strong DR Jr, Simberloff D, Abele LG, Thistle AB (eds) Ecological communities: conceptual issues and the evidence. Princeton University Press, Princeton, pp 344–359
Connor EF, Simberloff D (1979) The assembly of species communities: chance or competition? Ecology 60:1132–1140
Crowe TM (1979) Lots of weeds. J Biogeogr 6:169–181
Descimon H (1986) Origins of Lepidopteran faunas in the high tropical Andes. In: Vuilleumier F, Monasterio M (eds) High altitude tropical biogeography. Oxford University Press, Oxford, pp 500–532
Diamond JM (1975) Assembly of species communities. In: Cody ML, Diamond JM (eds) Ecology and evolution of communities. Harvard University Press, Cambridge, pp 342–444
Diamond JM, Gilpin ME (1982) Examination of the “null” model of Connor and Simberloff for species co-occurrences on islands. Oecologia 52:64–74
Diamond JM, Marshall AG (1976) Origin of the New Hebridean avifauna. Emu 76:187–200
Efron B (2005) Bayesians, frequentists, and scientists. J Am Stat Assoc 100:1–5
Gotelli NJ (2000) Null model analysis of species co-occurrence patterns. Ecology 81:2606–2621
Gotelli NJ (2001) Research frontiers in null model analysis. Glob Ecol Biogeogr 10:337–343
Gotelli NJ, Ellison AM (2004) A primer of ecological statistics. Sinauer, Sunderland
Gotelli NJ, Entsminger GL (2001) Swap and fill algorithms in null model analysis: rethinking the knight’s tour. Oecologia 129:281–291
Gotelli NJ, Entsminger GL (2003) Swap algorithms in null model analysis. Ecology 84:532–535
Gotelli NJ, Graves GR (1996) Null models in ecology. Smithsonian Institution Press, Washington
Gotelli NJ, McCabe DJ (2002) Species co-occurrence: a meta-analysis of J. M. Diamond’s assembly rules model. Ecology 83:2091–2096
Haila Y, Järvinen O, Väisänen RA (1980) Habitat distributions and species associations of land bird populations on the Åland Islands, SW Finland. Ann Zool Fenn 17:87–106
Hatt RT, Van Tyne J, Stuart LC, Pope CH, Grobman AB (1948) Island life: a study of the land vertebrates of the islands of eastern Lake Michigan. Cranbrook Institute of Science, Bloomfield Hills
Higgins CL, Willig MR, Strauss RE (2006) The role of stochastic processes in producing nested patterns of species distributions. Oikos 114:159–167
Hocutt CH, Denoncourt RF, Stauffer JR (1978) Fishes of the Greenbrier River, West Virginia, with drainage history of the Central Appalachians. J Biogeogr 5:59–80
Kammenga JE, Herman MA, Ouborg NJ, Johnson L, Breitling R (2007) Microarray challenges in ecology. Trends Ecol Evol 22:273–279
Lehsten V, Harmand P (2006) Null models for species co-occurrence patterns: assessing bias and minimum iteration number for the sequential swap. Ecography 29:786–792
Manly BFJ (1991) Randomization and Monte Carlo methods in biology. Chapman and Hall, London
Manly BFJ (1995) A note on the analysis of species co-occurrences. Ecology 76:1109–1115
May RM (1975) Patterns of species abundance and diversity. In: Cody ML, Diamond JM (eds) Ecology and evolution of communities. Harvard University Press, Cambridge, pp 81–120
McCoy ED, Heck KL (1987) Some observations on the use of taxonomic similarity in large-scale biogeography. J Biogeogr 14:79–87
Miklós I, Podani J (2004) Randomization of presence–absence matrices: comments and new algorithms. Ecology 85:86–92
Moran MD (2003) Arguments for rejecting the sequential Bonferroni in ecological studies. Oikos 100:403–405
Murphy RW (1983) The reptiles: origins and evolution. In: Case TJ, Cody ML (eds) Island biogeography in the Sea of Cortez. University of California Press, Berkeley, pp 130–158
Patterson BD (1987) The principle of nested subsets and its implications for biological conservation. Conserv Biol 1:323–334
Patterson BD, Atmar W (1986) Nested subsets and the structure of insular mammalian faunas and archipelagos. In: Heaney LR, Patterson BD (eds) Island biogeography of mammals. Academic Press, London, pp 65–82
Patterson BD, Pacheco V, Solari S (1996) Distributions of bats along an elevational gradient in the Andes of south-eastern Peru. J Zool 240:637–658
Sanderson JG (2000) Testing ecological patterns. Am Sci 88:332–339
Sanderson JG (2004) Null model analysis of communities on gradients. J Biogeogr 31:879–883
Schluter D, Grant PR (1984) Determinants of morphological patterns in communities of Darwin’s finches. Am Nat 123:175–196
Sfenthourakis S, Giokas S, Tzanatos E (2004) From sampling stations to archipelagos: investigating aspects of the assemblage of insular biota. Glob Ecol Biogeogr 13:23–35
Sfenthourakis S, Tzanatos E, Giokas S (2006) Species co-occurrence: the case of congeneric species and a causal approach to patterns of species association. Glob Ecol Biogeogr 15:39–49
Shipley B (2002) Cause and correlation in biology: a user’s guide to path analysis, structural equations and causal inference. Cambridge University Press, Cambridge
Simberloff D, Connor EF (1979) Q-mode and R-mode analyses of biogeographic distributions: null hypotheses based on random colonization. In: Patil GP, Rosenzweig ML (eds) Contemporary quantitative ecology and related ecometrics. International Cooperative Publishing House, Fairland, pp 123–138
Simberloff D, Connor EF (1981) Missing species combinations. Am Nat 118:215–239
Springer VG (1982) Pacific plate biogeography, with special reference to shorefishes. Smithsonian Institution, Washington
Stone L, Roberts A (1990) The checkerboard score and species distributions. Oecologia 85:74–79
Sutherland JP, Karlson RH (1977) Development and stability of the fouling community at Beaufort, North Carolina. Ecol Monogr 47:425–446
Ulrich W (2004) Species co-occurrences and neutral models: reassessing J. M. Diamond’s assembly rules. Oikos 107:603–609
Ulrich W (2008) Pairs—a FORTRAN program for studying pair-wise species associations in ecological matrices. www.uni.torun.pl/~ulrichw
Ulrich W, Gotelli NJ (2007a) Null model analysis of species nestedness patterns. Ecology 88:1824–1831
Ulrich W, Gotelli NJ (2007b) Disentangling community patterns of nestedness and species co-occurrence. Oikos 116:2053–2061
Ulrich W, Zalewski M (2006) Abundance and co-occurrence patterns of core and satellite species of ground beetles on small lake islands. Oikos 114:338–348
Veech JA (2006) A probability-based analysis of temporal and spatial co-occurrence in grassland birds. J Biogeogr 33:2145–2153
Zaman A, Simberloff D (2002) Random binary matrices in biogeographical ecology—instituting a good neighbor policy. Environ Ecol Stat 9:405–421
Acknowledgments
This work was supported in part by a grant from the Polish Ministry of Science to W. U. (KBN, 2 P04F 039 29). N. J. G. was supported by NSF grant 0541936. We thank Aaron Ellison for calling our attention to Efron (2005). The manuscript was improved by comments from two anonymous reviewers and associate editor W. M. Mooij.
Additional information
Communicated by Wolf Mooij.
Cite this article
Gotelli, N.J., Ulrich, W. The empirical Bayes approach as a tool to identify non-random species associations. Oecologia 162, 463–477 (2010). https://doi.org/10.1007/s00442-009-1474-y
DOI: https://doi.org/10.1007/s00442-009-1474-y