The empirical Bayes approach as a tool to identify non-random species associations

  • Community ecology - Original Paper
Oecologia

Abstract

A statistical challenge in community ecology is to identify segregated and aggregated pairs of species from a binary presence–absence matrix, which often contains hundreds or thousands of such potential pairs. A similar challenge is found in genomics and proteomics, where the expression of thousands of genes in microarrays must be statistically analyzed. Here we adapt the empirical Bayes method to identify statistically significant species pairs in a binary presence–absence matrix. We evaluated the performance of a simple confidence interval, a sequential Bonferroni test, and two tests based on the mean and the confidence interval of an empirical Bayes method. Observed patterns were compared to patterns generated from null model randomizations that preserved matrix row and column totals. We evaluated these four methods with random matrices and also with random matrices that had been seeded with an additional segregated or aggregated species pair. The Bayes methods and Bonferroni corrections reduced the frequency of false-positive tests (type I error) in random matrices, but did not always correctly identify the non-random pair in a seeded matrix (type II error). All of the methods were vulnerable to identifying spurious secondary associations in the seeded matrices. When applied to a set of 272 published presence–absence matrices, even the most conservative tests indicated a fourfold increase in the frequency of perfectly segregated “checkerboard” species pairs compared to the null expectation, and a greater predominance of segregated versus aggregated species pairs. The tests did not reveal a large number of significant species pairs in the Vanuatu bird matrix, but in the much smaller Galapagos bird matrix they correctly identified a concentration of segregated species pairs in the genus Geospiza. 
The Bayesian methods provide for increased selectivity in identifying non-random species pairs, but the analyses will be most powerful if investigators can use a priori biological criteria to identify potential sets of interacting species.
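The pairwise null model test described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes rows are species and columns are sites, uses the checkerboard units of Stone and Roberts (1990) as the pairwise score, randomizes with simple 2 x 2 swaps that preserve row and column totals, and applies a Holm-style step-down as the sequential Bonferroni test. The function names, the number of swaps, and the number of null matrices are hypothetical choices made for illustration, and the empirical Bayes step itself is not reproduced.

```python
import random
from itertools import combinations


def checkerboard_units(row_a, row_b):
    """Checkerboard units for one species pair: (r_a - s) * (r_b - s)."""
    shared = sum(1 for a, b in zip(row_a, row_b) if a and b)
    return (sum(row_a) - shared) * (sum(row_b) - shared)


def swap_randomize(matrix, n_swaps, rng):
    """Return a shuffled copy; every swap keeps row and column totals fixed."""
    m = [row[:] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    for _ in range(n_swaps):
        r1, r2 = rng.sample(range(n_rows), 2)
        c1, c2 = rng.sample(range(n_cols), 2)
        # Swap only when the 2 x 2 submatrix is a checkerboard (10/01 or 01/10).
        if (m[r1][c1] == m[r2][c2] and m[r1][c2] == m[r2][c1]
                and m[r1][c1] != m[r1][c2]):
            m[r1][c1], m[r1][c2] = m[r1][c2], m[r1][c1]
            m[r2][c1], m[r2][c2] = m[r2][c2], m[r2][c1]
    return m


def pairwise_p_values(matrix, n_null=200, n_swaps=1000, seed=0):
    """Upper-tail p-value per species pair (a high score suggests segregation)."""
    rng = random.Random(seed)
    pairs = list(combinations(range(len(matrix)), 2))
    observed = {(i, j): checkerboard_units(matrix[i], matrix[j]) for i, j in pairs}
    at_least = dict.fromkeys(pairs, 0)
    for _ in range(n_null):
        null = swap_randomize(matrix, n_swaps, rng)
        for i, j in pairs:
            if checkerboard_units(null[i], null[j]) >= observed[(i, j)]:
                at_least[(i, j)] += 1
    # Add one to numerator and denominator so that no p-value is exactly zero.
    return {p: (at_least[p] + 1) / (n_null + 1) for p in pairs}


def holm_significant(p_values, alpha=0.05):
    """Holm step-down: compare the k-th smallest p-value with alpha / (m - k + 1)."""
    ordered = sorted(p_values.items(), key=lambda kv: kv[1])
    m = len(ordered)
    significant = []
    for rank, (pair, p) in enumerate(ordered):
        if p > alpha / (m - rank):
            break
        significant.append(pair)
    return significant
```

Calling `holm_significant(pairwise_p_values(matrix))` on a list-of-lists presence-absence matrix returns the species pairs that remain significantly segregated after the correction. With hundreds of pairs and only a few hundred null matrices, the smallest attainable p-value may exceed the corrected threshold, so the number of randomizations should grow with the number of pairs tested.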

References

  • Abbott I, Black R (1980) Changes in species composition of floras on islets near Perth, Western Australia. J Biogeogr 7:399–410

  • Atmar W, Patterson BD (1995) The nestedness temperature calculator: a Visual Basic program, including 294 presence–absence matrices. AICS Research Incorporated and The Field Museum. http://www.aics-research.com/nestedness/tempcalc.html

  • Bacallado JJ (1976) Notas sobre la distribución y evolución de la avifauna Canaria. In: Kunkel G (ed) Biogeography and ecology in the Canary Islands. Junk, The Hague, pp 413–431

  • Beard JS (1948) The natural vegetation of the Windward and Leeward Islands. Oxford For Mem 21:1–192

  • Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 57:289–300

  • Benjamini Y, Yekutieli D (2001) The control of the false discovery rate in multiple testing under dependency. Ann Stat 29:1165–1188

  • Brown JH, Kurzius MA (1987) Composition of desert rodent faunas: combinations of coexisting species. Ann Zool Fenn 24:227–237

  • Burnham KP, Anderson DR (2002) Model selection and inference: A practical information-theoretic approach. Springer, New York

  • Burns KC (2007) Patterns in the assembly of an island plant community. J Biogeogr 34:760–768

  • Cameron RAD (1992) Land snail faunas of the Napier and Oscar Ranges, Western Australia; diversity, distribution and speciation. Biol J Linn Soc 45:271–286

  • Colwell RK, Winkler DW (1984) A null model for null models in biogeography. In: Strong DR Jr, Simberloff D, Abele LG, Thistle AB (eds) Ecological communities: conceptual issues and the evidence. Princeton University Press, Princeton, pp 344–359

  • Connor EF, Simberloff D (1979) The assembly of species communities: chance or competition? Ecology 60:1132–1140

  • Crowe TM (1979) Lots of weeds. J Biogeogr 6:169–181

  • Descimon H (1986) Origins of Lepidopteran faunas in the high tropical Andes. In: Vuilleumier F, Monasterio M (eds) High altitude tropical biogeography. Oxford University Press, Oxford, pp 500–532

  • Diamond JM (1975) Assembly of species communities. In: Cody ML, Diamond JM (eds) Ecology and evolution of communities. Harvard University Press, Cambridge, pp 342–444

  • Diamond JM, Gilpin ME (1982) Examination of the “null” model of Connor and Simberloff for species co-occurrences on islands. Oecologia 52:64–74

  • Diamond JM, Marshall AG (1976) Origin of the New Hebridean avifauna. Emu 76:187–200

  • Efron B (2005) Bayesians, frequentists, and scientists. J Am Stat Assoc 100:1–5

  • Gotelli NJ (2000) Null model analysis of species co-occurrence patterns. Ecology 81:2606–2621

  • Gotelli NJ (2001) Research frontiers in null model analysis. Glob Ecol Biogeogr 10:337–343

  • Gotelli NJ, Ellison AM (2004) A primer of ecological statistics. Sinauer, Sunderland

  • Gotelli NJ, Entsminger GL (2001) Swap and fill algorithms in null model analysis: rethinking the knight’s tour. Oecologia 129:281–291

  • Gotelli NJ, Entsminger GL (2003) Swap algorithms in null model analysis. Ecology 84:532–535

  • Gotelli NJ, Graves GR (1996) Null models in ecology. Smithsonian Institution Press, Washington

  • Gotelli NJ, McCabe DJ (2002) Species co-occurrence: a meta-analysis of J. M. Diamond’s assembly rules model. Ecology 83:2091–2096

  • Haila Y, Järvinen O, Väisänen RA (1980) Habitat distributions and species associations of land bird populations on the Åland Islands, SW Finland. Ann Zool Fenn 17:87–106

  • Hatt RT, Van Tyne J, Stuart LC, Pope CH, Grobman AB (1948) Island life: a study of the land vertebrates of the islands of eastern Lake Michigan. Cranbrook Institute of Science, Bloomfield Hills

  • Higgins CL, Willig MR, Strauss RE (2006) The role of stochastic processes in producing nested patterns of species distributions. Oikos 114:159–167

  • Hocutt CH, Denoncourt RF, Stauffer JR (1978) Fishes of the Greenbrier River, West Virginia, with drainage history of the Central Appalachians. J Biogeogr 5:59–80

  • Kammenga JE, Herman MA, Ouborg NJ, Johnson L, Breitling R (2007) Microarray challenges in ecology. Trends Ecol Evol 22:273–279

  • Lehsten V, Harmand P (2006) Null models for species co-occurrence patterns: assessing bias and minimum iteration number for the sequential swap. Ecography 29:786–792

  • Manly BFJ (1991) Randomization and Monte Carlo methods in biology. Chapman and Hall, London

  • Manly BFJ (1995) A note on the analysis of species co-occurrences. Ecology 76:1109–1115

  • May RM (1975) Patterns of species abundance and diversity. In: Cody ML, Diamond JM (eds) Ecology and evolution of communities. Harvard University Press, Cambridge, pp 81–120

  • McCoy ED, Heck KL (1987) Some observations on the use of taxonomic similarity in large-scale biogeography. J Biogeogr 14:79–87

  • Miklós I, Podani J (2004) Randomization of presence–absence matrices: comments and new algorithms. Ecology 85:86–92

  • Moran MD (2003) Arguments for rejecting the sequential Bonferroni in ecological studies. Oikos 100:403–405

  • Murphy RW (1983) The reptiles: origins and evolution. In: Case TJ, Cody ML (eds) Island biogeography in the Sea of Cortez. University of California Press, Berkeley, pp 130–158

  • Patterson BD (1987) The principle of nested subsets and its implications for biological conservation. Conserv Biol 1:323–334

  • Patterson BD, Atmar W (1986) Nested subsets and the structure of insular mammalian faunas and archipelagos. In: Heaney LR, Patterson BD (eds) Island biogeography of mammals. Academic Press, London, pp 65–82

  • Patterson BD, Pacheco V, Solari S (1996) Distributions of bats along an elevational gradient in the Andes of south-eastern Peru. J Zool 240:637–658

  • Sanderson JG (2000) Testing ecological patterns. Am Sci 88:332–339

  • Sanderson JG (2004) Null model analysis of communities on gradients. J Biogeogr 31:879–883

  • Schluter D, Grant PR (1984) Determinants of morphological patterns in communities of Darwin’s finches. Am Nat 123:175–196

  • Sfenthourakis S, Giokas S, Tzanatos E (2004) From sampling stations to archipelagos: investigating aspects of the assemblage of insular biota. Glob Ecol Biogeogr 13:23–35

  • Sfenthourakis S, Tzanatos E, Giokas S (2006) Species co-occurrence: the case of congeneric species and a causal approach to patterns of species association. Glob Ecol Biogeogr 15:39–49

  • Shipley B (2002) Cause and correlation in biology: a user’s guide to path analysis, structural equations and causal inference. Cambridge University Press, Cambridge

  • Simberloff D, Connor EF (1979) Q-mode and R-mode analyses of biogeographic distributions: null hypotheses based on random colonization. In: Patil GP, Rosenzweig ML (eds) Contemporary quantitative ecology and related ecometrics. International Cooperative Publishing House, Fairland, pp 123–138

  • Simberloff D, Connor EF (1981) Missing species combinations. Am Nat 118:215–239

  • Springer VG (1982) Pacific plate biogeography, with special reference to shorefishes. Smithsonian Institution, Washington

  • Stone L, Roberts A (1990) The checkerboard score and species distributions. Oecologia 85:74–79

  • Sutherland JP, Karlson RH (1977) Development and stability of the fouling community at Beaufort, North Carolina. Ecol Monogr 47:425–446

  • Ulrich W (2004) Species co-occurrences and neutral models: reassessing J. M. Diamond’s assembly rules. Oikos 107:603–609

  • Ulrich W (2008) Pairs—a FORTRAN program for studying pair-wise species associations in ecological matrices. www.uni.torun.pl/~ulrichw

  • Ulrich W, Gotelli NJ (2007a) Null model analysis of species nestedness patterns. Ecology 88:1824–1831

  • Ulrich W, Gotelli NJ (2007b) Disentangling community patterns of nestedness and species co-occurrence. Oikos 116:2053–2061

  • Ulrich W, Zalewski M (2006) Abundance and co-occurrence patterns of core and satellite species of ground beetles on small lake islands. Oikos 114:338–348

  • Veech JA (2006) A probability-based analysis of temporal and spatial co-occurrence in grassland birds. J Biogeogr 33:2145–2153

  • Zaman A, Simberloff D (2002) Random binary matrices in biogeographical ecology—instituting a good neighbor policy. Environ Ecol Stat 9:405–421

Acknowledgments

This work was supported in part by a grant from the Polish Ministry of Science to W. U. (KBN, 2 P04F 039 29). N. J. G. was supported by NSF grant 0541936. We thank Aaron Ellison for calling our attention to Efron (2005). The manuscript was improved by comments from two anonymous reviewers and associate editor W. M. Mooij.

Author information

Corresponding author

Correspondence to Nicholas J. Gotelli.

Additional information

Communicated by Wolf Mooij.

Electronic supplementary material

Supplementary material (XLS, 558 kb).

About this article

Cite this article

Gotelli, N.J., Ulrich, W. The empirical Bayes approach as a tool to identify non-random species associations. Oecologia 162, 463–477 (2010). https://doi.org/10.1007/s00442-009-1474-y

  • DOI: https://doi.org/10.1007/s00442-009-1474-y
