Statistical Models and Analysis of Microbiome Data from Mice and Humans

Part of the book series: Physiology in Health and Disease (PIHD)

Abstract

After the Human Microbiome Project began in 2007, numerous statistical and bioinformatic tools and computational methods were developed and applied to meet the needs of microbiome studies. A popular approach is to implement these newly developed methods and models as R packages.

In this chapter, we introduce widely used and newly developed statistical methods and models from the ecology and microbiome fields. We show readers how to use currently available statistical tools based on the R programming language to analyze microbiome data. Our purpose is to provide analytical steps and tools that microbiome researchers can apply without advanced knowledge of statistical models or the R programming language. Specifically, this chapter covers frequently used univariate and multivariate statistical models and visualization tools, as well as alpha and beta diversity metrics and R programming skills, using real data from mouse and human microbiome studies.

Acknowledgments

We acknowledge NIDDK/National Institutes of Health grant R01 DK105118 and DOD grant BC160450P1 to Jun Sun. We thank the two anonymous reviewers whose comments and suggestions helped to improve and clarify this manuscript.

Author information

Corresponding authors

Correspondence to Yinglin Xia or Jun Sun.

Copyright information

© 2018 The American Physiological Society

About this chapter

Cite this chapter

Xia, Y., Sun, J. (2018). Statistical Models and Analysis of Microbiome Data from Mice and Humans. In: Sun, J., Dudeja, P. (eds) Mechanisms Underlying Host-Microbiome Interactions in Pathophysiology of Human Diseases. Physiology in Health and Disease. Springer, Boston, MA. https://doi.org/10.1007/978-1-4939-7534-1_12
