Hierarchical Modeling and Differential Expression Analysis for RNA-seq Experiments with Inbred and Hybrid Genotypes

Lithio, Andrew; Nettleton, Dan

doi:10.1007/s13253-015-0232-3

Published: 05 October 2015

Journal of Agricultural, Biological, and Environmental Statistics, Volume 20, pages 598–613 (2015)

Abstract

The performance of inbred and hybrid genotypes is of interest in plant breeding and genetics. High-throughput sequencing of RNA (RNA-seq) has proven to be a useful tool in the study of the molecular genetic responses of inbreds and hybrids to environmental stresses. Commonly used experimental designs and sequencing methods lead to complex data structures that require careful attention in data analysis. We demonstrate an analysis of RNA-seq data from a split-plot design involving drought stress applied to two inbred genotypes and two hybrids formed by crosses between the inbreds. Our generalized linear modeling strategy incorporates random effects for whole-plot experimental units and uses negative binomial distributions to allow for overdispersion in count responses for split-plot experimental units. Variations in gene length and base content, as well as differences in sequencing intensity across experimental units, are also accounted for. Hierarchical modeling with thoughtful parameterization and prior specification allows for borrowing of information across genes to improve estimation of dispersion parameters, genotype effects, treatment effects, and interaction effects of primary interest.
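
The count model described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation: it draws counts for a single gene from a negative binomial distribution whose log mean combines genotype and treatment effects, a random effect shared by all split-plot units in the same whole plot, and an offset standing in for sequencing intensity, gene length, and base content. The genotype labels, all effect sizes, the choice of treatment as the whole-plot factor, and the omission of an interaction term are assumptions made only for illustration.

```python
import math
import random

# Hypothetical labels: two inbreds and their two hybrids, two watering regimes.
GENOTYPES = ["inbred_1", "inbred_2", "hybrid_1x2", "hybrid_2x1"]
TREATMENTS = ["control", "drought"]


def poisson_draw(rng, lam):
    """Knuth's multiplication method; adequate for the moderate means used here."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def negbin_draw(rng, mean, dispersion):
    """Gamma-Poisson mixture with variance = mean + dispersion * mean**2."""
    # gammavariate(shape, scale): shape * scale equals the requested mean.
    lam = rng.gammavariate(1.0 / dispersion, mean * dispersion)
    return poisson_draw(rng, lam)


def simulate_gene(rng, n_reps=4, dispersion=0.2, whole_plot_sd=0.3):
    """Simulate one gene; returns (rep, treatment, genotype, count) tuples."""
    intercept = 3.0                         # assumed baseline log mean
    genotype_effect = [0.0, 0.4, 0.2, 0.3]  # assumed log-scale genotype effects
    treatment_effect = [0.0, -0.5]          # assumed log-scale drought effect
    rows = []
    for rep in range(n_reps):
        for t, treatment in enumerate(TREATMENTS):
            # One random effect per whole plot, shared by its split-plot units.
            whole_plot = rng.gauss(0.0, whole_plot_sd)
            for g, genotype in enumerate(GENOTYPES):
                # Offset standing in for sequencing depth, gene length, base content.
                log_offset = rng.gauss(0.0, 0.1)
                log_mean = (intercept + genotype_effect[g] + treatment_effect[t]
                            + whole_plot + log_offset)
                count = negbin_draw(rng, math.exp(log_mean), dispersion)
                rows.append((rep, treatment, genotype, count))
    return rows


if __name__ == "__main__":
    for row in simulate_gene(random.Random(2015))[:8]:
        print(row)
```

Because the whole-plot effect is drawn once and reused for every genotype in that plot, counts from the same whole plot are correlated, which is the feature a model without whole-plot random effects would miss. The dispersion argument controls how far the variance exceeds the Poisson variance.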

Acknowledgments

Research reported in this chapter was supported by the National Institute of General Medical Sciences (NIGMS) of the National Institutes of Health and the joint National Science Foundation/NIGMS Mathematical Biology Program under award number R01GM109458. The content is solely the responsibility of the author and does not necessarily represent the official views of the National Institutes of Health or the National Science Foundation.

Author information

Authors and Affiliations

Department of Statistics, Iowa State University, Ames, IA, 50011, USA
Andrew Lithio & Dan Nettleton

Corresponding author

Correspondence to Andrew Lithio.

About this article

Cite this article

Lithio, A., Nettleton, D. Hierarchical Modeling and Differential Expression Analysis for RNA-seq Experiments with Inbred and Hybrid Genotypes. JABES 20, 598–613 (2015). https://doi.org/10.1007/s13253-015-0232-3

Received: 18 July 2015
Accepted: 25 September 2015
Published: 05 October 2015
Issue Date: December 2015
DOI: https://doi.org/10.1007/s13253-015-0232-3
